Summary #2

alexander-dubinski · 2019-01-23T09:31:01Z

Rubric Score

Criteria 1: Valid Python Code

Score Level: 4 (Exceeds Expectations)
Comment(s): All code is valid without errors.

Criteria 2: Exploration of Data

Score Level: 4 (Exceeds Expectations)
Comment(s): Data is explored well and the visualization is useful. I will just add that some things, like the correlation table, should have been included in the presentation and commented on. This would help me (as the reader) understand what you were thinking as you went through the data, while also giving me an idea of what the data looks like. In professional data science, only about 30-40% of the job is coding and analysis; the other 60-70% is explaining and visualizing results for non-technical people.

Criteria 3: Machine Learning Techniques used correctly

Score Level: 3 (Meets Expectations)
Comment(s): ML techniques are used correctly, and a good variety of models is used. I have two criticisms, though. First, the results of your models were not very useful, and in most cases just guessing would have been a better option for classification. A result this poor warrants some discussion, in the results section of the presentation, of what went wrong. Second, you used regression for classification but also reported the r-squared. For classification, r-squared is meaningless and will usually be low because the targets are discrete class labels. The only metric that matters for classification (whether done with regression or another method) is accuracy, that is, the percentage of correct classifications on test data.
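To illustrate the point about metrics, here is a minimal sketch in plain Python. The labels and scores are made up for illustration (they are not from your project), and the helper names `accuracy`, `majority_baseline_accuracy`, and `r_squared` are hypothetical. It thresholds a regression-style output into classes, computes accuracy, compares it with the "just guessing the most common class" baseline, and shows that r-squared on the same output can look poor even when the classifications are mostly right.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most common label."""
    most_common = max(set(y_true), key=y_true.count)
    return y_true.count(most_common) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up test labels (assumption: binary classes 0 and 1).
y_test = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

# Made-up continuous output, as a regression model would produce.
raw_output = [0.9, 0.7, 0.6, 0.55, 0.8, 0.4, 0.3, 0.45, 0.6, 0.2]

# Turn the continuous output into class predictions with a 0.5 cutoff.
y_pred = [1 if value >= 0.5 else 0 for value in raw_output]

model_accuracy = accuracy(y_test, y_pred)
baseline_accuracy = majority_baseline_accuracy(y_test)
raw_r_squared = r_squared(y_test, raw_output)

print(f"accuracy:          {model_accuracy:.2f}")
print(f"majority baseline: {baseline_accuracy:.2f}")
print(f"r-squared:         {raw_r_squared:.2f}")
```

In scikit-learn, `.score` on a regressor returns r-squared, while `.score` on a classifier returns mean accuracy; `sklearn.metrics.accuracy_score` computes the same accuracy from true and predicted labels. A model is only useful for classification if its accuracy clearly beats the majority-class baseline.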

Criteria 4: Report - Are conclusions clear and supported by data?

Score Level: 3
Comment(s): Conclusions are present and contain a good amount of discussion. My main issue is that there are no results or figures to back up your conclusions, and your results warrant a much deeper discussion of why they were so useless.

Criteria 5: Code formatting

Score Level: 4
Comment(s): Code is great! Good job using a notebook and not a Python script.

Overall Score: 18/20

Overall, this project is very well done. The biggest issue is that the results were so bad, but this can be investigated more deeply in the future. Great job and happy coding!

andrewhercules · 2019-01-23T23:34:37Z

@addubinski, for Multiple Linear Regression and K Nearest Neighbors Regression, I computed the r-squared value using the .score function, as shown in the lessons. From your comment, I gather that I should have reported a different metric for K Nearest Neighbors Regression. Could you please tell me which one to use?
