The summary and the relevant blog post for this project can be found here

This notebook aims to classify 7 different types of trees and give some clues about where to find them. I built an extra random forest classifier to detect fantastic trees in the Roosevelt National Forest of northern Colorado. I was able to classify the test set, consisting of 500,000 rows, with 78% accuracy, placing this kernel in the top 28% of all competitors.

The notebook will follow the workflow suggested by Will Koehrsen in this article.

  1. Understand, Clean and Format Data

  2. Exploratory Data Analysis

  3. Feature Engineering & Selection

  4. Compare Several Machine Learning Models

  5. Perform Hyperparameter Tuning on the Best Model

  6. Evaluate the Best Model with Test Data

  7. Interpret Model Results

  8. Summary & Conclusions
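For readers who want a feel for steps 4 to 6 before opening the notebook, here is a minimal sketch using scikit-learn. It is not the notebook's code: the data is a synthetic 7-class stand-in generated with `make_classification`, and the candidate models, the parameter grid and all variable names are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic stand-in data (assumption): 7 classes, mirroring the 7 tree types.
X, y = make_classification(
    n_samples=700, n_features=12, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 4: compare several models by mean cross-validated accuracy.
# The two candidates are illustrative choices, not the notebook's list.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
}
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=3).mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Step 5: tune the best model with a small, hypothetical parameter grid.
search = GridSearchCV(
    candidates[best_name],
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=3,
)
search.fit(X_train, y_train)

# Step 6: evaluate the tuned model once on the held-out test data.
test_accuracy = search.score(X_test, y_test)
print(best_name, search.best_params_, round(test_accuracy, 3))
```

The test split is touched only once, at the very end, so the reported accuracy is not influenced by model selection or tuning.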

The original Kaggle kernel is here.

It is one big notebook. For the summary and results, you can move directly to 8. Summary & Conclusions, but I cannot guarantee that you will not miss some beautiful visualizations and interesting insights about data science and machine learning. Enjoy reading!