The summary and the relevant blog post for this project can be found here.

This notebook aims to classify 7 different types of trees and give some clues about where to find them. I built an extra random forest classifier to detect fantastic trees in the Roosevelt National Forest of northern Colorado. I was able to classify the test set, consisting of 500,000 rows, with 78% accuracy, placing this kernel in the top 28% of all competitors.

The notebook follows the workflow suggested by Will Koehrsen in this article:

1. Understand, Clean and Format Data
2. Exploratory Data Analysis
3. Feature Engineering & Selection
4. Compare Several Machine Learning Models
5. Perform Hyperparameter Tuning on the Best Model
6. Evaluate the Best Model with Test Data
7. Interpret Model Results
8. Summary & Conclusions

The original Kaggle kernel is here.

This is one big notebook. For the summary and results, you can jump straight to 8. Summary & Conclusions, but I cannot guarantee that you won't miss some beautiful visualizations and interesting insights about data science and machine learning along the way. Enjoy reading!
