EORez/Tree-Classification-ML-Model

The summary and the relevant blog post for this project can be found here

This notebook aims to classify 7 different types of trees and give some clues about where to find them. I built an extra random forest classifier to detect fantastic trees in the Roosevelt National Forest of northern Colorado. I was able to classify the test set, consisting of 500,000 rows, with 78% accuracy, placing this kernel in the top 28% of all competitors.
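
To give a feel for what such a classifier looks like in code, here is a minimal sketch. It assumes that "extra random forest" corresponds to scikit-learn's `ExtraTreesClassifier`, and it uses synthetic data from `make_classification` as a stand-in for the real forest data, so the feature count, the hyperparameters and the resulting score are illustrative only and are not the notebook's actual setup or results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the forest data (assumption): 7 classes, 10 numeric features.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=6,
    n_redundant=0,
    n_classes=7,
    n_clusters_per_class=1,
    random_state=42,
)

# Hold out a stratified test set so every class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Illustrative hyperparameters, not the ones used in the notebook.
model = ExtraTreesClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")

# Feature importances hint at which inputs drive the predictions.
ranked = sorted(
    enumerate(model.feature_importances_), key=lambda pair: pair[1], reverse=True
)
for index, importance in ranked[:3]:
    print(f"feature_{index}: {importance:.3f}")
```

On the real data, the feature importances are the kind of output that can give "clues about where to find" each tree type, since they show which inputs the model relies on most.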

The notebook will follow the workflow suggested by Will Koehrsen in this article.

  1. Understand, Clean and Format Data

  2. Exploratory Data Analysis

  3. Feature Engineering & Selection

  4. Compare Several Machine Learning Models

  5. Perform Hyperparameter Tuning on the Best Model

  6. Evaluate the Best Model with Test Data

  7. Interpret Model Results

  8. Summary & Conclusions
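
Steps 4 to 6 of the workflow above can be sketched roughly as follows. This is a hedged illustration using scikit-learn on synthetic data: the candidate models, the parameter grid and the choice to tune `ExtraTreesClassifier` are assumptions made for the example, not a record of what the notebook actually ran.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    RandomizedSearchCV,
    cross_val_score,
    train_test_split,
)

# Synthetic 7-class data standing in for the cleaned, engineered features.
X, y = make_classification(
    n_samples=1200,
    n_features=10,
    n_informative=6,
    n_redundant=0,
    n_classes=7,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 4: compare several models with cross-validated accuracy.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
}
cv_scores = {
    name: cross_val_score(est, X_train, y_train, cv=3, scoring="accuracy").mean()
    for name, est in candidates.items()
}
for name, score in cv_scores.items():
    print(f"{name}: {score:.3f}")

# Step 5: tune one model (extra trees here, as an assumption) with a random search.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5, 10],
}
search = RandomizedSearchCV(
    ExtraTreesClassifier(random_state=0),
    param_distributions,
    n_iter=5,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)

# Step 6: evaluate the tuned model once on the held-out test data.
test_accuracy = search.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Keeping the test split out of both the comparison and the search means the final score is measured on data the tuned model has never seen.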

The original Kaggle kernel is here.

It is one big notebook. For the summary and results, you can jump directly to 8. Summary & Conclusions, but I cannot guarantee that you will not miss some beautiful visualizations and interesting insights about data science and machine learning along the way. Enjoy reading!