Alzheimer Disease Classification with XGBoost and KNN

This project classifies Alzheimer disease status using the XGBoost and KNN machine learning algorithms. The analysis and modeling pipeline is implemented as an interactive marimo notebook, which supports visual, exploratory steps for understanding the data and the model performance.

Project Members

Minuettaro
...
...

Overview

This project covers the following stages:

Data Exploration: Includes loading data, checking for null values, and inspecting columns.
Correlation Analysis: Creates heatmaps and scatter plots for understanding feature relationships.
Model Construction: Splits the dataset into training and test sets, normalizes the features, and creates the models.
Model Evaluation: Evaluates different classifiers, including XGBoost, KNN, SVC, Random Forest, Gradient Boosting, and Extra Tree classifiers.
Conclusion: Summarizes the model performances and compares accuracy across models.
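
The first two stages above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the helper names `explore` and `correlation_matrix` are hypothetical, and the demo frame is synthetic with made-up column names rather than the real columns of alzheimers_disease_data.csv.

```python
import numpy as np
import pandas as pd


def explore(df: pd.DataFrame) -> dict:
    """Collect the basic checks from the Data Exploration stage."""
    return {
        "shape": df.shape,
        "null_counts": df.isnull().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }


def correlation_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between numeric columns (the input to a heatmap)."""
    return df.select_dtypes(include="number").corr()


# Synthetic stand-in data; the column names are hypothetical, not the dataset's.
rng = np.random.default_rng(0)
age = rng.integers(60, 90, size=200).astype(float)
demo = pd.DataFrame(
    {
        "age": age,
        "score": 30 - 0.2 * age + rng.normal(0, 1, size=200),
        "label": rng.integers(0, 2, size=200),
    }
)
demo.loc[0, "score"] = np.nan  # inject one missing value to show the null check

summary = explore(demo)
corr = correlation_matrix(demo)
print(summary["null_counts"])
print(corr.round(2))
```

With the real data, the frame would presumably come from `pd.read_csv("alzheimers_disease_data.csv")`, and the correlation matrix could be passed to `seaborn.heatmap` to draw the heatmap.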

Requirements

To run this project, ensure you have the following Python packages installed:

marimo==0.9.14
numpy
pandas
matplotlib
seaborn
altair
scikit-learn
xgboost

Install dependencies using:

pip install marimo==0.9.14 numpy pandas matplotlib seaborn altair scikit-learn xgboost

Usage

Clone the Repository

git clone <repository-link>
cd <repository-directory>

Run the Application

marimo edit main.py

Navigate through the analysis:
- Data Exploration: Review data structure and statistics.
- Correlation Analysis: View relationships between continuous variables with scatter plots and heatmaps.
- Model Construction: Choose models and tune parameters such as KNN neighbors interactively.
- Model Evaluation: Compare the accuracy of different models using bar charts.
- Conclusion: Read the summarized results and model performance insights.
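
The model construction and evaluation steps above could look roughly like the sketch below. It is an assumption-laden illustration rather than the notebook's actual code: `compare_models` is a hypothetical helper, the 80/20 split, the seed, and the use of `StandardScaler` are guesses, and the data is synthetic. XGBoost is left out so the sketch depends only on scikit-learn; `xgboost.XGBClassifier` follows the same fit/predict interface and could be added to the `models` dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, n_neighbors=5, seed=0):
    """Split, scale, fit each classifier, and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    # Fit the scaler on the training split only to avoid leaking test statistics.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    models = {
        "KNN": KNeighborsClassifier(n_neighbors=n_neighbors),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }
    return {
        name: accuracy_score(y_test, model.fit(X_train, y_train).predict(X_test))
        for name, model in models.items()
    }


# Synthetic stand-in for the real feature matrix and diagnosis labels.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
for k in (3, 5, 11):
    scores = compare_models(X, y, n_neighbors=k)
    print(k, {name: round(acc, 3) for name, acc in scores.items()})
```

In a marimo notebook, the `n_neighbors` argument could be driven by a UI element such as `mo.ui.slider`, and the returned dictionary is already in a shape that is easy to plot as a bar chart.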

Conclusion

Models built on decision trees (e.g., Random Forest, XGBoost, Gradient Boosting) tend to perform better, achieving over 80% accuracy, while KNN and SVC, which depend on distances or kernel similarities between samples, show lower accuracy.

License

This project is licensed under the MIT License.

Acknowledgements

This project is built with marimo, an open-source reactive notebook for Python.

References

@misc{rabie_el_kharoua_2024,
  title        = {Alzheimer's Disease Dataset},
  url          = {https://www.kaggle.com/dsv/8668279},
  DOI          = {10.34740/KAGGLE/DSV/8668279},
  publisher    = {Kaggle},
  author       = {Rabie El Kharoua},
  year         = {2024}
}

@misc{agrawal_scolnick_2023,
  title        = {marimo - an open-source reactive notebook for Python},
  url          = {https://marimo.io/},
  DOI          = {10.5281/zenodo.12735329},
  publisher    = {Zenodo},
  author       = {Agrawal, Akshay and Scolnick, Myles},
  year         = {2023}
}
