Members of the group:
- Feras Rafeh
- Bruna Santos
- Iranel González
This is a modeling project that uses logistic regression to predict diabetes in patients.
Dataset Info:
- Source: Diabetes Dataset
- Number of rows: 100,000
- Number of features: 9
Project Planning:
Day 1
- EDA
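
A first EDA pass could look roughly like the sketch below. It runs on a tiny made-up table rather than the real data, and the column names (`age`, `bmi`, `gender`, `diabetes`) are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the real dataset; the column names are hypothetical.
df = pd.DataFrame({
    "age": [45, 61, 33, 52, None],
    "bmi": [27.1, 31.4, 22.8, 29.9, 35.2],
    "gender": ["Female", "Male", "Female", "Male", "Female"],
    "diabetes": [0, 1, 0, 0, 1],
})

# Size of the table and the type of each column.
n_rows, n_cols = df.shape
column_types = df.dtypes

# Number of missing values per column.
missing_per_column = df.isna().sum()

# Share of each class in the (assumed) target column.
class_balance = df["diabetes"].value_counts(normalize=True)

# Summary statistics for the numerical columns.
numeric_summary = df.describe()

print(n_rows, n_cols)
print(column_types)
print(missing_per_column)
print(class_balance)
print(numeric_summary)
```

The class balance is worth checking early, since a strongly imbalanced target affects how the model should be evaluated later.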
Day 2
- Data Cleaning:
  - Remove typos
  - Correct data types
  - Replace or drop NaNs
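
The cleaning steps above can be sketched as follows. This is a minimal example on invented data: the column names, the `Mle` typo, and the choice of median imputation are all assumptions, not decisions taken from the project.

```python
import pandas as pd

# Toy data with the kinds of problems the cleaning step targets.
# Column names and values are hypothetical.
raw = pd.DataFrame({
    "gender": ["Female", "female ", "Male", "Mle", "Male"],
    "age": ["45", "61", "thirty", "52", "38"],
    "bmi": [27.1, None, 22.8, 29.9, 35.2],
    "diabetes": [0, 1, 0, None, 1],
})


def clean(df):
    # Drop NaNs: a row without a target label cannot be used for training.
    df = df.dropna(subset=["diabetes"]).copy()
    df["diabetes"] = df["diabetes"].astype(int)

    # Remove typos: normalise whitespace and case, then map known misspellings.
    df["gender"] = (
        df["gender"].str.strip().str.capitalize().replace({"Mle": "Male"})
    )

    # Correct data types: entries that are not numbers become NaN.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")

    # Replace NaNs: fill numerical gaps with the column median (an assumption).
    for col in ["age", "bmi"]:
        df[col] = df[col].fillna(df[col].median())

    return df.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Whether to replace or drop a missing value depends on the column: here the target is dropped and the features are imputed, but other splits are equally defensible.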
Day 3
- Data Transformation:
  - Split data into numerical and categorical features
  - Scale numerical features
  - Encode categorical features
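
One possible way to implement the transformation steps above is scikit-learn's `ColumnTransformer`. The sketch below assumes standard scaling and one-hot encoding, which the plan does not specify, and uses hypothetical column names on toy data.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy feature table; the column names are hypothetical.
X = pd.DataFrame({
    "age": [45.0, 61.0, 33.0, 52.0],
    "bmi": [27.1, 31.4, 22.8, 29.9],
    "gender": ["Female", "Male", "Female", "Male"],
})

# Split the columns into numerical and categorical by dtype.
numerical_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

preprocessor = ColumnTransformer(
    [
        # Scale numerical features to zero mean and unit variance.
        ("num", StandardScaler(), numerical_cols),
        # Encode each category as its own 0/1 column.
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

X_transformed = preprocessor.fit_transform(X)
print(numerical_cols, categorical_cols)
print(X_transformed)
```

In the real project the preprocessor would normally be fitted on the training split only and then applied to the test split, so that no test-set statistics leak into the scaling.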
Day 4
- Data Modelling:
  - Create and train the model
  - Test the model
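
The modelling steps above might look like the sketch below. It trains on synthetic data generated in the script, not on the diabetes dataset, and the 75/25 split, the `max_iter` value, and accuracy as the metric are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed features and the binary target.
rng = np.random.default_rng(0)
n_samples = 400
X = rng.normal(size=(n_samples, 2))
noise = rng.normal(scale=0.5, size=n_samples)
y = (2.0 * X[:, 0] - 1.0 * X[:, 1] + noise > 0).astype(int)

# Hold out 25% of the rows for testing, keeping the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Create and train the model.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Test the model on rows it has not seen during training.
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {test_accuracy:.3f}")
```

If the real target turns out to be imbalanced, metrics such as precision, recall, or ROC AUC are likely to be more informative than accuracy alone.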