Use tree-based models, including XGBoost, LightGBM, and CatBoost, to predict the resale price. We start with feature engineering, followed by model implementation and tuning, and end with evaluation.
This personalized recommendation system helps new customers explore their interests and find the cars they are most likely to be interested in. The system addresses the cold-start and overspecialization problems. It is robust enough to adjust the recommended items based on the accumulated browsing history, and it avoids recommending items that the customer has already clicked.
- train.csv - the training set
- test.csv - the test set
Please refer to https://www.kaggle.com/competitions/cs5228-2021-semester-1-final-project/data
Missing Values; Duplicates; Outliers
- Data Transformation: Continuous Features; Categorical Features; Free-Text Features; Date-Related Features
- Feature selection based on Correlation Matrix
- Baseline Model: XGBoost
- Other Models: LightGBM and CatBoost.
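The correlation-based feature selection step listed above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the function name `select_by_correlation`, both thresholds, and the column names are assumptions chosen for illustration, and the data is synthetic.

```python
import numpy as np

def select_by_correlation(X, y, names, min_target_corr=0.3, max_pair_corr=0.9):
    """Return the names of features that survive two correlation filters."""
    # Absolute Pearson correlation of every feature column with the target.
    target_corr = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    # Absolute feature-to-feature correlation matrix.
    pair_corr = np.abs(np.corrcoef(X, rowvar=False))

    kept = []
    # Visit features from most to least correlated with the target.
    for j in np.argsort(-target_corr):
        if target_corr[j] < min_target_corr:
            continue  # too weakly related to the target
        if any(pair_corr[j, k] > max_pair_corr for k in kept):
            continue  # nearly duplicates a feature that is already kept
        kept.append(j)
    return [names[j] for j in sorted(kept)]

# Synthetic data with hypothetical column names (not the real dataset).
rng = np.random.default_rng(0)
n = 200
mileage = rng.normal(size=n)
mileage_dup = mileage + rng.normal(scale=0.01, size=n)  # redundant copy
noise = rng.normal(size=n)                              # unrelated to price
price = -2.0 * mileage + rng.normal(scale=0.1, size=n)

X = np.column_stack([mileage, mileage_dup, noise])
selected = select_by_correlation(X, price, ["mileage", "mileage_dup", "noise"])
print(selected)
```

Only one of the two near-identical mileage columns should survive, and the unrelated column should be dropped. Tree-based models tolerate correlated inputs fairly well, so the thresholds here are a judgment call rather than a fixed rule.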
Instructions
- Data Preprocessing: Run "Feature engineering.ipynb"
- Model Selection: Run "Pycaret Models.ipynb" --> Use the PyCaret package to compare different models with their default parameters on the training set. The results show that CatBoost performed best, followed by LightGBM and then XGBoost.
- XGBoost Model Tuning: Run "train_xgb.ipynb" --> Use scikit-learn's GridSearchCV to find the best parameters for the XGBoost model.
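For readers unfamiliar with the tuning step, a grid search over XGBoost parameters might be set up as below. This is a sketch under stated assumptions: the parameter grid, the RMSE scoring choice, the 3-fold split, and the synthetic data are all illustrative, and the contents of "train_xgb.ipynb" may differ. The import fallback exists only so the sketch runs where xgboost is not installed.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV

try:
    from xgboost import XGBRegressor as Regressor
except ImportError:
    # Fallback so the sketch still runs without xgboost installed.
    from sklearn.ensemble import GradientBoostingRegressor as Regressor

# Synthetic stand-in for the engineered features and resale prices.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=120)

# Hypothetical search space; the grid used in the notebook may differ.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    estimator=Regressor(random_state=0),
    param_grid=param_grid,
    scoring="neg_root_mean_squared_error",  # assumed metric: RMSE
    cv=3,
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_  # flip the sign back to a positive RMSE
print(best_params, round(best_rmse, 3))
```

Grid search cost grows multiplicatively with each added parameter, so a coarse grid followed by a finer one around the best values is a common way to keep run time manageable.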
- Design for New Customers: The recommendation system can serve new customers by using a popularity-based engine.
- Create Customer Profile: The recommendation system remembers each customer's browsing history and makes recommendations based on item-item similarities.
- Prevent Overspecialization: The recommendation system prevents overspecialization by adding some popular items to the recommendations.
- Prevent Repeating: The system avoids recommending items that the customer has already clicked.
- Robustness: The recommendation system can adjust the recommended items based on the accumulated browsing history.
Recommendation System Workflow:
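As a rough illustration of how the features listed above might fit into one workflow, here is a minimal sketch in plain Python. It is not the project's implementation: the `recommend` function, the cosine similarity on toy feature vectors, the `n_popular` parameter, and all item IDs and counts are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(history, item_features, popularity, k=3, n_popular=1):
    """Return k item IDs for a customer with the given click history."""
    seen = set(history)
    # Prevent repeating: items that were already clicked are never candidates.
    candidates = [i for i in item_features if i not in seen]
    by_popularity = sorted(
        candidates, key=lambda i: popularity.get(i, 0), reverse=True
    )

    # Cold start: with no history, fall back to the most popular items.
    if not history:
        return by_popularity[:k]

    # Customer profile: mean item-item similarity to everything clicked so far.
    def score(item):
        sims = [cosine(item_features[item], item_features[h]) for h in history]
        return sum(sims) / len(sims)

    by_similarity = sorted(candidates, key=score, reverse=True)
    picks = by_similarity[: max(k - n_popular, 0)]

    # Prevent overspecialization: fill the remaining slots with popular items.
    for item in by_popularity:
        if len(picks) >= k:
            break
        if item not in picks:
            picks.append(item)
    return picks

# Toy catalogue: hypothetical IDs, 2-d feature vectors, and click counts.
item_features = {
    "car_a": [1.0, 0.0],
    "car_b": [0.9, 0.1],
    "car_c": [0.0, 1.0],
    "car_d": [0.1, 0.9],
    "car_e": [0.5, 0.5],
}
popularity = {"car_a": 50, "car_b": 10, "car_c": 40, "car_d": 5, "car_e": 20}

cold = recommend([], item_features, popularity)
warm = recommend(["car_a"], item_features, popularity)
print(cold)
print(warm)
```

Because the scores are recomputed from the full history on every call, the recommendations shift as more clicks accumulate, which corresponds to the robustness property described above.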