Repo for working and collaborating with other team members on a task given at MEPhI in the "AI in Business" course.
This is a classification-based E-commerce text dataset with 4 categories:
- "Electronics"
- "Household"
- "Books"
- "Clothing & Accessories"
Together, these categories cover almost 80% of any E-commerce website.
The dataset is in ".csv" format, stored in the data directory, with two columns: the first column is the class name and the second is the data point of that class. Each data point is the product and its description from the e-commerce website.
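As a rough illustration of the two-column layout described above, here is a minimal sketch that parses such rows with the standard library and counts items per class. The sample rows are made up, and the assumption that the file has no header row is ours, not something the dataset description states.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for the real file in the data directory.
SAMPLE_CSV = '''Electronics,"USB-C charging cable, 1 m"
Household,"Stainless steel water bottle, 750 ml"
Books,"Paperback novel, 320 pages"
Clothing & Accessories,"Cotton t-shirt, size M"
Books,"Hardcover cookbook with 100 recipes"
'''


def read_rows(handle):
    """Yield (label, text) pairs from a two-column CSV (assumed to have no header)."""
    for row in csv.reader(handle):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        label, text = row
        yield label.strip(), text.strip()


rows = list(read_rows(io.StringIO(SAMPLE_CSV)))
class_counts = Counter(label for label, _ in rows)
print(class_counts)
```

To read the real file instead, `read_rows` can be given a handle from `open(path, newline="", encoding="utf-8")`, where `path` points at the CSV inside the data directory.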
The dataset has the following features:
- Data Set Characteristics: Multivariate
- Number of Instances: 50 425
- Number of classes: 4
The following steps have been carried out:
- EDA
- Text processing
- TF-IDF + LogReg with f1_weighted = 0.925
- fastText with f1_weighted = 1.0
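For reference, here is a minimal sketch of how a weighted F1 score is computed, assuming `f1_weighted` above means the support-weighted average of per-class F1 (as in scikit-learn's `f1_weighted` scoring). The labels below are made up for illustration and are not taken from the repo's experiments.

```python
from collections import Counter


def f1_weighted(y_true, y_pred):
    """Average per-class F1, weighting each class by its number of true samples."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


# Hypothetical labels: one "Books" item is misclassified as "Electronics".
y_true = ["Books", "Books", "Electronics", "Household"]
y_pred = ["Books", "Electronics", "Electronics", "Household"]
print(round(f1_weighted(y_true, y_pred), 3))
```

Because each class is weighted by its support, frequent classes dominate the score, which matters if the 4 categories are not equally represented.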