This repository contains the code for the paper: Pablo Zivic, Hernán C. Vazquez and Jorge Sanchez (2024), "Scaling Sequential Recommendation Models with Transformers", published at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
- Installing
- Scaling Datasets
- Pretrain a model
- Use MLFLOW server to check metrics
- Fine-tune a model
- Sample logs
- Recbole fork
- Contact
- Citation
To install the required packages, use requirements_cpu.txt if you are installing on a CPU-only machine, or requirements_gpu.txt if you have a GPU. A machine with many CPU cores will speed up the ETL, since it is parallelized across cores. A GPU is required for pre-training and fine-tuning.
- CPU:
pip install -r requirements_cpu.txt
- GPU:
pip install -r requirements_gpu.txt
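If you installed the GPU requirements, a quick sanity check is shown below (this assumes the GPU requirements pull in a CUDA-enabled PyTorch build):

```python
import torch

# Verify that PyTorch can see the GPU before launching pre-training or
# fine-tuning; if this prints False, the CPU-only build was likely installed.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```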
To compute the scaled datasets from the Amazon Product Data run
python scripts/scaling_datasets.py --step all
Grab a coffee, because this will take about 9 hours on an 8-core machine.
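The runtime scales with core count because the ETL fans its work out across processes. As a rough illustration only (not the repository's actual code; `process_chunk` and the toy data are placeholders), the pattern looks like this:

```python
from multiprocessing import Pool, cpu_count

def process_chunk(chunk):
    # Placeholder transformation standing in for the real per-chunk ETL work.
    return [record for record in chunk if record is not None]

if __name__ == "__main__":
    chunks = [[1, None, 2], [3, 4, None]]  # toy stand-in for raw data shards
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(process_chunk, chunks)  # one chunk per worker
    print(results)
```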
The pre-training script reads a configuration file and trains a model on the specified dataset.
There are three configuration files in the confs/pre_train folder, for pre-training on 100K, 1M, and 10M samples.
python scripts/pre_train.py -c confs/pre_train/amazon-100K -o pretrain_checkpoint.pth
The trained model will be written to the file pretrain_checkpoint.pth.
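If you want to peek inside the checkpoint before fine-tuning, something like the following should work (assuming a standard PyTorch-style checkpoint, typically a dict holding the weights and training state; the exact keys depend on the training script):

```python
import torch

# Load the checkpoint on CPU and list its top-level contents.
ckpt = torch.load("pretrain_checkpoint.pth", map_location="cpu")
if isinstance(ckpt, dict):
    print("checkpoint keys:", list(ckpt.keys()))
else:
    print("checkpoint object type:", type(ckpt))
```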
You can use MLflow to check the metrics while pre-training, while fine-tuning, or after training a model. To start the MLflow server, run
mlflow server --backend-store-uri sqlite:///mlruns.sqlite
Make sure the mlruns.sqlite file is in the folder where you run the command.
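Besides the web UI, you can query the same backend programmatically. A minimal sketch, assuming a recent MLflow version and that you run it from the folder containing mlruns.sqlite (experiment and metric names are whatever the training scripts logged):

```python
import mlflow
from mlflow.tracking import MlflowClient

# Point MLflow at the same SQLite backend the server uses.
mlflow.set_tracking_uri("sqlite:///mlruns.sqlite")

client = MlflowClient()
for exp in client.search_experiments():
    # search_runs returns a pandas DataFrame; metrics appear as "metrics.<name>".
    runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])
    print(f"{exp.name}: {len(runs)} runs")
    if not runs.empty:
        metric_cols = [c for c in runs.columns if c.startswith("metrics.")]
        print(runs[["run_id"] + metric_cols].head())
```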
The fine-tuning script is a little more complex. To train a model from scratch run
python scripts/fine_tune.py -c confs/fine_tune/from_scratch.json -s amazon-beauty
The dataset can be either amazon-beauty or amazon-sports, both of which are created by the ETLs, or any Amazon dataset available in RecBole.
To fine-tune a pre-trained model, run
python scripts/fine_tune.py -c <config_file> -s amazon-beauty -l $PRETRAIN_DATASET -i $CHECKPOINT_FNAME -p $PRETRAIN_CONFIG -o fine_tuned_checkpoint.pth
For example:
python scripts/fine_tune.py -c confs/fine_tune/progressive.json -s amazon-beauty -l amazon-100K -i test.pth -p confs/pre_train/config_dict_100K.json -o fine_tuned_checkpoint.pth
This fine-tunes a model pre-trained on $PRETRAIN_DATASET (e.g. amazon-1M), using the checkpoint $CHECKPOINT_FNAME and the pre-training configuration file $PRETRAIN_CONFIG (e.g. confs/config_dict_1M.json).
There are three log files in the logs folder, so you can see the expected output for computing the datasets, pre-training, and fine-tuning.
RecBole had some limitations, so we decided to fork it, starting from commit 321bff8fc. In summary, the changes are:
- Better implementation of negative sampling (to use only one forward pass instead of one for each negative sample)
- Evaluate NDCG on train, valid, and test to check for overfitting
- Log metrics into mlflow, which is more convenient to analyze the results
- Compute total and non-embedding parameter counts
- Implement a one-cycle learning rate scheduler (a sketch of these last two items follows below)
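For the last two items, here is a minimal sketch of what they amount to in plain PyTorch (a toy model and made-up hyperparameters, not the fork's exact code):

```python
import torch
import torch.nn as nn

# Toy stand-in for the recommender model.
model = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 64))

# Total vs. non-embedding parameter counts, as reported for the scaling study.
total = sum(p.numel() for p in model.parameters())
embedding = sum(
    p.numel()
    for m in model.modules() if isinstance(m, nn.Embedding)
    for p in m.parameters()
)
print(f"total={total:,}  non-embedding={total - embedding:,}")

# One-cycle learning-rate schedule via PyTorch's built-in scheduler.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, total_steps=10_000
)
```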
For comments, questions, or suggestions, please contact me. You can also reach me on Twitter at @ideasrapidas.
This project uses several dependencies, as listed in the requirements files. The developers of this project are not responsible for any problems or vulnerabilities that may arise from using this code.
This project is licensed under the terms of the MIT License. See LICENSE.
If you use the code, data or conclusions of the paper, please cite us:
@inproceedings{10.1145/3626772.3657816,
author = {Zivic, Pablo and Vazquez, Hernan and S\'{a}nchez, Jorge},
title = {Scaling Sequential Recommendation Models with Transformers},
year = {2024},
isbn = {9798400704314},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3626772.3657816},
doi = {10.1145/3626772.3657816},
booktitle = {Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {1567–1577},
numpages = {11},
keywords = {scaling laws, sequential recommendation, transfer learning, transformers},
location = {Washington DC, USA},
series = {SIGIR '24}
}