LLMCL is a repository built on the Hugging Face Transformers library for evaluating the continual learning ability of large language models. With it, you can easily customize datasets, specify models, and experiment with classical continual learning methods.
- Continual Learning Methods: The repository ships with several classical continual learning methods that you can reference and reuse.
- Model Customization: You can easily specify the model you want to use, and the repository will download it automatically.
1. Clone the repository
```bash
git clone https://github.com/which47/LLMCL.git
cd LLMCL
```
2. Install dependencies
```bash
pip install -r requirements.txt
```
3. Start training
```bash
deepspeed main.py \
  --model_name_or_path 'meta-llama/Llama-2-7b-hf' \
  --output_dir "./outputs/models/seq" \
  --dataset_name "C-STANCE,FOMC,MeetingBank,ScienceQA,NumGLUE-cm,20Minuten,medmcqa,jecqa" \
  --per_device_train_batch_size 16 \
  --adapter lora
```
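Here, `--dataset_name` takes a comma-separated sequence of tasks, which are learned one after another, and `--adapter lora` selects LoRA-based parameter-efficient tuning. If you want to spread the run across several GPUs, the DeepSpeed launcher can be told how many to use; the sketch below is an illustrative variant of the same command, where the GPU count is an assumption rather than a repository default.
```bash
# Illustrative multi-GPU launch: identical arguments, but DeepSpeed is asked to use 4 GPUs.
deepspeed --num_gpus=4 main.py \
  --model_name_or_path 'meta-llama/Llama-2-7b-hf' \
  --output_dir "./outputs/models/seq" \
  --dataset_name "C-STANCE,FOMC,MeetingBank,ScienceQA,NumGLUE-cm,20Minuten,medmcqa,jecqa" \
  --per_device_train_batch_size 16 \
  --adapter lora
```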
To reproduce the experiments, prepare the data and your run script first:
1. Request access to the Llama 2 model and download the TRACE benchmark, MedMCQA, and JEC-QA datasets into the `./data_files` folder (Llama 2 is gated, so see the authentication note after these steps).
2. Customize your training script and run it; a minimal sketch follows.
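Note that Llama 2 is a gated model on the Hugging Face Hub: even after your access request is approved, you still need to authenticate locally so the weights can be downloaded automatically. The sketch below illustrates both steps; the script path, output directory, task order, and batch size are illustrative assumptions, while the flags themselves are the ones from the quick-start command above.
```bash
# Authenticate once so gated models such as meta-llama/Llama-2-7b-hf can be downloaded.
huggingface-cli login   # paste your Hugging Face access token when prompted

# Hypothetical customized run script: same flags as the quick-start command,
# with an illustrative task order, batch size, and output directory.
cat > scripts/my_run.sh << 'EOF'
#!/bin/bash
deepspeed main.py \
  --model_name_or_path 'meta-llama/Llama-2-7b-hf' \
  --output_dir "./outputs/models/my_seq" \
  --dataset_name "medmcqa,jecqa,C-STANCE,FOMC,MeetingBank,ScienceQA,NumGLUE-cm,20Minuten" \
  --per_device_train_batch_size 8 \
  --adapter lora
EOF
bash scripts/my_run.sh
```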
If you find this repository helpful, please consider citing our work.
```bibtex
@misc{ren2024analyzing,
      title={Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning},
      author={Weijieying Ren and Xinlong Li and Lei Wang and Tianxiang Zhao and Wei Qin},
      year={2024},
      eprint={2402.18865},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```