AMchat

📝 Contents

📖 Introduction

AM (Advanced Mathematics) Chat is a large language model that integrates mathematical knowledge, advanced mathematics problems, and their solutions. It is trained on a dataset that combines Math problems and advanced mathematics exercises with their worked solutions, and it is fine-tuned from the InternLM2-Math-7B model with xtuner, specifically for solving advanced mathematics problems.

If you find this project helpful, feel free to ⭐ Star it and help more people discover it!

(Figure: technical roadmap)

🚀 News

[2024.08.09] Released the Q8_0 quantized model AMchat-q8_0.gguf.

[2024.06.23] Fine-tuned the InternLM2-Math-Plus-20B model.

[2024.06.22] Fine-tuned the InternLM2-Math-Plus-1.8B model and open-sourced a small-scale dataset.

[2024.06.21] Updated the README and fine-tuned the InternLM2-Math-Plus-7B model.

[2024.03.24] Won the Innovation and Creativity Award in the 2024 InternLM Challenge (Spring Split).

[2024.03.14] The model has been uploaded to HuggingFace.

[2024.03.08] Added a table of contents and a technical roadmap to the README, and created README_en-US.md.

[2024.02.06] Docker deployment is now supported.

[2024.02.01] The first version of AMchat is deployed online at https://openxlab.org.cn/apps/detail/youngdon/AMchat 🚀

🛠️ Usage

Quick Start

  1. Download the Model
From ModelScope

Refer to Downloading Models.

pip install modelscope
from modelscope.hub.snapshot_download import snapshot_download
model_dir = snapshot_download('yondong/AMchat', cache_dir='./')
From OpenXLab

Refer to Downloading Models.

pip install openxlab
from openxlab.model import download
download(model_repo='youngdon/AMchat', 
        model_name='AMchat', output='./')
  2. Local Deployment
git clone https://github.com/AXYZdong/AMchat.git 
python start.py
  3. Docker Deployment
docker run -t -i --rm --gpus all -p 8501:8501 guidonsdocker/amchat:latest bash start.sh
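
Once the model weights are downloaded, you can also query AMchat directly from Python. The snippet below is a minimal sketch, not project-verified code: it assumes the checkpoint was downloaded to ./AMchat and that the model keeps the InternLM2-style chat() helper exposed through trust_remote_code.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# './AMchat' is a placeholder for wherever the weights were downloaded in step 1.
tokenizer = AutoTokenizer.from_pretrained('./AMchat', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    './AMchat', torch_dtype=torch.float16, trust_remote_code=True
).cuda().eval()

# Assumption: the remote code keeps the chat() helper of the InternLM2 base model.
response, history = model.chat(tokenizer, 'Compute the indefinite integral of x * e^x.', history=[])
print(response)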

Retraining

Environment Setup

  1. Clone this project
git clone https://github.com/AXYZdong/AMchat.git 
cd AMchat
  2. Create a virtual environment
conda env create -f environment.yml
conda activate AMchat
pip install xtuner

XTuner Fine-tuning

  1. Prepare configuration files
# List all built-in configurations
xtuner list-cfg

mkdir -p /root/math/data
mkdir /root/math/config && cd /root/math/config

xtuner copy-cfg internlm2_chat_7b_qlora_oasst1_e3 .
  2. Model Download
mkdir -p /root/math/model

Create download.py with the following content:

import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
import os
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-math-7b', cache_dir='/root/math/model')
  3. Modify configuration files

A fine-tuned configuration file is already provided under the config directory; see internlm_chat_7b_qlora_oasst1_e3_copy.py. It can be used directly, but make sure to adjust the paths for pretrained_model_name_or_path and data_path accordingly.

cd /root/math/config
vim internlm2_chat_7b_qlora_oasst1_e3_copy.py
# Change the model to local path
- pretrained_model_name_or_path = 'internlm/internlm-chat-7b'
+ pretrained_model_name_or_path = './internlm2-math-7b'

# Change the training dataset to local path
- data_path = 'timdettmers/openassistant-guanaco'
+ data_path = './data'
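
The layout expected under ./data depends on the dataset map function kept in the copied config. Purely as an illustration (an assumption: the default oasst1-style mapping is left unchanged; the project's actual dataset format may differ), each JSON line would carry the whole dialogue in a single text field, for example written out like this:

# Hypothetical helper that writes a tiny oasst1-style ./data set.
# The format is an assumption, not the project's released dataset.
import json
import os

os.makedirs('./data', exist_ok=True)
samples = [
    {'text': '### Human: Compute the integral of x * e^x dx. '
             '### Assistant: Integration by parts gives (x - 1) * e^x + C.'},
]
with open('./data/train.jsonl', 'w', encoding='utf-8') as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + '\n')
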
  4. Start fine-tuning
xtuner train /root/math/config/internlm2_chat_7b_qlora_oasst1_e3_copy.py
  5. Convert PTH model to HuggingFace model
mkdir hf
export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert pth_to_hf ./internlm2_chat_7b_qlora_oasst1_e3_copy.py \
                         ./work_dirs/internlm2_chat_7b_qlora_oasst1_e3_copy/epoch_3.pth \
                         ./hf
  6. Merge the HuggingFace adapter into the base model
# Original model parameter location
export NAME_OR_PATH_TO_LLM=/root/math/model/Shanghai_AI_Laboratory/internlm2-math-7b

# Hugging Face format parameter location
export NAME_OR_PATH_TO_ADAPTER=/root/math/config/hf

# Final merged parameter location
mkdir /root/math/config/work_dirs/hf_merge
export SAVE_PATH=/root/math/config/work_dirs/hf_merge

# Execute parameter merge
xtuner convert merge \
    $NAME_OR_PATH_TO_LLM \
    $NAME_OR_PATH_TO_ADAPTER \
    $SAVE_PATH \
    --max-shard-size 2GB
  7. Demo
streamlit run web_demo.py --server.address=0.0.0.0 --server.port 7860

OpenXLab Deployment

To deploy AMchat on OpenXLab, fork this repository and create a new project on OpenXLab. Associate the forked repository with the new project, and AMchat can then be deployed from there.
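
For reference, the sketch below shows what such an application entry point might look like: pull the released weights from OpenXLab, then launch the Streamlit demo. The file name and the exact behavior of the project's own start script are assumptions, not verified code.

# Hypothetical OpenXLab app entry: download the weights, then start the demo.
import os
from openxlab.model import download

download(model_repo='youngdon/AMchat', model_name='AMchat', output='./AMchat')
os.system('streamlit run web_demo.py --server.address=0.0.0.0 --server.port 7860')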

Demo

  • AMchat and InternLM2-Math-7B answer the same integral problem. AMchat answers correctly, while InternLM2-Math-7B answers incorrectly.


LMDeploy Quantization

  • First, install LMDeploy
pip install -U lmdeploy
  • Then, convert the model to TurboMind format

--dst-path: You can specify the storage location for the converted model.

lmdeploy convert internlm2-chat-7b <path-to-model> --dst-path <path-to-converted-model>
  • LMDeploy Chat
lmdeploy chat turbomind <path-to-converted-model>
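  • Recent LMDeploy releases also expose a Python pipeline API. The snippet below is a minimal sketch, assuming a current lmdeploy version and using ./workspace as a placeholder for the converted model path:

# Hedged sketch: chat with the converted TurboMind model from Python.
from lmdeploy import pipeline

pipe = pipeline('./workspace')  # placeholder: path to the converted model
responses = pipe(['Compute the indefinite integral of x * e^x.'])
print(responses[0].text)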

OpenCompass Evaluation

  • Install OpenCompass
git clone https://github.com/open-compass/opencompass 
cd opencompass
pip install -e .
  • Download and unzip the dataset
cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/opencompass/
unzip OpenCompassData-core-20231110.zip
  • Start evaluation!
python run.py \
    --datasets math_gen \
    --hf-path <path-to-model> \
    --tokenizer-path <path-to-tokenizer> \
    --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True \
    --model-kwargs device_map='auto' trust_remote_code=True \
    --max-seq-len 2048 \
    --max-out-len 16 \
    --batch-size 2  \
    --num-gpus 1 \
    --debug

LMDeploy & OpenCompass Quantization and Evaluation

W4 Quantization Evaluation
  • W4 Quantization
lmdeploy lite auto_awq <path-to-model> --work-dir <path-to-quantized-model>
  • Convert to TurboMind
lmdeploy convert internlm2-chat-7b <path-to-quantized-model> --model-format awq --group-size 128 --dst-path <path-to-converted-model>
  • Write the evaluation config
from mmengine.config import read_base
from opencompass.models.turbomind import TurboMindModel

with read_base():
    # choose a list of datasets
    from .datasets.ceval.ceval_gen import ceval_datasets
    # and output the results in a chosen format
    # from .summarizers.medium import summarizer

datasets = [*ceval_datasets]

internlm2_chat_7b = dict(
    type=TurboMindModel,
    abbr='internlm2-chat-7b-turbomind',
    path='<path-to-converted-model>',
    engine_config=dict(session_len=512,
                       max_batch_size=2,
                       rope_scaling_factor=1.0),
    gen_config=dict(top_k=1,
                    top_p=0.8,
                    temperature=1.0,
                    max_new_tokens=100),
    max_out_len=100,
    max_seq_len=512,
    batch_size=2,
    concurrency=1,
    # meta_template=internlm_meta_template,
    run_cfg=dict(num_gpus=1, num_procs=1),
)
models = [internlm2_chat_7b]
  • Start evaluation!
python run.py configs/eval_turbomind.py -w <result-save-path>
KV Cache Quantization Evaluation
  • Convert to TurboMind
lmdeploy convert internlm2-chat-7b <path-to-model> --dst-path <path-to-converted-model>
  • Calculate and obtain quantization parameters
# Calculate
lmdeploy lite calibrate <path-to-model> --calib-dataset 'ptb' --calib-samples 128 --calib-seqlen 2048 --work-dir <param-save-path>
# Get quantization parameters
lmdeploy lite kv_qparams <param-save-path> <path-to-converted-model>/triton_models/weights/ --num-tp 1
  • Set quant_policy to 4 in the converted model's config and update the model path in the evaluation config above (see the example after this list).
  • Start evaluation!
python run.py configs/eval_turbomind.py -w <result-save-path>
  • The result files and evaluation outputs can be found in the results directory under the specified save path.
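
For the quant_policy change referenced above, older LMDeploy workspace layouts keep the switch in the converted model's TurboMind config file; the exact path below is an assumption and may differ across versions:

# Assumption: quant_policy lives in triton_models/weights/config.ini of the
# converted workspace; set it from 0 to 4 to enable the int8 KV cache.
sed -i 's/quant_policy = 0/quant_policy = 4/' <path-to-converted-model>/triton_models/weights/config.ini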

💕 Acknowledgements

InternLM-tutorial

InternStudio

xtuner

InternLM-Math

🖊️ Citation

@misc{2024AMchat,
    title={AMchat: A large language model integrating advanced math concepts, exercises, and solutions},
    author={AMchat Contributors},
    howpublished = {\url{https://github.com/AXYZdong/AMchat}},
    year={2024}
}

License

This project is released under the Apache License 2.0. Please also adhere to the Licenses of models and datasets being used.