After setting up the SLAM-LLM environment, install the evaluation dependencies with the following command:
# dependency conflicts may be reported, but the environment works in practice
pip install -r requirements.txt
# or skip dependency resolution entirely
pip install -r requirements.txt --no-deps
Alternatively, you can set up a separate environment; see VoiceBench for more detail. In that case, you need to switch environments between inference and marking.
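If you go the separate-environment route, a minimal sketch (assuming conda is available; the environment name and Python version are illustrative, so adjust them to whatever VoiceBench specifies):
# create and activate a dedicated environment for marking
conda create -n voicebench python=3.10 -y
conda activate voicebench
# run inside the VoiceBench checkout to install its own dependencies
pip install -r requirements.txt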
Use the same environment as SLAM-Omni.
Set up the environment according to LLaMA-Omni.
Currently, we support evaluation on 10 datasets. The models' responses are evaluated in 4 different modes, grouped below:
alpacaeval_test, commoneval_test, wildchat_test
storal_test, summary_test, truthful_test
gaokao_test, gsm8k_test, mlc_test
repeat_test
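Every eval script selects its dataset through ${val_data_name}; set it to one of the names above. A hypothetical example (whether a script reads the variable from the environment or defines it internally may vary, so check the script header):
# pick any dataset name from the list above
val_data_name=commoneval_test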
In non-ASR mode, we directly evaluate the text output of the LLM.
Run the following command:
# choose ${val_data_name}
bash ./scripts/eval/eval.sh
Or run inference and marking separately:
# choose ${val_data_name}
bash ./scripts/eval/inference_for_eval.sh
conda activate voicebench
bash ./scripts/eval/mark_only.sh
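After marking finishes, switch back to your inference environment before the next run; assuming your SLAM-LLM environment is named slam-llm (the name is illustrative):
conda activate slam-llm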
In ASR mode, we use whisper-large-v3 to transcribe the output speech and evaluate the transcription.
Run the following command:
# choose ${val_data_name}
bash ./scripts/eval/eval_with_asr.sh
Or run inference and marking separately:
# choose ${val_data_name}
bash ./scripts/eval/inference_for_eval.sh
conda activate voicebench
bash ./scripts/eval/asr_for_eval.sh
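For reference, the ASR step amounts to running whisper-large-v3 over each generated audio file. A minimal standalone sketch using the openai-whisper CLI (this is not the repo's asr_for_eval.sh, and the file name is illustrative):
# transcribe one generated response with whisper-large-v3
whisper output_000.wav --model large-v3 --language en --output_format txt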
For non-ASR mode, run the following command:
# choose ${val_data_name}
bash ./scripts/eval/mini-omni-eval.sh
For ASR mode, just uncomment the corresponding code in mini-omni-eval.sh.
Attention! You need to switch to your LLaMA-Omni environment.
For non-ASR mode, run the following command:
conda activate llama-omni
# choose ${val_data_name}
bash ./scripts/eval/llama-omni-eval.sh
For ASR mode, just uncomment the corresponding code in llama-omni-eval.sh.