Due to licensing restrictions, we are unable to release the quantized Meta-Llama-3-8B-Instruct model. lm-eval 0.4.2 is used for all evaluations below.

For evaluating w4g128 without a quantized lm-head:

```bash
lm_eval --model hf --model_args pretrained="./",autogptq=True,gptq_use_triton=True --device cuda:0 --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,rte,arc_easy,arc_challenge,mmlu --batch_size 16
```
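The same run can also be launched from Python through lm-eval's API. The following is a minimal sketch that mirrors the command above; it assumes lm-eval 0.4.2 and auto-gptq are installed and that the quantized checkpoint sits in the current directory (`./`):

```python
# Minimal sketch: programmatic equivalent of the lm_eval command above,
# assuming lm-eval 0.4.2 and auto-gptq are installed and the quantized
# checkpoint is in the current directory ("./").
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./,autogptq=True,gptq_use_triton=True",
    tasks=[
        "lambada_openai", "hellaswag", "piqa", "winogrande",
        "truthfulqa_mc1", "openbookqa", "boolq", "rte",
        "arc_easy", "arc_challenge", "mmlu",
    ],
    batch_size=16,
    device="cuda:0",
)

# Per-task metrics (e.g. acc, acc_norm) are under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```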

For evaluating w4g128 with a quantized lm-head:

```bash
git clone https://github.com/intel/auto-round
cd auto-round/examples/language-modeling
python3 eval_042/evaluation.py --model_name "./" --eval_bs 16
```
| Metric         | BF16   | w4g128 w/o lm-head | w4g128 with lm-head |
| -------------- | ------ | ------------------ | ------------------- |
| Avg.           | 0.6352 | 0.6312             | 0.6303              |
| mmlu           | 0.6386 | 0.6306             | 0.6243              |
| winogrande     | 0.7143 | 0.7238             | 0.7261              |
| truthfulqa_mc1 | 0.3623 | 0.3537             | 0.3574              |
| rte            | 0.6751 | 0.6859             | 0.6715              |
| piqa           | 0.7867 | 0.7797             | 0.7775              |
| openbookqa     | 0.3400 | 0.3300             | 0.3340              |
| lambada_openai | 0.7182 | 0.7200             | 0.7118              |
| hellaswag      | 0.5769 | 0.5699             | 0.5686              |
| boolq          | 0.8297 | 0.8309             | 0.8266              |
| arc_easy       | 0.8152 | 0.8089             | 0.8123              |
| arc_challenge  | 0.5299 | 0.5102             | 0.5111              |
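For a quick sanity check outside lm-eval, the GPTQ-format checkpoint can also be loaded directly with Transformers. This is a minimal sketch, not part of the evaluation recipe above; it assumes transformers, optimum, and auto-gptq are installed and that the quantized weights (with a `quantization_config` in `config.json`) are in the current directory:

```python
# Minimal sketch: load the w4g128 GPTQ-format checkpoint and generate a few tokens.
# Assumes transformers, optimum, and auto-gptq are installed and the quantized
# weights are in "./".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "./"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "There is a girl who likes adventure,"  # example prompt only
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```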