Performance on MMLU #13
Have you evaluated the performance on MMLU compared to the original Flan-T5?

Comments
Hi, we have noticed slightly lower MMLU scores for declare-lab/flan-alpaca-xl compared to google/flan-t5-xl. This may be due to the zero-shot format of the Alpaca data compared to the few-shot format of MMLU. We are benchmarking multiple models here.

Thanks a lot for the effort!

Are they evaluated using CoT prompting?

Hi, the evaluation uses direct prompting for MMLU.

Btw, it seems you are doing few-shot prompting, am I right?

Yes, we used 5-shot prompting for MMLU, following the Flan-T5 paper.
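For reference, here is a minimal sketch of what 5-shot direct (non-CoT) MMLU prompting could look like with these checkpoints. The prompt template and helper names are assumptions for illustration, not the exact script used for the reported numbers:

```python
# Sketch of 5-shot direct (non-CoT) MMLU prompting, assuming the common
# "question + lettered options + Answer:" template; the exact format used
# for the reported scores may differ.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "declare-lab/flan-alpaca-xl"  # or "google/flan-t5-xl" for comparison
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

CHOICES = ["A", "B", "C", "D"]

def format_example(question, options, answer=None):
    # Render one MMLU item; include the answer letter only for few-shot demos.
    lines = [question]
    lines += [f"{letter}. {option}" for letter, option in zip(CHOICES, options)]
    lines.append("Answer:" + (f" {answer}" if answer is not None else ""))
    return "\n".join(lines)

def build_prompt(dev_examples, test_question, test_options, subject):
    # 5-shot direct prompt: five solved dev examples, then the unsolved test item.
    header = f"The following are multiple choice questions (with answers) about {subject}.\n\n"
    shots = "\n\n".join(format_example(q, o, a) for q, o, a in dev_examples[:5])
    return header + shots + "\n\n" + format_example(test_question, test_options)

def predict_letter(prompt):
    # Direct prompting: generate a few tokens and read off the predicted letter.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()
```

Accuracy would then be the fraction of test items whose generated letter matches the gold answer.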