
Prompt versions of non-instruction-tuned LLaMA models #89

Open
ikuyamada opened this issue Sep 4, 2023 · 1 comment

@ikuyamada

ikuyamada commented Sep 4, 2023

It appears that the leaderboard results for non-instruction-tuned LLaMA models (e.g., meta-llama/Llama-2-7b-hf) in the jp-stable branch are measured using prompt version 0.3. However, according to the documentation, this prompt version was designed for instruction-tuned models.

Should we consider using version 0.1 or 0.2 for these non-instruction-tuned models instead?

@mkshing

mkshing commented Oct 11, 2023

@ikuyamada yes, you're correct. The "base" models should be evaluated with prompt version 0.1 or 0.2. We have already noticed this mistake in https://github.com/Stability-AI/lm-evaluation-harness/blob/jp-stable/models/llama2/llama2-7b/harness.sh#L2 and will update it soon.
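
For context, in the jp-stable harness the prompt template version is encoded as the trailing suffix of each task name (e.g., `jcommonsenseqa-1.1-0.3` uses prompt version 0.3). A minimal sketch of the kind of fix implied here, assuming a harness.sh in the style of the scripts under `models/` (the specific task names and few-shot counts below are illustrative, not the actual contents of the linked file):

```bash
# Sketch of harness.sh for a non-instruction-tuned ("base") model.
# The trailing "-0.2" in each task name selects prompt template 0.2
# instead of the instruction-tuned template 0.3.
MODEL_ARGS="pretrained=meta-llama/Llama-2-7b-hf"
TASK="jcommonsenseqa-1.1-0.2,jnli-1.1-0.2,marc_ja-1.1-0.2,jsquad-1.1-0.2"
python main.py \
    --model hf-causal \
    --model_args "$MODEL_ARGS" \
    --tasks "$TASK" \
    --num_fewshot "3,3,3,2" \
    --output_path "results/llama2-7b.json"
```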
