Different evaluation results between different VLMEvalKit versions #523

Open
terry-for-github opened this issue Oct 16, 2024 · 1 comment

@terry-for-github

The official LLaVA-v1.5-7B model scores 1362 points on MME Perception under the newest code (commit 69b7b5e), but it scores 1497 under commit 027e38c. I'm sure the LLaVA code was identical between these two experiments, so why did this happen?
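
A minimal reproduction sketch (not from the original report): it pins the repo to each commit and reruns the MME evaluation, so any score difference can only come from VLMEvalKit itself. The `run.py` entry point, its `--data`/`--model` flags, and the `llava_v1.5_7b` model name are assumptions about VLMEvalKit's usual CLI; adjust them to your checkout.

```python
import subprocess

# The two VLMEvalKit commits compared in this report.
COMMITS = ["69b7b5e", "027e38c"]

for commit in COMMITS:
    # Pin the repo to one commit so only the evaluation code differs
    # between runs; the model weights and data stay identical.
    subprocess.run(["git", "checkout", commit], check=True)
    # Run the MME benchmark (assumed CLI: run.py with --data/--model).
    subprocess.run(
        ["python", "run.py", "--data", "MME", "--model", "llava_v1.5_7b"],
        check=True,
    )
```

Comparing the Perception totals written by each run should show whether the gap reproduces deterministically.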

@terry-for-github
Author

I've checked that once again. The problem still exists.
