-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rename PawsX to PawsXPairClassification #2
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Out of curiosity, how did the names not end up the same as in MTEB? EDIT: read the other issue, I understand now :)
Nice! I think we also have to update https://github.com/embeddings-benchmark/leaderboard/blob/c83ff74f9f40a9cc85bda773cdfb639b0f4d82aa/config.yaml#L233 (but in a backward compatible way such that the old name is still okay too, e.g. like here for MLSumClustering where the name also changed: https://github.com/embeddings-benchmark/leaderboard/blob/c83ff74f9f40a9cc85bda773cdfb639b0f4d82aa/refresh.py#L213) |
Like so: embeddings-benchmark/leaderboard@9c65fce ? (sorry I realised this should have been a PR and not a push to main - maybe we should enable no pushes to main just so that I don't make that mistake again...) |
Looks good; I think we also need to add it here: https://github.com/embeddings-benchmark/leaderboard/blob/9c65fce82c7f4d3f80c168bbd145df39a80c80ac/refresh.py#L322 |
Oh and I think it should be the other way round i.e.
|
Added a fix here: embeddings-benchmark/leaderboard#5 |
I think we may need to update this on the leaderboard as well? The CI failed due to EDIT: were there other tasks as well we need to update for the leaderboard or just this one @KennethEnevoldsen? |
Just this one |
Hmm maybe we should add a test such that embeddings-benchmark/leaderboard#5 would have failed? |
If we want we could even just add the same task that runs overnight, the refresh. It takes about 10 minutes though, is that too long? I do agree we should add some test. |
10 minutes is ok, though it would be nice if it was faster (since it makes debugging harder). Anything in the process that can be skipped (or potentially we can downsample in some way for the test) |
Since some file system are case-insensitive I here fix the naming of the files here.
Required for embeddings-benchmark/mteb#1038