Pre-registration request #3

Open
VladimirMic opened this issue Apr 1, 2023 · 3 comments
Labels
Pre-registration request: Expression of interest to participate in the challenge

Comments

VladimirMic commented Apr 1, 2023

team

DISA-CRANBERRY

corresponding

[email protected]

tasks

Task A

subsets and projections

100M, 30M, 10M

comments

  • We focus mainly on indexing the 100M 768-dimensional vectors.

  • The algorithm can search an arbitrary random subset of the 100M dataset, e.g. the provided 30M and 10M datasets. The average recall should always be around 90 %, and it is above 90 % on the three provided sets (100M, 30M, 10M).

  • Datasets with 50M objects or fewer are loaded into main memory during the build and searched there. However, I did not test datasets of around 50M objects.

  • We do not use any of the provided PCA projections, even though our Python code downloads the PCA96 datasets. Sorry for that.

  • If needed, Jan Sedmidubsky, who wrote the GHA part in Python, will be available from the 12th of July. Vladimir is online anytime except on the 11th of July.
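
For readers unfamiliar with the metric, the average recall referred to in the comments above could be computed roughly as in the following minimal sketch. It is not taken from the team's repository: the function name `average_recall`, the list-of-lists input layout, and the toy IDs are assumptions made purely for illustration.

```python
def average_recall(result_ids, true_ids):
    """Mean, over all queries, of the fraction of true neighbours that were found.

    result_ids: one list of returned object IDs per query (hypothetical layout).
    true_ids:   one list of ground-truth nearest-neighbour IDs per query.
    """
    if len(result_ids) != len(true_ids):
        raise ValueError("one result list is required per query")
    if not true_ids:
        raise ValueError("at least one query is required")

    total = 0.0
    for found, truth in zip(result_ids, true_ids):
        truth_set = set(truth)
        # Per-query recall: size of the overlap divided by the ground-truth size.
        total += len(truth_set & set(found)) / len(truth_set)
    return total / len(true_ids)


# Toy data: the first query misses one of four true neighbours, the second misses none.
truth = [[1, 2, 3, 4], [5, 6, 7, 8]]
found = [[1, 2, 3, 9], [5, 6, 7, 8]]
recall = average_recall(found, truth)
print(recall)  # 0.875
```

Under this definition, a target of 90 % corresponds to `average_recall(...) >= 0.9` over the whole query set.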

members

Vladimir Mic
Jan Sedmidubsky
Pavel Zezula

github link

https://github.com/xsedmid/sisap23-laion-challenge-CRANBERRY/

VladimirMic added the Pre-registration request label on Apr 1, 2023

VladimirMic commented Jul 7, 2023

GitHub link:
https://github.com/xsedmid/sisap23-laion-challenge-CRANBERRY/

Commit hash:
[28e9941]

The parameters are set to search the 100K subset and are insufficient to reach 90 % recall. They are instead chosen to fit within the 7 GB of RAM available on GitHub.
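
As a rough illustration of why the memory limit matters, the sketch below estimates the size of the raw vectors alone for each subset. The 768 dimensions and the 7 GB limit come from this thread; the 2 bytes per component (16-bit floats) is an assumption, since the storage format is not stated here, and the helper name `raw_vector_bytes` is hypothetical. Index structures would add further overhead that is not modelled.

```python
GB = 10**9                   # decimal gigabyte
RAM_LIMIT_BYTES = 7 * GB     # limit mentioned in this thread


def raw_vector_bytes(num_vectors, dim=768, bytes_per_component=2):
    """Size of the uncompressed vectors only.

    bytes_per_component=2 assumes 16-bit floats (an assumption, not stated in the thread).
    """
    return num_vectors * dim * bytes_per_component


subsets = [("100K", 100_000), ("10M", 10_000_000),
           ("30M", 30_000_000), ("100M", 100_000_000)]

for name, count in subsets:
    size = raw_vector_bytes(count)
    fits = size <= RAM_LIMIT_BYTES
    print(f"{name}: {size / GB:.2f} GB raw, fits in 7 GB: {fits}")
```

Under these assumptions only the 100K subset fits comfortably; the full 100M set would need about 154 GB for the raw vectors.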

VladimirMic commented

For the organizers: please consider this submission final once the final deadline has passed, i.e. on July 16 AoE.

maumueller commented

Dear team DISA-CRANBERRY (@VladimirMic),

Thank you very much for your submission. We are now in the process of evaluating your solution.

Please be reminded of the short paper deadline that is coming up on July 31st (AoE). See https://sisap-challenges.github.io/#reports for a short summary of the goals of this paper and please use the general submission guidelines at https://sisap.org/2023/guidelines.html (short research paper) to prepare your submission.

Please send the PDF of your submission via mail to the organizers:

Thanks again for your submission, and please reach out if you have any questions.
