team
DISA-CRANBERRY
corresponding
[email protected]
tasks
Task A
subsets and projections
100M, 30M, 10M
comments
We focus mainly on the indexing of 100M 768-dimensional vectors.
The algorithm can search an arbitrary random subset of the 100M dataset, e.g. the 30M and 10M datasets provided. The average recall should always be around or above 90 % on the three provided subsets (100M, 30M, 10M).
Datasets with at most 50M objects are loaded into main memory during the build and searched there. However, we did not test datasets of around 50M objects.
We do not use any of the provided PCA projections, even though our Python code downloads the PCA96 datasets; sorry for that.
In case of need, Jan Sedmidubsky, who implemented the GitHub Actions (GHA) part in Python, will be available from the 12th of July. Vladimir is online anytime except for the 11th of July.
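For reference, average recall as mentioned above is commonly computed as the mean, over all queries, of the fraction of true k nearest neighbours that the index returns. A minimal sketch (the function name and argument layout are our own illustration, not part of the submission's code):

```python
def average_recall(retrieved, ground_truth, k=10):
    """Mean fraction of the true k nearest neighbours found per query.

    retrieved:    list of per-query result-ID lists from the index
    ground_truth: list of per-query true nearest-neighbour ID lists
    """
    total = 0.0
    for found, truth in zip(retrieved, ground_truth):
        # Overlap between the top-k returned IDs and the true top-k IDs.
        total += len(set(found[:k]) & set(truth[:k])) / k
    return total / len(retrieved)

# Example with two queries at k=2: recalls 1.0 and 0.5, averaging 0.75.
print(average_recall([[1, 2], [3, 5]], [[1, 2], [3, 4]], k=2))  # 0.75
```

A target of "around or above 90 %" then means this value stays at or near 0.9 across the evaluated query set.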
members
Vladimir Mic
Jan Sedmidubsky
Pavel Zezula
github link
https://github.com/xsedmid/sisap23-laion-challenge-CRANBERRY/