Refactor search sync to try to reduce memory usage and hash ids #662

Merged: 5 commits merged into main from search-sync-fixes on Jan 6, 2025

Contributor

@alfredgrip alfredgrip commented Dec 20, 2024

Title speaks for itself.

The idea is to divide the data into batches and send them one at a time; this way the garbage collector should be able to free each batch once it goes out of scope.

Current batch size is 1000, which I think should be fine.
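
A rough sketch of the batching idea, not the actual code in this PR. The names `pageRanges`, `syncInBatches`, `FetchPage` and `SendBatch` are made up for illustration, and it assumes the data source can be read in `skip`/`take` windows so the full data set never has to sit in memory at once:

```typescript
// Hypothetical callback types: neither comes from the PR.
type FetchPage<T> = (skip: number, take: number) => Promise<T[]>;
type SendBatch<T> = (batch: T[]) => Promise<void>;

interface PageRange {
  skip: number;
  take: number;
}

// Compute the (skip, take) windows needed to cover `total` rows.
function pageRanges(total: number, batchSize: number): PageRange[] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const ranges: PageRange[] = [];
  for (let skip = 0; skip < total; skip += batchSize) {
    ranges.push({ skip, take: Math.min(batchSize, total - skip) });
  }
  return ranges;
}

// Fetch and send one batch at a time. Only the current batch is referenced
// inside the loop, so earlier batches become eligible for garbage collection.
async function syncInBatches<T>(
  total: number,
  fetchPage: FetchPage<T>,
  send: SendBatch<T>,
  batchSize = 1000,
): Promise<void> {
  for (const { skip, take } of pageRanges(total, batchSize)) {
    const batch = await fetchPage(skip, take);
    await send(batch);
  }
}
```

With 2500 rows and the default batch size, this would issue three fetch/send rounds of 1000, 1000 and 500 rows.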

It also hashes ids, which should be a (temporary) fix for #661.
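
For illustration only, hashing an id could look roughly like this. The helper name `hashId` and the choice of SHA-256 are assumptions, not taken from the PR; the point is just that a hex digest contains only `0-9a-f`, which is safe to use as a Meilisearch document id regardless of what characters the raw id contains:

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not from the PR): map an arbitrary raw id to a
// fixed-length hex string. SHA-256 is an assumed choice of hash function.
function hashId(rawId: string): string {
  return createHash("sha256").update(rawId).digest("hex");
}

// Example: an id with characters that would not be valid as-is.
const documentId = hashId("events/2024 autumn:42");
```

The same raw id always maps to the same hashed id, so re-running the sync updates existing documents instead of creating duplicates.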

@alfredgrip alfredgrip changed the title Refactor search sync to try to reduce memory usage Refactor search sync to try to reduce memory usage and hash ids Dec 20, 2024
@alfredgrip alfredgrip force-pushed the search-sync-fixes branch 2 times, most recently from 57d2ecb to 7371b4c Compare December 20, 2024 14:39
@danieladugyan danieladugyan requested review from danieladugyan and removed request for Isak-Kallini December 30, 2024 20:56
@danieladugyan
Member

There's a lot of code in sync.ts and searchTypes.ts; to be honest, it's a bit hard for me to grasp it all 😅

Resolved review threads: src/lib/search/searchHelpers.ts (outdated), src/lib/search/sync.ts (outdated), src/lib/search/searchHelpers.ts
@alfredgrip
Contributor Author

> There's a lot of code in sync.ts and searchTypes.ts; to be honest, it's a bit hard for me to grasp it all 😅

Yeah searchTypes.ts is a lot...
Essentially, for every index in Meili there is a type for:

  1. Which attributes are stored in Meili for that index
  2. Which attributes a user can search on
  3. Which attributes are returned

Of course, all attributes that can be searched on, or are returned, must be stored in Meilisearch. However, it's not as simple as taking the union of 2. and 3. to get 1., since some attributes (like startDatetime for events) are used purely for sorting and ranking purposes internally by Meili.

Then there are objects like const memberMeilisearchConstants: MemberConstantsMeilisearch = {... which wrap everything related to an index in a single object. Here we can specify custom ranking and sorting rules for Meili, such as giving newer members a higher ranking, and tweak how much typo tolerance is allowed.

I know the file is full of types, but they are there to prevent us developers from accidentally, e.g., defining custom ranking rules on an attribute that isn't even stored in Meilisearch.
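
To illustrate the kind of guard those types give, here is a simplified sketch with made-up names (`MemberStored`, `IndexConstants`, `memberConstants` and the `classYear` attribute are hypothetical, not the actual contents of searchTypes.ts):

```typescript
// 1. What is stored in the index (hypothetical attributes).
interface MemberStored {
  id: string;
  firstName: string;
  lastName: string;
  classYear: number; // stored only for ranking, never searched or returned
}

// 2. What a user can search on, and 3. what is returned.
type MemberSearchable = Pick<MemberStored, "firstName" | "lastName">;
type MemberReturned = Pick<MemberStored, "id" | "firstName" | "lastName">;

// Everything related to one index, typed against the stored attributes.
interface IndexConstants<Stored, Searchable extends Partial<Stored>> {
  searchableAttributes: Array<keyof Searchable & string>;
  sortableAttributes: Array<keyof Stored & string>;
  // Custom ranking rules use Meilisearch's "attribute:asc" / "attribute:desc" form.
  customRankingRules: Array<`${keyof Stored & string}:${"asc" | "desc"}`>;
}

const memberConstants: IndexConstants<MemberStored, MemberSearchable> = {
  searchableAttributes: ["firstName", "lastName"],
  sortableAttributes: ["classYear"],
  customRankingRules: ["classYear:desc"], // newer members rank higher
};

// A rule on an attribute that is not stored fails to compile:
// customRankingRules: ["nickname:desc"]  // error: not assignable
```

Because the rule strings are derived from `keyof Stored`, a typo or a non-stored attribute is caught by the compiler rather than at sync time.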

As for sync.ts, it basically just dumps the data and attributes defined in searchTypes.ts to Meilisearch, but does so in batches. Once all the data is dumped, it tweaks the rankings based on the values and types also defined in searchTypes.ts.

@danieladugyan
Member

Thanks! I added that as a comment to the top of the file since it helped a lot.

@danieladugyan danieladugyan merged commit 20a0874 into main Jan 6, 2025
3 checks passed
@danieladugyan danieladugyan deleted the search-sync-fixes branch January 6, 2025 14:07