ensure repairkit compatibility with large-doc nodes #77

Merged 4 commits on Sep 19, 2023

Conversation

alexdunnjpl
Contributor

🗒️ Summary

Repairkit hammers OpenSearch with large queries (10k docs per page) and with document updates. For nodes with large documents, this causes errors on both the query side (450MB responses are not viable and cause overflows) and the update side (triggering indexing on 4.5GB of docs every couple of minutes also appears to be non-viable).

This PR implements repairkit-specific constraints that reduce the query page size and limit update throughput.
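
As a rough illustration of the two constraints (not the code in this PR), the sketch below derives a page size from a response-size budget and spaces out update batches. All names here (`page_size_for_budget`, `throttled_batches`) and the 45MB budget are hypothetical; the ~45KB average doc size is simply 450MB divided by 10k docs from the summary above.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def page_size_for_budget(
    avg_doc_bytes: int, max_response_bytes: int, max_page_size: int = 10_000
) -> int:
    """Largest page size whose expected response fits the byte budget (hypothetical helper)."""
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    # Never exceed the original 10k page size, never drop below one doc per page.
    return max(1, min(max_page_size, max_response_bytes // avg_doc_bytes))


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def throttled_batches(
    items: Iterable[T],
    batch_size: int,
    min_interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[T]]:
    """Yield batches no more often than once per min_interval_s (hypothetical helper)."""
    last_emit = None
    for batch in batched(items, batch_size):
        if last_emit is not None:
            wait = min_interval_s - (clock() - last_emit)
            if wait > 0:
                sleep(wait)
        last_emit = clock()
        yield batch


# Assumed numbers: ~45KB per doc (450MB / 10k docs) and a 45MB response budget.
page_size = page_size_for_budget(avg_doc_bytes=45_000, max_response_bytes=45_000_000)
```

Each batch yielded by `throttled_batches` would be handed to whatever bulk-update call the sweeper uses; the injected `sleep` and `clock` only exist to make the pacing testable.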

Logging is also improved significantly.

Of note, at @jordanpadams' request, repairkit will now log an error whenever it actually has to repair anything, as this indicates that node users are not running an up-to-date version of harvest.
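
A minimal sketch of that logging behaviour, assuming a hypothetical `report_repairs` helper and logger name (the actual messages and structure in repairkit may differ):

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
log = logging.getLogger("repairkit_sketch")


def report_repairs(repaired_count: int) -> None:
    """Log at ERROR level if any documents needed repair, INFO otherwise."""
    if repaired_count > 0:
        # Needing repairs suggests the docs were written by an outdated harvest.
        log.error(
            "repairkit had to repair %d document(s); harvest may be out of date",
            repaired_count,
        )
    else:
        log.info("repairkit found nothing to repair")
```

Logging at ERROR level rather than WARNING makes the condition visible in whatever alerting already watches the sweeper logs, which fits the intent of flagging outdated harvest deployments.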

⚙️ Test Data and/or Report

Unit tests pass. Tested live against ATM-PROD and GEO-PROD.

♻️ Related Issues

fixes #61

@alexdunnjpl merged commit 60a4f66 into main on Sep 19, 2023
1 check passed
@alexdunnjpl deleted the big-docs-fix branch on September 19, 2023 22:47
Development

Successfully merging this pull request may close these issues.

Deploy repairkit sweeper to delta and prod