Switch to DuckDB #146
base: master
…and daily resolutions
I trust you to do this merge, since you have the freshest understanding of the code. Perhaps loop in HTRC people like @borice? How does DuckDB perform?
This is not yet completely ready for review, but it's close enough that I want to put it in tracking. I'm still generally finding DuckDB to be about 1.5x faster on standard queries on the Rate My Professor bookworm, and much faster on ingest. I just made a major change to the sort code, though, by letting DuckDB handle the word sorting (the stage that used to be index building in MySQL, which often took 6-12 hours). DuckDB has also just added forms of compression on numbers that significantly reduce disk space requirements compared to MySQL. As a rough guess, databases should be one-third the size they were with MySQL.
Once merged, MySQL is done with. With bigrams restored, I think it's pretty close to being ready.