Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Database upgrades #136

Merged
merged 17 commits into from
Dec 8, 2024
Merged

WIP: Database upgrades #136

merged 17 commits into from
Dec 8, 2024

Conversation

ReCore-sys
Copy link
Collaborator

@ReCore-sys ReCore-sys commented Dec 2, 2024

This PR provides many changes to the database system, primarily removing the multiple database options and using only LMDB through the heed crate. We've had great success with this database before and multiple databases has gotten very difficult to manage.

Description

This PR removes the other database options and sets LMDB as the only database backend. We used LMDB before and it worked very well, but when rewriting we tried some other databases and had some initial success but eventually they all had too many drawbacks (redb being unreliable, surrealkv and sled using heaps of memory, rockdb requiring long compilation steps, etc.) so it was decided to only use LMDB since it offered very little drawbacks at the cost of having minimal documentation. So far this has lead to a significant reduction in memory use and no noticeable change in performance, though LMDB does use a memory mapping strategy that causes extremely high virtual memory usage. This doesn't have any real effect since the OS will reclaim it as soon as it needs it and all it really represents is the maximum size of the database, but it make our stats look concerning.

This PR also implements simple caching with the moka crate. Currently nothing is being bottlenecked by IO speeds, but this will alleviate a lot of those issues before they arise. The size of the cache is configurable with in the config file, with higher limits allowing for less disk accesses at the cost of higher memory use. The cache drops entries based on a chunk's size in memory and how long it's been since the entry was accessed. The current default is 20MB with a TTL of 60 seconds, meaning chunks are dropped from the cache if they have been there the longest when the cache exceeds 20MB of memory use, or if a chunk hasn't been accessed for 60 seconds. Increasing the size will allow for more chunks to sit in cache at once, while increasing the TTL will mean chunks can stay in cache longer without being dropped, allowing less frequently accessed chunks to maintain fast access speeds.

Some minor API and documentation changes have been made to the anvil parser as well.

Motivation and Context

The database has been a source of annoyance for a while now, requiring different implementations of features for each database. This should allow for a more unified approach.
Caching was implemented to reduce the performance hit that disk reads was having.

How has this been tested?

No tests for now, but proper database tests will be included in future commits.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (restructuring code, without changing its behavior)

Checklist:

  • My code follows the code style of this project.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • Clippy passes with no warnings.

@ReCore-sys ReCore-sys linked an issue Dec 2, 2024 that may be closed by this pull request
Copy link
Contributor

@0xnim 0xnim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just these changes. And that question

src/lib/utils/config/src/server_config.rs Outdated Show resolved Hide resolved
.etc/example-config.toml Outdated Show resolved Hide resolved
src/lib/adapters/anvil/src/lib.rs Show resolved Hide resolved
@ReCore-sys ReCore-sys merged commit 0fd4360 into master Dec 8, 2024
9 checks passed
@ReCore-sys ReCore-sys deleted the feature/lmdb branch December 8, 2024 04:51
* 1024
* 1024
* 1024;
let rounded_map_size = (map_size as f64 / page_size::get() as f64).round() as usize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    let page_size = page_size::get();
    let rounded_map_size = ((map_size as f64 / page_size as f64).round() as usize * page_size) as usize;

@0xnim
Copy link
Contributor

0xnim commented Dec 8, 2024

Yeah its good

@vortexreanimated
Copy link

can i just fucking ask who we is??

We've had great success with this database before and multiple databases has gotten very difficult to manage.

@ReCore-sys
Copy link
Collaborator Author

can i just fucking ask who we is??

We've had great success with this database before and multiple databases has gotten very difficult to manage.

First off, chill.
Secondly, we have tried a few databases in the past and LMDB is the only one that hasn't given us more headaches than it's worth. It's fast, uses minimal memory and doesn't have any weird caveats so it's a pretty good choice.

@Sweattypalms
Copy link
Member

@vortexreanimated lmaooo "we" as in the people who worked for ferrumc, duh?

LMDB worked pretty good last time: it has low memory usage, and has good access/write speeds.
Although if we see better options, we would most likely change our db then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using LMDB
4 participants