Sharding - Concept #1758

Open
juliangruendner opened this issue May 28, 2024 · 0 comments

Comments

@juliangruendner
Contributor

Sharding concept information

Sharding has the potential for large performance improvements, because the overhead of creating the federated query across the shards and merging the results is significantly lower than the cost of executing the query on a single server.
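To illustrate the scatter-gather idea, here is a minimal sketch that sends the same count query to every shard in parallel and sums the partial results. It is not Blaze code: `federated_count` and `make_fake_shard` are hypothetical names, and the shards are simulated with in-memory lists. It also assumes that each patient lives on exactly one shard, so the partial counts can simply be added.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_count(shards, query):
    """Send the same query to every shard in parallel and sum the counts.

    Assumption: patients are disjoint across shards, so summing is correct.
    """
    # At least one worker is required, even if the shard list is empty.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial_counts = list(pool.map(lambda shard: shard(query), shards))
    return sum(partial_counts)


def make_fake_shard(patients):
    """Hypothetical stand-in for a shard: counts the patients matching a predicate."""
    def run(query):
        return sum(1 for patient in patients if query(patient))
    return run


shards = [
    make_fake_shard([{"age": 30}, {"age": 70}]),
    make_fake_shard([{"age": 65}]),
]
print(federated_count(shards, lambda patient: patient["age"] >= 65))  # prints 2
```

The merge step here is a single `sum`, which is cheap compared to the per-shard query execution. Queries that return resources instead of counts need a more involved merge.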

Given the potential size of the data, we would start with a shard size of 250,000 patients, each shard with 8 cores and 64 GB of RAM (see also: Tuning Guide).
=> For 2 million patients this would result in 8 shards with a total cost of 64 cores and 512 GB of RAM.
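The sizing arithmetic can be written down as a small helper. This is only a sketch: `shard_plan` is a hypothetical function, and its defaults are the per-shard figures proposed above.

```python
import math


def shard_plan(total_patients, patients_per_shard=250_000,
               cores_per_shard=8, ram_gb_per_shard=64):
    """Return the number of shards and the total resources they need."""
    # Round up so that a partially filled shard is still counted.
    shards = math.ceil(total_patients / patients_per_shard)
    return {
        "shards": shards,
        "cores": shards * cores_per_shard,
        "ram_gb": shards * ram_gb_per_shard,
    }


print(shard_plan(2_000_000))  # {'shards': 8, 'cores': 64, 'ram_gb': 512}
```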

In a first step we will investigate sharding based on tooling around the standard Blaze server, see:
medizininformatik-initiative/fdpg-plus#13

In a next step, sharding should be implemented in Blaze directly, allowing users of Blaze to use a sharded installation in the same way as a non-sharded installation.

Sharding should be based on patient compartments. Resources that are used across multiple patient compartments should be duplicated to each shard.
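One possible way to route resources is sketched below. All names are hypothetical and the details are assumptions, not a decided design: a patient is mapped to a shard by hashing the patient ID, a resource is stored on every shard that holds one of its patient compartments, and a resource that belongs to no compartment is copied to all shards.

```python
import hashlib


def shard_for_patient(patient_id, num_shards):
    """Map a patient ID to a shard index with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def target_shards(compartment_patient_ids, num_shards):
    """Return the sorted shard indexes that should store a resource.

    compartment_patient_ids: IDs of the patients whose compartments
    contain the resource.
    """
    if not compartment_patient_ids:
        # Assumption: resources outside any compartment go to every shard.
        return list(range(num_shards))
    return sorted({shard_for_patient(pid, num_shards)
                   for pid in compartment_patient_ids})


# A resource in one compartment lands on exactly one shard.
print(target_shards(["patient-1"], 8))
# A resource shared by two compartments lands on one or two shards.
print(target_shards(["patient-1", "patient-2"], 8))
```

A plain modulo hash keeps the sketch short, but it forces most patients to move when the number of shards changes. A real implementation would have to address that separately.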

As a first step, the concepts surrounding this type of sharding should be developed, and the implications for the FHIR API (for example, parallel paging across shards) should be investigated.
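To make the paging question more concrete, here is a minimal sketch of merging paged results from several shards into pages of a fixed size. It assumes that every shard returns its entries sorted by the same key, and it pulls shard pages lazily in a single thread instead of truly in parallel. `shard_entries` and `merged_pages` are hypothetical names, and the pages are plain lists of dicts, not FHIR bundles.

```python
import heapq
from itertools import islice


def shard_entries(pages):
    """Flatten one shard's pages into a lazy stream of entries."""
    for page in pages:
        yield from page


def merged_pages(shard_page_iters, page_size, key):
    """Merge sorted entries from all shards and re-chunk them into pages."""
    # heapq.merge only pulls the next entry of a shard when it is needed.
    merged = heapq.merge(*(shard_entries(pages) for pages in shard_page_iters),
                         key=key)
    while True:
        page = list(islice(merged, page_size))
        if not page:
            return
        yield page


shard_a = [[{"id": "a1"}, {"id": "a3"}], [{"id": "a5"}]]
shard_b = [[{"id": "a2"}], [{"id": "a4"}]]
for page in merged_pages([shard_a, shard_b], 2, key=lambda entry: entry["id"]):
    print([entry["id"] for entry in page])
# prints ['a1', 'a2'], then ['a3', 'a4'], then ['a5']
```

The sketch leaves out what the FHIR API would actually need, such as a stable paging token that records the position within each shard, and duplicated resources that appear on more than one shard.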

@alexanderkiel alexanderkiel added epic A large body of work that can be broken down into a number of smaller issues. performance Performance improvement labels Jul 18, 2024
@alexanderkiel alexanderkiel changed the title from "Performance: Sharding - Concept" to "Sharding - Concept" Jul 18, 2024