Bloom-compactor Sharding (#11154)
**What this PR does / why we need it**:
This PR adds tenant and fingerprint (FP) sharding to bloom compactors.
Note that the bloom-compactor doesn't yet perform any compaction, but
iterates through all tables, tenants, and series checking if the
compactor owns the tenant and the series (by the series FP). Actual
compaction will be implemented with
#11115.

A new structure `Job` is added which will carry around all the context
for a compaction job such as the tenant ID, the table name, and the
series FP. The sharding strategy has two methods:
- `OwnsTenant(tenant string)`: Checks if the compactor shard owns the
tenant.
- `OwnsJob(job Job)`: Checks (again) if the compactor owns the job's
tenant. Then, it checks if the compactor owns the job's fingerprint by
looking inside the tenant subring.
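
As a rough illustration of the two-step ownership check described above, here is a minimal Go sketch. The `Job` field names, the boolean return types, and the `staticStrategy` type are assumptions made for this example; in particular, the static tenant-to-compactor map is a hypothetical stand-in for the ring and tenant subring that the PR actually uses.

```go
package main

import "fmt"

// Job carries the context of one compaction job.
// The field names are assumptions; the PR only says it holds the
// tenant ID, the table name, and the series fingerprint.
type Job struct {
	TenantID    string
	TableName   string
	Fingerprint uint64
}

// ShardingStrategy mirrors the two methods named in the description.
// The bool return types are an assumption.
type ShardingStrategy interface {
	OwnsTenant(tenant string) bool
	OwnsJob(job Job) bool
}

// staticStrategy is a hypothetical stand-in for ring-based sharding:
// each tenant maps to a fixed list of compactor IDs (its "subring").
type staticStrategy struct {
	self     string
	subrings map[string][]string
}

// OwnsTenant reports whether this compactor is part of the tenant's subring.
func (s staticStrategy) OwnsTenant(tenant string) bool {
	for _, id := range s.subrings[tenant] {
		if id == s.self {
			return true
		}
	}
	return false
}

// OwnsJob re-checks tenant ownership, then picks exactly one owner for the
// job's fingerprint inside the tenant subring (modulo is a simplification).
func (s staticStrategy) OwnsJob(job Job) bool {
	if !s.OwnsTenant(job.TenantID) {
		return false
	}
	members := s.subrings[job.TenantID]
	owner := members[job.Fingerprint%uint64(len(members))]
	return owner == s.self
}

func main() {
	var s ShardingStrategy = staticStrategy{
		self:     "compactor-1",
		subrings: map[string][]string{"tenant-a": {"compactor-0", "compactor-1"}},
	}
	job := Job{TenantID: "tenant-a", TableName: "example_table", Fingerprint: 7}
	fmt.Println(s.OwnsTenant("tenant-a"), s.OwnsJob(job), s.OwnsTenant("tenant-b"))
}
```

Checking the tenant first lets a compactor skip a whole tenant cheaply before it looks at individual series fingerprints.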

We add a new per-tenant limit: `bloom_compactor_shard_size`. If it's 0,
the tenant can use all compactors (i.e. `OwnsTenant` will always return
`true`), otherwise, only `bloom_compactor_shard_size` out of the total
number of compactors will own the tenant. A given job's FP will be owned
by exactly one compactor within the tenant shard.
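
The effect of the shard size can be sketched as follows. This is a toy model, not the ring-based shuffle sharding the PR relies on: `tenantShard` and `jobOwner` are hypothetical helpers, compactors are represented by integer indexes, and the hash-and-window selection is an assumption chosen only to make the semantics concrete.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// tenantShard returns the indexes of the compactors that own a tenant.
// A shardSize of 0 (or one that is not smaller than total) selects every
// compactor, matching "the tenant can use all compactors".
// total must be greater than 0.
func tenantShard(tenant string, shardSize, total int) []int {
	if shardSize <= 0 || shardSize >= total {
		shardSize = total
	}
	h := fnv.New32a()
	h.Write([]byte(tenant))
	start := int(h.Sum32() % uint32(total))

	shard := make([]int, 0, shardSize)
	for i := 0; i < shardSize; i++ {
		shard = append(shard, (start+i)%total)
	}
	return shard
}

// jobOwner picks exactly one compactor from the tenant shard for a
// series fingerprint.
func jobOwner(shard []int, fp uint64) int {
	return shard[fp%uint64(len(shard))]
}

func main() {
	fmt.Println(len(tenantShard("tenant-a", 0, 5))) // all 5 compactors
	shard := tenantShard("tenant-a", 2, 5)
	fmt.Println(len(shard)) // only 2 compactors own the tenant
	fmt.Println(jobOwner(shard, 42) == shard[0])
}
```

Because the owner is a pure function of the shard and the fingerprint, every compactor in the shard reaches the same answer without coordination, provided they all see the same shard membership.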

**Special notes for your reviewer**:
- Added a bunch of metrics in `metrics.go`
- Added a test for the sharding strategy
salvacorts authored Nov 10, 2023
1 parent 0bc38e5 commit 4248825
Showing 13 changed files with 1,110 additions and 223 deletions.
41 changes: 41 additions & 0 deletions docs/sources/configure/_index.md
@@ -2527,7 +2527,31 @@ ring:
# CLI flag: -bloom-compactor.enabled
[enabled: <boolean> | default = false]
# Directory where files can be downloaded for compaction.
# CLI flag: -bloom-compactor.working-directory
[working_directory: <string> | default = ""]
# Interval at which to re-run the compaction operation.
# CLI flag: -bloom-compactor.compaction-interval
[compaction_interval: <duration> | default = 10m]
# Minimum backoff time between retries.
# CLI flag: -bloom-compactor.compaction-retries-min-backoff
[compaction_retries_min_backoff: <duration> | default = 10s]
# Maximum backoff time between retries.
# CLI flag: -bloom-compactor.compaction-retries-max-backoff
[compaction_retries_max_backoff: <duration> | default = 1m]
# Number of retries to perform when compaction fails.
# CLI flag: -bloom-compactor.compaction-retries
[compaction_retries: <int> | default = 3]
# Maximum number of tables to compact in parallel. While increasing this value,
# please make sure compactor has enough disk space allocated to be able to store
# and compact as many tables.
# CLI flag: -bloom-compactor.max-compaction-parallelism
[max_compaction_parallelism: <int> | default = 1]
```

### limits_config
@@ -2914,6 +2938,23 @@ shard_streams:
# CLI flag: -bloom-gateway.shard-size
[bloom_gateway_shard_size: <int> | default = 1]

# The shard size defines how many bloom compactors should be used by a tenant
# when computing blooms. If it's set to 0, shuffle sharding is disabled.
# CLI flag: -bloom-compactor.shard-size
[bloom_compactor_shard_size: <int> | default = 1]

# The maximum age of a table before it is compacted. Do not compact tables older
# than the configured time. Defaults to 7 days. 0s means no limit.
# CLI flag: -bloom-compactor.max-table-age
[bloom_compactor_max_table_age: <duration> | default = 168h]

# The minimum age of a table before it is compacted. Do not compact tables newer
# than the configured time. Defaults to 1 hour. 0s means no limit. This is
# useful to avoid compacting tables that will be updated with out-of-order
# writes.
# CLI flag: -bloom-compactor.min-table-age
[bloom_compactor_min_table_age: <duration> | default = 1h]

# Allow user to send structured metadata in push payload.
# CLI flag: -validation.allow-structured-metadata
[allow_structured_metadata: <boolean> | default = false]