suggestion: make the elision compaction trigger ratio configurable or include L6 in tombstone density compaction #4211

Open
ever0de opened this issue Dec 24, 2024 · 0 comments

Pebble Options
	[Version]
	  pebble_version=0.1
	
	[Options]
	  bytes_per_sync=524288
	  cache_size=1073741824
	  cleaner=delete
	  compaction_debt_concurrency=1073741824
	  comparer=leveldb.BytewiseComparator
	  disable_wal=false
	  flush_delay_delete_range=10s
	  flush_delay_range_key=0s
	  flush_split_bytes=2097152
	  format_major_version=16
	  key_schema=DefaultKeySchema(leveldb.BytewiseComparator,16)
	  l0_compaction_concurrency=10
	  l0_compaction_file_threshold=500
	  l0_compaction_threshold=2
	  l0_stop_writes_threshold=1000
	  lbase_max_bytes=67108864
	  max_concurrent_compactions=3
	  max_concurrent_downloads=1
	  max_manifest_file_size=134217728
	  max_open_files=16384
	  mem_table_size=67108864
	  mem_table_stop_writes_threshold=4
	  min_deletion_rate=134217728
	  merger=pebble.concatenate
	  multilevel_compaction_heuristic=wamp(0.00, false)
	  read_compaction_rate=16000
	  read_sampling_multiplier=-1
	  num_deletions_threshold=100
	  deletion_size_ratio_threshold=0.500000
	  tombstone_dense_compaction_threshold=0.050000
	  strict_wal_tail=true
	  table_cache_shards=12
	  validate_on_ingest=false
	  wal_dir=
	  wal_bytes_per_sync=0
	  max_writer_concurrency=0
	  force_writer_parallelism=false
	  secondary_cache_size_bytes=0
	  create_on_shared=0
	
	[Level "0"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=2097152
	
	[Level "1"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=4194304
	
	[Level "2"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=8388608
	
	[Level "3"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=16777216
	
	[Level "4"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=33554432
	
	[Level "5"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=rocksdb.BuiltinBloomFilter
	  filter_type=table
	  index_block_size=262144
	  target_file_size=67108864
	
	[Level "6"]
	  block_restart_interval=16
	  block_size=32768
	  block_size_threshold=90
	  compression=Snappy
	  filter_policy=none
	  filter_type=table
	  index_block_size=262144
	  target_file_size=134217728

Currently, elision compaction is triggered for a file only when its range-deletion bytes estimate reaches 10% of the file size, or when NumDeletions exceeds 10% of NumEntries (see the check quoted below). When range deletions are not used, only the second condition can apply, and it is often not met, so elision compaction is not triggered as expected.

pebble/compaction_picker.go

Lines 1432 to 1433 in 78d5345

return f.Stats.RangeDeletionsBytesEstimate*10 >= f.Size || f.Stats.NumDeletions*10 > f.Stats.NumEntries, true
},

One option could be to make the ratio configurable, allowing more flexibility in triggering elision compaction based on different system requirements.
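
To make the first option concrete, here is a minimal sketch of what a configurable ratio could look like. It is not Pebble's code: `fileStats`, `elisionEligible`, and `thresholdPercent` are hypothetical names used only for illustration, and the 10,000,000-entry file is an assumed figure. With `thresholdPercent = 10` the check is arithmetically the same as the hard-coded `*10` comparison quoted above.

```go
package main

import "fmt"

// fileStats is a hypothetical stand-in that carries only the fields
// used by the quoted condition. It is not Pebble's actual type.
type fileStats struct {
	Size                        uint64
	NumEntries                  uint64
	NumDeletions                uint64
	RangeDeletionsBytesEstimate uint64
}

// elisionEligible applies the quoted condition with a configurable
// percentage instead of the fixed 10%. Integer arithmetic is used so
// that thresholdPercent = 10 matches the original comparison exactly.
func elisionEligible(s fileStats, thresholdPercent uint64) bool {
	rangeDelHeavy := s.RangeDeletionsBytesEstimate*100 >= s.Size*thresholdPercent
	pointDelHeavy := s.NumDeletions*100 > s.NumEntries*thresholdPercent
	return rangeDelHeavy || pointDelHeavy
}

func main() {
	// Assumed example: a 128 MiB file with 10,000,000 entries,
	// 650,000 point tombstones, and no range deletions.
	s := fileStats{Size: 128 << 20, NumEntries: 10_000_000, NumDeletions: 650_000}

	fmt.Println(elisionEligible(s, 10)) // 6.5% of entries, below the 10% bar
	fmt.Println(elisionEligible(s, 5))  // same file, above a 5% bar
}
```

In this assumed case the file is skipped at 10% but picked at 5%, which is the kind of control a configurable ratio would give.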

Alternatively, the tombstone-density compaction logic could be extended to include L6 instead of being limited to L0–L5. If that makes elision compaction unnecessary, it could then be removed to avoid redundant work.

Before applying the patch, I observed that the number of tombstone keys started at approximately 70,000 and grew almost linearly over 7 days, reaching about 650,000. The graph below illustrates this increase.
The likely cause is that the tombstone count never reached 10% of the number of entries, so elision compaction was not triggered.

[image: graph of the tombstone key count over 7 days]

After applying the patch, which extends tombstone-density compaction to include L6, the tombstone count stabilized between 3,000 and 40,000 keys. The intent of this change is to keep latency consistent at the cost of some extra CPU and disk usage from more frequent compactions.

Conclusion

I would like the tombstone count in L6 to be kept bounded at a configurable ratio lower than 10%, but there seems to be no way to achieve this today. Could either of the two suggestions be considered? If so, I will try to submit a related PR.

EDIT:
I also tested a version with a modified elision ratio. As expected, it appears to use fewer resources than extending tombstone-density compaction to L6, using the number of compactions as the metric.

Jira issue: PEBBLE-316
