MDEV-35049: Improve adaptive hash index scalability #3562
base: 10.6
Sorry, this needs some more work:
However, this should not prevent any performance testing.
Thanks to @montywi for pointing out my mistake: all updates of ./mtr --mem --parallel=5 --suite=innodb --mysqld=--loose-innodb-adaptive-hash-index=on
buf_block_t *block= buf_block_alloc();
auto part= btr_search_sys.get_part(*index);

part->latch.wr_lock(SRW_LOCK_CALL);
On a closer thought, we may need rd_lock(SRW_LOCK_CALL) here in order to prevent a race condition with mem_heap_free(part->heap) or mem_heap_empty(part->heap) (which I think should be covered by an exclusive latch). I am not sure if there can be such calls, so I must check this, and add a source code comment, with or without the latch acquisition, as appropriate.
We indeed must check btr_search_enabled while holding part->latch in order to avoid reintroducing a race condition with a concurrent btr_search_disable(), which had been fixed in ad2bf11.
MEM_HEAP_BTR_SEARCH: Remove. Let us handle this special type of mem_heap_t allocations in the only compilation unit, btr0sea.cc.

mem_block_info_t::ahi_block: Replaces free_block. This caches one buffer page for use in adaptive hash index allocations. This is protected by btr_search_sys_t::partition::latch. It only is Atomic_relaxed because btr_search_free_space() is following a pattern of test, lock, and test.

btr_search_check_free_space(): Protect the ahi_block with a shared AHI partition latch. We must recheck btr_search_enabled after acquiring the latch in order to avoid a race condition with btr_search_disable(). Using a shared latch instead of an exclusive one should reduce contention with btr_search_guess_on_hash() and other operations when running with innodb_adaptive_hash_index=ON.

This has been tested by running the regression test suite with the adaptive hash index enabled: ./mtr --mysqld=--loose-innodb-adaptive-hash-index=ON
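The "test, lock, and test" pattern mentioned in this commit message can be illustrated outside InnoDB. The following is only a minimal sketch under stated assumptions: Block, Partition, ensure_spare() and take_spare() are hypothetical names, std::shared_mutex stands in for the partition latch, and the compare-and-exchange under the shared latch is my own way of keeping the second test safe when several threads hold the latch in shared mode. It is not the code of this PR.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for a buffer page; not InnoDB's buf_block_t.
struct Block { char frame[64]; };

// Hypothetical stand-in for one adaptive hash index partition.
struct Partition
{
  std::shared_mutex latch;             // stand-in for partition::latch
  std::atomic<Block*> spare{nullptr};  // stand-in for the cached ahi_block

  ~Partition() { delete spare.load(); }

  // Make sure that one spare block is cached.
  // Returns true if this call installed a new block.
  bool ensure_spare()
  {
    // Test: unlatched peek. A stale answer is harmless,
    // because the decision is repeated under the latch.
    if (spare.load(std::memory_order_relaxed))
      return false;
    Block *fresh= new Block;  // allocate before acquiring the latch
    bool installed;
    {
      // Lock: shared mode, so that concurrent readers are not blocked.
      std::shared_lock<std::shared_mutex> guard(latch);
      // Test again: install only if the slot is still empty.
      Block *expected= nullptr;
      installed= spare.compare_exchange_strong(expected, fresh);
    }
    if (!installed)
      delete fresh;  // somebody else installed a block first
    return installed;
  }

  // Consume the spare block; the caller takes ownership.
  Block *take_spare()
  {
    std::unique_lock<std::shared_mutex> guard(latch);
    return spare.exchange(nullptr);
  }
};
```

In this sketch the first load is deliberately relaxed: its only purpose is to skip the allocation and the latch acquisition in the common case where a spare block is already cached.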
This turned out to be an independent bug, which I fixed in cc70ca7.
994a740 reverts another attempt to use lock upgrade again. It would lead to occasional deadlocks when running
for (ulint i = 0; i < btr_ahi_parts && btr_search_enabled; ++i) {
  const auto part= &btr_search_sys.parts[i];
  part->latch.rd_lock(SRW_LOCK_CALL);
  ut_ad(part->heap->type == MEM_HEAP_FOR_BTR_SEARCH);
  fprintf(file, "Hash table size " ULINTPF
          ", node heap has " ULINTPF " buffer(s)\n",
          part->table.n_cells,
-         part->heap->base.count - !part->heap->free_block);
+         part->heap->base.count - !part->heap->ahi_block);
  part->latch.rd_unlock();
If btr_search_enabled had been cleared between our check and the acquisition of part->latch, we’d be dereferencing part->heap=nullptr here.
A possible fix could be to ensure that part->heap will always remain allocated. In that way, we would not have to check for btr_search_enabled here at all, and btr_search_check_free_space_in_heap() could safely use atomic memory access instead of acquiring part->latch.
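The hazard described in this comment, and the alternative of rechecking the flag while holding the latch, can be sketched as follows. This is an illustration only: Heap, Partition, search_enabled, enable(), disable() and report_buffers() are hypothetical names, std::shared_mutex stands in for part->latch, and a single partition is used for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the partition heap.
struct Heap { std::size_t buffers= 1; };

// Hypothetical stand-in for one adaptive hash index partition.
struct Partition
{
  std::shared_mutex latch;     // stand-in for part->latch
  std::unique_ptr<Heap> heap;  // null while the feature is disabled
};

// Stand-in for btr_search_enabled.
std::atomic<bool> search_enabled{false};

void enable(Partition &part)
{
  std::unique_lock<std::shared_mutex> guard(part.latch);
  if (!part.heap)
    part.heap= std::make_unique<Heap>();
  search_enabled.store(true);
}

void disable(Partition &part)
{
  // The exclusive latch waits for all shared holders to leave.
  std::unique_lock<std::shared_mutex> guard(part.latch);
  search_enabled.store(false);
  part.heap.reset();
}

// Report the number of heap buffers; returns false if disabled.
bool report_buffers(Partition &part, std::size_t &out)
{
  // Unlatched peek: may already be stale when we get the latch.
  if (!search_enabled.load(std::memory_order_relaxed))
    return false;
  std::shared_lock<std::shared_mutex> guard(part.latch);
  // Recheck under the latch: disable() cannot run concurrently now,
  // so a true flag implies that part.heap is not null.
  if (!search_enabled.load(std::memory_order_relaxed))
    return false;
  out= part.heap->buffers;
  return true;
}
```

Without the second check, a disable() that completes between the peek and the latch acquisition would leave report_buffers() dereferencing a null heap, which is the situation the comment above describes.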
/** Get an adaptive hash index partition */
- partition *get_part(index_id_t id, ulint space_id) const
+ partition *get_part(index_id_t id, ulint space_id) const noexcept
{
  return parts + ut_fold_ulint_pair(ulint(id), space_id) % btr_ahi_parts;
}
This had better be just
return parts + ulint(id) % btr_ahi_parts;
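To show the suggested mapping in isolation, here is a small self-contained sketch. SearchSys, Partition and the fixed n_parts value are hypothetical stand-ins (n_parts plays the role of btr_ahi_parts); only the modulo expression mirrors the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an adaptive hash index partition.
struct Partition { int unused= 0; };

// Hypothetical stand-in for btr_search_sys.
struct SearchSys
{
  static constexpr std::size_t n_parts= 8;  // stand-in for btr_ahi_parts
  Partition parts[n_parts];

  // Pick a partition from the index identifier alone.
  Partition *get_part(std::uint64_t index_id) noexcept
  {
    return parts + index_id % n_parts;
  }
};
```

With this mapping, two indexes share a partition exactly when their identifiers are congruent modulo the number of partitions.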
Description
MEM_HEAP_BTR_SEARCH: Remove. Let us handle this special type of mem_heap_t allocations in the only compilation unit, btr0sea.cc, specifically, btr_search_sys_t::partition::insert(), which replaces the function ha_insert_for_fold().

mem_block_info_t::ahi_block: Replaces free_block. This caches one buffer page for use in adaptive hash index allocations. This is protected by btr_search_sys_t::partition::latch. It only is Atomic_relaxed because btr_search_free_space() is following a pattern of test, lock, and test.

btr_search_check_free_space(): Protect the ahi_block with a shared AHI partition latch instead of an exclusive one. We must recheck btr_search_enabled after acquiring the latch in order to avoid a race condition with btr_search_disable(). Using a shared latch instead of an exclusive one should reduce contention with btr_search_guess_on_hash() and other operations when running with innodb_adaptive_hash_index=ON.

Release Notes
The performance of innodb_adaptive_hash_index at high concurrency (with larger numbers of threads) was improved.

How can this PR be tested?
I tested this with and without cmake -DWITH_INNODB_AHI=OFF.

Using the innodb-hashtest.sh I still see some waits for an AHI partition latch. Here are all functions that exceed 1% of perf record samples:

It should be noted that conflicts are inevitable under this workload, because ha_delete_hash_node() is covered by an exclusive AHI partition latch.

I repeated the experiment with an additional patch that disables the spin loops in this subsystem:

With this additional patch, the spin loop is gone:

The ssux_lock_impl<false>::rd_wait() above will involve context switches due to futex system calls.

This needs to be tested with a larger workload that originally reproduced the scalability issue.
Basing the PR against the correct MariaDB version

This should be directly applicable to 10.5, but the patch is currently based on the 10.6 branch. For 10.5, we might also limit the change to btr_search_check_free_space_in_heap().

PR quality check