Give `PendingRow` its `BTreeMap` back... or don't? 😶
#8788
Draft
+13
−8
This is a fun one: it's by all accounts a mistake, and therefore a bug... but actually it's pretty nice, so I'm not quite sure I want to fix it 😶.
I was looking into the micro-batcher's code for unrelated reasons, when I stumbled upon this:
Looks legit at first glance, except... `row.components` is a `HashMap`. Or an `IntMap`, rather. And `HashMap`s have a random iteration order per-instance per-execution (it is then fixed for the lifetime of that instance).
The reason it's an `IntMap` is because I went a bit too far with my search-n-replace during #8207:
So, that raises the question... why on earth is the micro-batcher still working? The reason it's still working is that, by virtue of not doing any hashing nor salting, an `IntMap` ends up with stronger guarantees than a vanilla `HashMap`. Specifically, all `IntMap`s that share the same keys have the same iteration order, iff these keys were inserted in the same order.
Put differently, this test always passes:
whereas this one will always fail, as you'd expect:
And that's why batching still works: the order in which you insert your components into your `PendingRow` is always the same during the execution of your program.
Not only does it work, but it's 50% faster than with the BTree-based approach, as we know. Eh.
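For readers without the codebase at hand, here is a minimal, std-only sketch of the two behaviors being compared. It is not the code from this PR: `IdentityHasher`, `IdentityMap`, `identity_map_keys` and `btree_map_keys` are hypothetical names, and `IdentityHasher` is only an assumed stand-in for the no-hashing, no-salting behavior of an `IntMap`. The first test checks that two such maps filled in the same order iterate in the same order; the second shows that a `BTreeMap` iterates in sorted key order no matter how it was filled.

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical stand-in for a no-op integer hasher: the key is its own hash,
/// and there is no per-instance random seed.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("this sketch only supports u64 keys");
    }

    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }
}

/// Hypothetical alias playing the role of an `IntMap` in this sketch.
type IdentityMap<V> = HashMap<u64, V, BuildHasherDefault<IdentityHasher>>;

/// Inserts `keys` in the given order and returns them in iteration order.
fn identity_map_keys(keys: &[u64]) -> Vec<u64> {
    let mut map: IdentityMap<()> = HashMap::default();
    for &key in keys {
        map.insert(key, ());
    }
    map.keys().copied().collect()
}

/// Same as above, but backed by a `BTreeMap`.
fn btree_map_keys(keys: &[u64]) -> Vec<u64> {
    let mut map = BTreeMap::new();
    for &key in keys {
        map.insert(key, ());
    }
    map.keys().copied().collect()
}

#[test]
fn same_insertion_order_gives_same_iteration_order() {
    let order = [42, 7, 1_000_003, 19, 256];
    // No seed and identical insertion history: both maps end up in the same state.
    assert_eq!(identity_map_keys(&order), identity_map_keys(&order));
}

#[test]
fn btree_map_ignores_insertion_order() {
    let order = [42, 7, 1_000_003, 19, 256];
    let mut reversed = order;
    reversed.reverse();
    // A BTreeMap always iterates in sorted key order.
    assert_eq!(btree_map_keys(&order), btree_map_keys(&reversed));
}
```

The sketch deliberately makes no assertion about what happens when an identity-hashed map is filled in a different order, since that depends on the map's internals. The `BTreeMap` version buys an ordering guarantee that does not rely on insertion order, at the cost discussed above.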