Replace the yield with an isb on Arm. #17

AGSaidi · 2023-01-26T04:23:07Z

The yield instruction is treated as a nop on Arm processors, which is very different from the x86 pause instruction, which stalls execution for ~40 cycles.

An ISB serializes the pipeline and has been shown to be roughly analogous to the pause delay. It is used in other databases for spin loops and adaptive spin loops where not hammering the cache line is important.
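
As a rough illustration of the idea (not the actual diff in this pull request), a spin loop can select its delay instruction per architecture. The names `cpu_relax`, `demo_spinlock`, `demo_lock`, and `demo_unlock` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler for the inline assembly:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: one pause-like delay for use inside a spin loop. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("pause" ::: "memory");
#elif defined(__aarch64__)
    /* isb instead of yield: yield is treated as a nop on Arm cores. */
    __asm__ __volatile__("isb" ::: "memory");
#else
    /* Assumed fallback: compiler barrier only, no hardware delay. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical minimal spinlock used only to show where the delay goes. */
typedef struct {
    atomic_bool locked;
} demo_spinlock;

static inline void demo_lock(demo_spinlock *l)
{
    for (;;) {
        /* Attempt to take the lock with a single atomic exchange. */
        if (!atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
        /* While it is held, wait on plain loads with a delay between them,
         * so the contended cache line is not hammered. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            cpu_relax();
    }
}

static inline void demo_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The only architecture-specific part is the body of `cpu_relax`; the loop itself is identical on x86 and Arm.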

BrianNichols · 2023-01-27T21:21:41Z

@AGSaidi We are evaluating this pull request. I ran a multi-threaded spinlock test on a macOS M1 machine, and both the "yield" and "isb" instructions resulted in nearly 100% CPU usage on each of the blocked threads. I realize CPU usage and power consumption are not exactly correlated, so this test might be misleading, or Apple silicon might have a "pause-like" implementation of "yield".

Can you point to a spinlock test and platform that demonstrates the advantage of "isb" on power consumption?

AGSaidi · 2023-01-27T21:27:52Z

@BrianNichols you'll still see 100% CPU utilization, because the application is still using the core completely. The key point is that it's going to iterate around the spin loop fewer times. Fewer iterations mean fewer loads of the memory location being spun on going into the memory system, and that generally saves power and improves performance. The performance improvement comes from two angles. 1. If there are any adaptive spin loops that have been tuned for 'pause' on Intel, this will make the same tuning apply on Arm, as opposed to the spin phase ending early. 2. Because the memory system isn't saturated with loads from the fast loop, the unlock is observed more quickly.
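
To make the first point concrete, here is a minimal sketch of an adaptive spin loop, not taken from this codebase. `SPIN_PAUSE`, `MAX_SPINS`, and `adaptive_lock` are hypothetical names, the budget of 1000 is an arbitrary assumed value, and the inline assembly assumes a GCC- or Clang-compatible compiler on a POSIX system:

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-architecture delay used inside the spin phase. */
#if defined(__x86_64__) || defined(__i386__)
#define SPIN_PAUSE() __asm__ __volatile__("pause" ::: "memory")
#elif defined(__aarch64__)
#define SPIN_PAUSE() __asm__ __volatile__("isb" ::: "memory")
#else
#define SPIN_PAUSE() __asm__ __volatile__("" ::: "memory")
#endif

/* Assumed spin budget. Its wall-clock length is MAX_SPINS times the delay of
 * one SPIN_PAUSE(), so a budget tuned against 'pause' elapses far sooner if
 * the delay instruction is effectively a nop. */
#define MAX_SPINS 1000u

/* Acquire the lock; returns how many delay iterations were spent spinning. */
static unsigned adaptive_lock(atomic_bool *locked)
{
    unsigned spins = 0;

    while (atomic_exchange_explicit(locked, true, memory_order_acquire)) {
        /* Lock is held: wait on plain loads until it looks free. */
        while (atomic_load_explicit(locked, memory_order_relaxed)) {
            if (spins < MAX_SPINS) {
                SPIN_PAUSE();   /* spin phase: short hardware delay */
                spins++;
            } else {
                sched_yield();  /* budget exhausted: give up the core */
            }
        }
    }
    return spins;
}
```

With a nop-like delay the loop reaches the `sched_yield()` branch much earlier in wall-clock terms than the same budget would on x86, which is the "ending early" effect described above.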

BrianNichols · 2023-01-27T21:43:47Z

@AGSaidi Sounds reasonable. We should have a decision by next week.

AGSaidi · 2023-01-27T21:54:33Z

Great. Please run your performance tests on Graviton and see if there are tests that improve. We've seen substantial improvements for lock contention workloads in other databases.

BrianNichols · 2023-02-01T21:06:56Z

The pull request has been accepted.
