-
Notifications
You must be signed in to change notification settings - Fork 339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate some C code to Ruby #617
Conversation
0e8e317
to
5876f4a
Compare
This helps very marginally with allocation speed.
Most of these classes and modules don't need to be global variables
If we assume that most of the time the `opts` hash is small it's faster to go over the provided keys with a `case` than to test all possible keys one by one. Before: ``` == Encoding small nested array (121 bytes) ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 156.832k i/100ms oj 209.769k i/100ms rapidjson 162.922k i/100ms Calculating ------------------------------------- json 1.599M (± 2.5%) i/s (625.34 ns/i) - 7.998M in 5.005110s oj 2.137M (± 1.5%) i/s (467.99 ns/i) - 10.698M in 5.007806s rapidjson 1.677M (± 3.5%) i/s (596.31 ns/i) - 8.472M in 5.059515s Comparison: json: 1599141.2 i/s oj: 2136785.3 i/s - 1.34x faster rapidjson: 1676977.2 i/s - same-ish: difference falls within error == Encoding small hash (65 bytes) ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 216.464k i/100ms oj 661.328k i/100ms rapidjson 324.434k i/100ms Calculating ------------------------------------- json 2.301M (± 1.7%) i/s (434.57 ns/i) - 11.689M in 5.081278s oj 7.244M (± 1.2%) i/s (138.05 ns/i) - 36.373M in 5.021985s rapidjson 3.323M (± 2.9%) i/s (300.96 ns/i) - 16.871M in 5.081696s Comparison: json: 2301142.2 i/s oj: 7243770.3 i/s - 3.15x faster rapidjson: 3322673.0 i/s - 1.44x faster ``` After: ``` == Encoding small nested array (121 bytes) ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 168.087k i/100ms oj 208.872k i/100ms rapidjson 149.909k i/100ms Calculating ------------------------------------- json 1.761M (± 1.1%) i/s (567.90 ns/i) - 8.909M in 5.059794s oj 2.144M (± 0.9%) i/s (466.37 ns/i) - 10.861M in 5.065903s rapidjson 1.692M (± 1.7%) i/s (591.04 ns/i) - 8.545M in 5.051808s Comparison: json: 1760868.2 i/s oj: 2144205.9 i/s - 1.22x faster rapidjson: 1691941.1 i/s - 1.04x slower == Encoding small hash (65 bytes) ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 242.957k i/100ms oj 675.217k i/100ms rapidjson 355.040k i/100ms Calculating ------------------------------------- json 2.569M (± 1.5%) i/s (389.22 ns/i) - 12.877M in 5.013095s oj 7.128M (± 2.3%) i/s (140.30 ns/i) - 35.787M in 5.023594s rapidjson 3.656M (± 3.1%) i/s (273.50 ns/i) - 18.462M in 5.054558s Comparison: json: 2569217.5 i/s oj: 7127705.6 i/s - 2.77x faster rapidjson: 3656285.0 i/s - 1.42x faster ```
5876f4a
to
a75c9ec
Compare
Nice cleanup! |
Similar code in the JRuby extension could probably also be removed with this change. |
I took a quick stab at it but there's some complexities mixing the pure-Ruby and extension versions of State. This PR only moves some of State into Ruby, so I attempted to only remove those parts of the JRuby State implementation. It failed because of a few things:
Mostly the JRuby ext would need to switch to instance vars for all accesses. That would be slower than using Java object fields, but it may not be important for The patch, such as it is: https://gist.github.com/headius/40b69293f3d34a9bfc1373675369dba8 |
All these methods aren't hotspot and much simpler to implement in Ruby than C. Some code could even be shared with the "pure" implementation but it's not a big deal.
Also this whole "
Generator::State
can be used as a Hash" scream YAGNI, not sure if we want to deprecate it, but definitely something we don't care about optimizing.Overall this optimize the happy path for instantiating
Generator::State
, which is about half the time spent in micro-benchmarks:Before:
After: