Handrolled Parser #729
Alright, the remaining failures are error messages being a bit different. I haven't tried to profile or optimize yet, but it's already a bit faster:
Actually, there's one correctness bug left: we don't validate that strings don't contain unescaped control characters. That may explain part of the difference.
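For readers following along, here is a minimal sketch of what that missing check amounts to. It is not the code from this PR: `string_body_is_valid` is a hypothetical helper, and it assumes it is handed the raw bytes between the quotes. JSON (RFC 8259) forbids unescaped bytes 0x00 through 0x1F inside strings, so the check is one extra comparison per byte, which is why it can show up in a benchmark.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the PR: returns true when the raw bytes
 * between the quotes of a JSON string contain no unescaped control character
 * (bytes 0x00 through 0x1F, which RFC 8259 forbids inside strings). */
bool string_body_is_valid(const char *body, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char so bytes >= 0x80 are not seen as negative. */
        if ((unsigned char)body[i] < 0x20) {
            return false;
        }
    }
    return true;
}
```

A raw newline inside the string is rejected, while the two-byte escape sequence (backslash followed by `n`) passes, since neither of those bytes is a control character.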
Force-pushed from aacfff7 to 7ae7644.
Yeah, it no longer looks as good after the fix:
== Parsing activitypub.json (58160 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
after 874.000 i/100ms
Calculating -------------------------------------
after 8.695k (± 0.9%) i/s (115.01 μs/i) - 43.700k in 5.026376s
Comparison:
before: 9413.8 i/s
after: 8694.8 i/s - 1.08x slower
== Parsing twitter.json (567916 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
after 91.000 i/100ms
Calculating -------------------------------------
after 907.893 (± 1.0%) i/s (1.10 ms/i) - 4.550k in 5.012050s
Comparison:
before: 900.9 i/s
after: 907.9 i/s - same-ish: difference falls within error
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
after 44.000 i/100ms
Calculating -------------------------------------
after 450.179 (± 0.9%) i/s (2.22 ms/i) - 2.288k in 5.082934s
Comparison:
before: 425.8 i/s
after: 450.2 i/s - 1.06x faster
But I haven't profiled anything yet, so I think we can still get some speed out of this.
I fixed the last few tests that were failing because of different error messages, and cleaned up the code a bit. Before merging I'd like to figure out how to get at least on par on It's really when I added back the
Yep, removing that check (hence no longer being a valid JSON parser) puts us back way above:
Maybe it's worth storing the result of
I just tried that; it made the perf slightly worse :/
Lookup tables to the rescue:
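The thread doesn't show which tables were added, so the following is only a sketch of the general technique, with made-up names (`byte_class`, `scan_plain_run`, `classify_byte`). The idea is to classify every possible byte once in a 256-entry table, so the string scanning loop does a single table load per byte instead of separate comparisons for the quote, the backslash and the control character range.

```c
#include <stddef.h>

/* Hypothetical byte classes for a JSON string scanner (illustrative names). */
enum { BYTE_PLAIN = 0, BYTE_QUOTE = 1, BYTE_BACKSLASH = 2, BYTE_CONTROL = 3 };

#define CTRL8 BYTE_CONTROL, BYTE_CONTROL, BYTE_CONTROL, BYTE_CONTROL, \
              BYTE_CONTROL, BYTE_CONTROL, BYTE_CONTROL, BYTE_CONTROL

/* One entry per byte value; entries not listed default to BYTE_PLAIN (0). */
static const unsigned char byte_class[256] = {
    CTRL8, CTRL8, CTRL8, CTRL8,      /* 0x00 to 0x1F: control characters */
    ['"']  = BYTE_QUOTE,
    ['\\'] = BYTE_BACKSLASH,
};

/* Class of a single byte, looked up in the table. */
int classify_byte(char c)
{
    return byte_class[(unsigned char)c];
}

/* Index of the first byte in buf[0..len) that needs special handling
 * (quote, backslash or control character), or len if there is none. */
size_t scan_plain_run(const char *buf, size_t len)
{
    size_t i = 0;
    while (i < len && byte_class[(unsigned char)buf[i]] == BYTE_PLAIN) {
        i++;
    }
    return i;
}
```

The caller would then switch on the class of the byte where the scan stopped: end the string on a quote, decode an escape on a backslash, raise a parse error on a control character. Whether this beats plain comparisons depends on the compiler and CPU, so it needs to be measured, as was done in this thread.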
On the 3 main benchmarks:
It also progressed on all micro-benchmarks but one:
So I think this is good to go. I'll clean up the git history though.
And get rid of the Ragel parser. This is 7% faster on activitypub, 15% faster on twitter and 11% faster on citm_catalog. There might be some more optimization opportunities: I did a quick optimization pass to fix a regression in string parsing, but other than that I haven't dug much into performance.
cc @kddnewton