-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce generated C++ code for bytes parsing #1837
Conversation
376faab
to
0c04741
Compare
@sethhall Take a look at this, it should help with the amount of generated code with your bytes fields. |
ea1d447
to
6f8b4f0
Compare
Oh nice!!!! I'll definitely be trying this soon. |
Ok, I tried this briefly. I generated a parser with one unit and 300 bytes fields in it. It took ~35s to compile (just compile, no linking) with the spicy that zeek is currently using and it took ~12.5s with this branch. The code is clearly much shorter too. It looks nice! |
Hm, now what I can't explain is that the Krb5 analyzer is taking longer to compile with spicyc (when I compile all the way from spicy code to an hlto file). This makes no sense at all. I'm going to have to dig a bit more on that to figure out what might be contributing to it taking longer. It's not massively longer, but it's definitely consistently longer. |
Does |
Nope. Nothing that stood out to me. I dug a little more last night but still nothing stood out. I'll keep digging. |
Oh, one other thought. It could be interesting to get that selective on-heap branch and this one together. I tried merging them yesterday but they have merge conflicts. |
Is this branch just waiting on code review or is there more work to do on it? Looks like it's lingered enough already that it has a conflict with main. |
Mostly I just didn't get to wrapping it up yet, but we do also still need to understand any compile-time differences, I've been meaning to look into that. |
Cool, no rush. I was just curious. |
I've tried it now, and I don't see a slow-down; in fact it's slightly faster for me with the branch. (I used |
74134b4
to
d697c2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I just realized I noted down a lot, but most of these should be cosmetic.
d697c2e
to
e64facf
Compare
For AST nodes, we can/should use nullptrs instead of unset optionals. The versions using `optional` were still a left-over from the old AST code.
This splits out the `&size/&max-size` handling so that we can special-case that later.
So far we had a boolean flag to differentiate between "normal" and "try" parsing. We turn this flag into an enum now so that we can more easily extend the set of modes later. No functional change otherwise.
This provides the surrounding infrastructure, but does not yet implement it for any type.
There are two changes in here: 1. For grammars that don't use look-ahead, we skip the runtime check for a pending look-ahead symbol, because we know we will never have one. This removes generated code of the form `if ( _lah ) { ... }`. This change needs a bit of machinery unfortunately because we need to get the information about look-ahead usage over into the codegen for literal parsing. 2. For bytes literals, we now outsource their parsing to a runtime function to make the generated code simpler. As a side effect this also provides more informative error messages when the literal isn't found. Taking the two together means that the code for parsing a plain `b"Foo"` literal may look like this now: spicy_rt::expectBytesLiteral(__data, __cur, b"Foo", "b.spicy:4:10-4:15", Null); __cur = __cur.advance(3); if ( __trim ) (*__data).trim(begin(__cur)); (*self).foo = b"Foo";
This allows to write tests that use internal IDs.
We use this to remove two statement constructs that the main optimizer may leave behind: 1. `default<void>()` 2. `(*self).__error = __error; __error = (*self).__error;` The second case is a quite specific situation that eventually, once we have CFG/DFG tracking, the main optimizer should be able to cover more generically. However, for now, it's just not nice to always have these blocks in the generated C++ code, so adding this special case seems useful. Couples notes on (2): - Per #1592, case 2 may also have overhead. Closes #1592. - Technically, this optimization isn't always correct: subsequent code could assume that `(*self).__error` is set, whereas after removal it's not (or not to the expected value). However, `__error` is strictly-internal state, and we know that we don't use it any different, so this seems ok until we have more general optimizer logic.
The optimization turns this: ``` function void foo() { try { ... } catch { throw; } } ``` into ``` function void foo() { { ... } } ``` It would be even nicer if we didn't need the braces around the remaining block, but it's generally not safe to remove them because if the block declares any locals their life-times and visibility would change.
This now optimizes the generated code for parsing a `bytes` literal based on what we're going to use it for, or not (require literal vs. look-ahead vs skip). The standard case of a parsing an expected literal now looks like this: ``` # Begin parsing production: Ctor: b1 -> b"abc" (const bytes) (*self).b1 = spicy_rt::expectBytesLiteral(__data, __cur, b"abc", "../tests/spicy/types/bytes/parse-length.spicy:20:10-20:15", Null); __cur = __cur.advance(3); if ( __trim ) (*__data).trim(begin(__cur)); # End parsing production: Ctor: b1 -> b"abc" (const bytes) ```
94494a6
to
1278d63
Compare
This reduces the amount of C++ code we generate for some cases of
common fields, in particular simple
bytes
fields and literals wherewe don't need to account for look-ahead parsing. This introduces new
machinery for implementing such narrow optimizations, which isn't quite
as easy as one would hope because we need to reliably identify the cases
when we can apply them.