Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasmparser: optimize type section valiadation by specializing over Wasm feature subset #1906

Merged

Conversation

Robbepop
Copy link
Contributor

@Robbepop Robbepop commented Nov 13, 2024

Closes #1701.

cc @alexcrichton : Are there any other Wasm proposals that might conflict?

This implements a lightweight type section validation for a subset of Wasm proposals.

Benchmarks

I compared main against PR and changed the used WasmFeatures set for both:

Validator::new_with_features(
    WasmFeatures::WASM2
        .union(WasmFeatures::CUSTOM_PAGE_SIZES)
        .union(WasmFeatures::EXTENDED_CONST)
        .union(WasmFeatures::MEMORY64)
        .union(WasmFeatures::MULTI_MEMORY)
        .union(WasmFeatures::RELAXED_SIMD)
        .union(WasmFeatures::TAIL_CALL)
        .union(WasmFeatures::THREADS)
        .union(WasmFeatures::WIDE_ARITHMETIC);
)

image

@Robbepop Robbepop changed the title Add specialized type section validation for Wasm feature subset wasmparser: Add specialized type section validation for Wasm feature subset Nov 13, 2024
@Robbepop Robbepop changed the title wasmparser: Add specialized type section validation for Wasm feature subset wasmparser: optimize type section valiadation by specializing over Wasm feature subset Nov 13, 2024
@Robbepop
Copy link
Contributor Author

I also ran the wam-tools Wasm testsuite for validation and disabled the function-references and gc proposals to trigger the fast-validation path. As expected, the testsuite did not pass with the following failures:

failures:
    "tests/local/component-model/gc.wast"
    "tests/local/exnref/try-table.wast"
    "tests/local/folding/array_fill.wast"
    "tests/local/folding/array_init_data.wast"
    "tests/local/folding/array_copy.wast"
    "tests/local/folding/array_init_elem.wast"
    "tests/local/folding/array.wast"
    "tests/local/function-reference-structural-matching.wat"
    "tests/local/folding/struct.wast"
    "tests/local/folding/fold-stack-switching-gc.wast"
    "tests/local/folding/fold-stack-switching.wast"
    "tests/local/instance.wast"
    "tests/local/invalid-init-expr.wast"
    "tests/local/shared-everything-threads/polymorphism.wast"
    "tests/local/shared-everything-threads/gc-unreachable.wast"
    "tests/local/shared-everything-threads/builtins.wast"
    "tests/local/shared-everything-threads/funcs.wast"
    "tests/local/shared-everything-threads/rec-groups.wast"
    "tests/local/shared-everything-threads/i31.wast"
    "tests/local/shared-everything-threads/tables.wast"
    "tests/local/shared-everything-threads/ref-equivalence.wast"
    "tests/local/stack-switching/cont_bind.wast"
    "tests/local/stack-switching/actor-lwt.wast"
    "tests/local/shared-everything-threads/globals.wast"
    "tests/local/stack-switching/actor.wast"
    "tests/local/stack-switching/cont_canonicalisation.wast"
    "tests/local/stack-switching/async-await.wast"
    "tests/local/stack-switching/cont_new.wast"
    "tests/local/shared-everything-threads/structs.wast"
    "tests/local/stack-switching/cont_suspend.wast"
    "tests/local/stack-switching/cont_resume.wast"
    "tests/local/shared-everything-threads/types.wast"
    "tests/local/stack-switching/fun-state.wast"
    "tests/local/stack-switching/generators.wast"
    "tests/local/stack-switching/pipes.wast"
    "tests/local/stack-switching/fun-actor-lwt.wast"
    "tests/local/stack-switching/fun-lwt.wast"
    "tests/local/stack-switching/control-lwt.wast"
    "tests/local/stack-switching/lwt.wast"
    "tests/local/shared-everything-threads/arrays.wast"
    "tests/local/stack-switching/static-lwt.wast"
    "tests/local/stack-switching/cont.wast"
    "tests/local/stack-switching/validation_gc.wast"
    "tests/local/stack-switching/validation.wast"
    "tests/local/component-model/types.wast"

I have looked through a bunch of the reported errors and they seem to be expected from the turning-off of features. However, it would certainly be nicer if we had an automated way of testing this fast-validation path.

@Robbepop
Copy link
Contributor Author

Robbepop commented Nov 14, 2024

I think is possible to simplify Wasm feature checking by making wasmparser aware of Wasm feature dependencies.
For example:

  • enable function-references if the gc proposal is enabled because gc depends on function-references
  • disable gc if function-references is disabled (same reason as above)
  • etc..

This way it was not possible to have Wasm features sets that do not make much sense.

Or are there problems that I have overlooked? 🤔

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Do you think it would be possible to optimize the processing of types in-place though rather than conditionally gating on features? From a high-level I don't think we have any other logic where we significantly differ on features for what path to take at runtime (except to return an error). I'm worried that this will become a relatively poorly tested path due to not being the default and being relatively nontrivial. I haven't dug in much though to try to understand what the slow parts are here though so it may not be possible.

Otherwise for feature dependencies I'm not sure how to best manage that. We already primarily use method accessors for features rather than .contains(WasmFeatures::FEATURE) so it might not be too hard to update those accessors to automatically test for a set of bits rather than just a single bit.

@Robbepop
Copy link
Contributor Author

Robbepop commented Nov 14, 2024

Thanks! Do you think it would be possible to optimize the processing of types in-place though rather than conditionally gating on features? From a high-level I don't think we have any other logic where we significantly differ on features for what path to take at runtime (except to return an error). I'm worried that this will become a relatively poorly tested path due to not being the default and being relatively nontrivial. I haven't dug in much though to try to understand what the slow parts are here though so it may not be possible.

I agree that this solution is not ideal. To me this is just a temporary solution to make wasmparser usable by Wasmi. Unfortunately, I was not able to come up with a more elegant, inline solution.

An idea for an inline solution:

Perform the light-weight validation introduced in this PR until a type is encountered that strictly requires all the type canonicalization, sub-type checking etc. Then, perform all those steps for all the types seen so far and continue with this more elaborate approach.

@alexcrichton Do you think this is feasible?


Slow parts of type section validation are:

  • type canonicalization
  • sub-type checks (depends on canonicalization)
  • internment (not sure)

All of those are prevented in this PR for the selected set of features.


I would really love to use upstream wasmparser in Wasmi and keep it up to date and be able to contribute to its development. As long as Wasmi sits on its own branch this development is kinda hindered. Fast type section validation is the last strict requirement for this to happen.

@Robbepop
Copy link
Contributor Author

Robbepop commented Nov 16, 2024

I have experimented with an implemented for the idea presented in this comment. However, problems arised quickly due to certain invariants of other structures in the validator's greater scheme. For example resetting the TypeAlloc. Furthermore, compared to the solution presented in this PR, there was some overhead, even for the fast case.

All in all, I think a proper, more elegant approach would take a lot of more work and consideration. This PR implements a very fast type section validation at nearly no runtime cost (one bit-flags check). The feature gated allow list should be fairly maintainble since development of new features won't casually break the type validator and old features won't change their semantics. Once we figure out a more elegant and inline approach for fast type section validation, this solution can easily be replaced since it is neatly isolated and contained in a single spot.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating! I think that's a reasonable approach myself and the benefits seem worth the theoretical risk here, so I'm happy to land this. It's also not like this has zero testing, the "feature frobbing" done in Wasmtime and wasm-tools fuzzing should still cover this case.

@alexcrichton alexcrichton added this pull request to the merge queue Nov 16, 2024
Merged via the queue into bytecodealliance:main with commit 9cebc6a Nov 16, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wasmparser: Performance regressions since v0.100.2
2 participants