Conversation

@yurydelendik (Contributor) commented Sep 2, 2025

Draft implementation of the proposal that can be found at https://github.com/WebAssembly/custom-descriptors/blob/main/proposals/custom-descriptors/Overview.md.

Keeping it as a draft:

  • Decide whether a PackedIndex or a RefType mask extension for exact types is okay
  • Ensure validation rules are properly implemented
  • Has some support for wasm-smith
  • Exact / heap types for imports/ref.func

There are no tests or valid examples in the proposal yet, so I'm providing my own.

@alexcrichton (Member) left a comment

I only skimmed the validation/bit-packing-RefType bits and I'll cc @fitzgen for those bits too as I'm sure he'll be interested in them when they shape up. Otherwise I left a few comments here and there. Overall thanks for this though!

@fitzgen (Member) left a comment

Thanks!

@yurydelendik force-pushed the custom-descr branch 3 times, most recently from 9763983 to 9d29633 on October 23, 2025 at 17:56
@fitzgen (Member) left a comment

Changing RefType to 4 bytes is going to make ValType larger than 4 bytes, which has been observed to be a pretty big slowdown in past work. I'm going to try to avoid that by:

  • Allowing ValType to remain a nice Rust enum and grow larger than 4 bytes, but then
  • Creating an internal ValTypePacked type that is still 4 bytes but is not a nice enum and has worse ergonomics
  • Making the core validation loop use ValTypePacked instead of ValType

This should relieve the pressure you're bumping up against. I'll whip this up and then once it has landed on main you can rebase and then this should be able to merge (modulo any other review comments Alex might have, I haven't looked at anything other than the RefType bitpacking stuff).
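A minimal sketch of that split, with all names and bit encodings invented here for illustration (wasmparser's real definitions differ): the ergonomic enum grows past 4 bytes, while a packed companion type keeps the hot-path representation at 4 bytes.

```rust
// Ergonomic public representation: nice to match on, but the `Ref` payload
// (nullability plus a type index) pushes the enum to 8 bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValType {
    I32,
    I64,
    F32,
    F64,
    V128,
    Ref { nullable: bool, index: u32 },
}

/// Internal 4-byte form: variant tag in the low 3 bits, nullable flag in
/// bit 3, type index in the remaining 28 bits. Worse ergonomics, smaller.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValTypePacked(u32);

impl ValTypePacked {
    fn pack(ty: ValType) -> ValTypePacked {
        ValTypePacked(match ty {
            ValType::I32 => 0,
            ValType::I64 => 1,
            ValType::F32 => 2,
            ValType::F64 => 3,
            ValType::V128 => 4,
            ValType::Ref { nullable, index } => {
                assert!(index < 1 << 28, "type index must fit in 28 bits");
                5 | (u32::from(nullable) << 3) | (index << 4)
            }
        })
    }

    fn unpack(self) -> ValType {
        match self.0 & 0b111 {
            0 => ValType::I32,
            1 => ValType::I64,
            2 => ValType::F32,
            3 => ValType::F64,
            4 => ValType::V128,
            _ => ValType::Ref {
                nullable: self.0 & 0b1000 != 0,
                index: self.0 >> 4,
            },
        }
    }
}

fn main() {
    // The packed form stays at 4 bytes while the ergonomic enum is 8.
    assert_eq!(std::mem::size_of::<ValTypePacked>(), 4);
    assert_eq!(std::mem::size_of::<ValType>(), 8);
    let ty = ValType::Ref { nullable: true, index: 1234 };
    assert_eq!(ValTypePacked::pack(ty).unpack(), ty);
    println!("ok");
}
```

The intent is that the core validation loop stores `ValTypePacked` and the pack/unpack conversions inline away at the boundaries; whether that actually happens is the question the benchmarks below answer.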

Thanks for your patience @yurydelendik!

@fitzgen (Member) commented Oct 24, 2025

I'll whip this up

I was hoping I would finish this up before the weekend but it looks like I won't. Since the CG meeting in Munich is next week, I probably won't finish this until the week afterwards.

@fitzgen (Member) commented Nov 7, 2025

Draft PR here: #2369

Alex has ideas about landing this PR before that, though.

@alexcrichton (Member)

Ok I've done a bit of benchmarking now. With this PR as-is:

bench results
validate/tests          time:   [8.5038 ms 8.5072 ms 8.5128 ms]
                        change: [+3.1259% +3.2126% +3.3074%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/lots-of-unique-fn-types
                        time:   [1.6182 ms 1.6190 ms 1.6200 ms]
                        change: [+4.2532% +4.3423% +4.4290%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/lots-of-unique-fn-types
                        time:   [764.95 µs 765.34 µs 765.94 µs]
                        change: [+5.6912% +5.7338% +5.7780%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/intgemm-simd   time:   [725.08 µs 725.26 µs 725.46 µs]
                        change: [-2.5999% -2.5459% -2.5027%] (p = 0.00 < 0.05)
                        Performance has improved.

validate-old/intgemm-simd
                        time:   [713.58 µs 713.74 µs 713.90 µs]
                        change: [-2.7302% -2.6875% -2.6424%] (p = 0.00 < 0.05)
                        Performance has improved.

validate/bz2            time:   [313.12 µs 313.41 µs 313.75 µs]
                        change: [-7.5212% -7.4616% -7.3949%] (p = 0.00 < 0.05)
                        Performance has improved.

validate-old/bz2        time:   [310.04 µs 310.17 µs 310.32 µs]
                        change: [-8.2737% -8.1943% -8.0962%] (p = 0.00 < 0.05)
                        Performance has improved.

validate/lots-of-empty-fn-types
                        time:   [611.75 µs 612.03 µs 612.33 µs]
                        change: [+6.5110% +6.6114% +6.7036%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/lots-of-empty-fn-types
                        time:   [687.19 µs 687.32 µs 687.48 µs]
                        change: [+6.3189% +6.3881% +6.4912%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/spidermonkey   time:   [18.976 ms 18.981 ms 18.987 ms]
                        change: [-3.8797% -3.8300% -3.7809%] (p = 0.00 < 0.05)
                        Performance has improved.

validate-old/spidermonkey
                        time:   [19.005 ms 19.008 ms 19.012 ms]
                        change: [-3.8858% -3.8447% -3.8097%] (p = 0.00 < 0.05)
                        Performance has improved.

validate/pulldown-cmark time:   [793.03 µs 793.35 µs 793.79 µs]
                        change: [-2.5407% -2.4893% -2.4298%] (p = 0.00 < 0.05)
                        Performance has improved.

validate-old/pulldown-cmark
                        time:   [787.91 µs 788.14 µs 788.41 µs]
                        change: [-3.1519% -3.1091% -3.0608%] (p = 0.00 < 0.05)
                        Performance has improved.

So not obviously bad. It's actually a speedup on all 'real world' modules but is an equivalent regression on synthetic stressor modules.

Next I took #2369 and massaged it into the validator as well. The commit is alexcrichton@c2d95e4 and is pretty messy, as my goal was just to get to the point of benchmarking. Try as I might, it's a net regression across the board:

bench numbers
validate/tests          time:   [8.6207 ms 8.6267 ms 8.6365 ms]
                        change: [+4.5500% +4.6617% +4.8015%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/lots-of-unique-fn-types
                        time:   [1.6303 ms 1.6312 ms 1.6324 ms]
                        change: [+5.0902% +5.2510% +5.4381%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/lots-of-unique-fn-types
                        time:   [756.01 µs 756.24 µs 756.51 µs]
                        change: [+4.5071% +4.5594% +4.6147%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/intgemm-simd   time:   [793.10 µs 793.51 µs 794.08 µs]
                        change: [+6.5632% +6.6305% +6.6962%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/intgemm-simd
                        time:   [781.38 µs 781.66 µs 781.94 µs]
                        change: [+6.5166% +6.5586% +6.6019%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/bz2            time:   [347.23 µs 347.40 µs 347.69 µs]
                        change: [+2.5307% +2.5825% +2.6475%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/bz2        time:   [344.52 µs 344.78 µs 345.28 µs]
                        change: [+1.9069% +2.0015% +2.1150%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/lots-of-empty-fn-types
                        time:   [610.01 µs 610.26 µs 610.52 µs]
                        change: [+6.2306% +6.3265% +6.4184%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/lots-of-empty-fn-types
                        time:   [678.92 µs 679.28 µs 679.86 µs]
                        change: [+5.0427% +5.1188% +5.2261%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/spidermonkey   time:   [20.722 ms 20.729 ms 20.737 ms]
                        change: [+4.9696% +5.0265% +5.0852%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate-old/spidermonkey
                        time:   [20.741 ms 20.745 ms 20.749 ms]
                        change: [+4.8940% +4.9394% +4.9792%] (p = 0.00 < 0.05)
                        Performance has regressed.

validate/pulldown-cmark time:   [857.67 µs 857.86 µs 858.06 µs]
                        change: [+5.3636% +5.4135% +5.4581%] (p = 0.00 < 0.05)
                        Performance has regressed.


validate-old/pulldown-cmark
                        time:   [855.23 µs 855.41 µs 855.60 µs]
                        change: [+5.0936% +5.1401% +5.1899%] (p = 0.00 < 0.05)
                        Performance has regressed.

I've tried a number of ways to further optimize this in my commit, but to no avail. In retrospect this makes sense though:

  • ValType is extremely pervasive, and this PR inflates it from 4 to 8 bytes. While we're changing the validator to store ValTypePacked, nothing else is changing, so everywhere else gets a larger memory footprint. I think this is why the synthetic modules are regressing.
  • Validator performance with ValTypePacked is pretty bad because, looking at the generated code, there's a lot of conversion between ValType and ValTypePacked. The theory of ValTypePacked is that its conversion gets inlined away and is zero-cost, but in practice that's just not happening. It's certainly happening in some places, and I can see it, but it's not happening in all cases, which is basically what's required here.
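The size jump in the first bullet follows from how Rust lays out enums: tag plus the largest payload, rounded up to the payload's alignment. A toy illustration (these are not wasmparser's real definitions):

```rust
use std::mem::size_of;

// Payload fits in 3 loosely-aligned bytes: 1-byte tag + 3 bytes of payload
// gives a 4-byte enum with alignment 1.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum SmallVal {
    I32,
    I64,
    Ref([u8; 3]),
}

// A full u32 payload forces 4-byte alignment, so tag + padding + payload
// rounds the whole enum up to 8 bytes.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum WideVal {
    I32,
    I64,
    Ref(u32),
}

fn main() {
    assert_eq!(size_of::<SmallVal>(), 4);
    assert_eq!(size_of::<WideVal>(), 8);
    println!("SmallVal: {} bytes, WideVal: {} bytes",
             size_of::<SmallVal>(), size_of::<WideVal>());
}
```

Any struct, Vec, or stack slot holding the wide enum doubles its per-element footprint, which is how a one-field change ripples into the memory behavior of everything outside the validator.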

So my conclusion is that we should land this PR as-is. Update the test to accept an 8-byte MaybeType instead of a 4-byte MaybeType and just accept that ValType is now 8 bytes instead of 4 bytes. I don't think it's worth replacing ValType with ValTypePacked (e.g. have it be an opaque 32-bit integer) because that loses match exhaustiveness. I also don't think it's worth having two representations because it's objectively slower and also a larger maintenance burden. In the end if we want this to be faster I think we just need to wait for future Rust language features such as pattern types and/or the ability to define a type which has a broader niche than just NonZero for example.
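The exhaustiveness point can be sketched with toy types (invented for illustration): matching on an enum lets the compiler flag a missing case, while an opaque 32-bit integer always needs a catch-all arm that silently swallows new encodings.

```rust
#[derive(Clone, Copy)]
enum ValType {
    I32,
    I64,
    F32,
}

fn describe(ty: ValType) -> &'static str {
    // No `_` arm: adding a variant to ValType turns this into a compile
    // error, pointing at every match that must be updated.
    match ty {
        ValType::I32 => "i32",
        ValType::I64 => "i64",
        ValType::F32 => "f32",
    }
}

fn describe_packed(bits: u32) -> &'static str {
    // With an opaque u32 the compiler cannot know the valid encodings, so
    // a wildcard arm is mandatory and new encodings fall into it unnoticed.
    match bits {
        0 => "i32",
        1 => "i64",
        2 => "f32",
        _ => "unknown",
    }
}

fn main() {
    assert_eq!(describe(ValType::I64), "i64");
    assert_eq!(describe_packed(3), "unknown");
    println!("ok");
}
```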

@fitzgen (Member) commented Nov 7, 2025

  • ValType is extremely pervasive, and this PR inflates it from 4 to 8 bytes. While we're changing the validator to store ValTypePacked, nothing else is changing, so everywhere else gets a larger memory footprint. I think this is why the synthetic modules are regressing.

  • Validator performance with ValTypePacked is pretty bad because, looking at the generated code, there's a lot of conversion between ValType and ValTypePacked. The theory of ValTypePacked is that its conversion gets inlined away and is zero-cost, but in practice that's just not happening. It's certainly happening in some places, and I can see it, but it's not happening in all cases, which is basically what's required here.

Yeah I was hoping that the conversions would pretty much get inlined away and it is unfortunate that you aren't seeing that.

I think I agree with your conclusion that we can land this PR as-is (modulo CI going green) but I think it is worth thinking a tiny bit more about ValTypePacked before giving up there. Specifically, where are the conversions not getting inlined away? I would imagine it is at the places where valtypes enter/exit the validator code itself, not within the validator. Assuming that is correct, are there perhaps a couple choice places outside the validator that we could change to use the packed variant and avoid most of the conversion costs? On the other hand, if the conversions aren't boiling away in the middle of validator's hot loop, then this whole approach is pretty dead in the water.

@fitzgen (Member) commented Nov 7, 2025

So not obviously bad. It's actually a speedup on all 'real world' modules but is an equivalent regression on synthetic stressor modules.

FWIW, I think we should care about the real world modules way more than the synthetic ones.

@alexcrichton (Member)

On the other hand, if the conversions aren't boiling away in the middle of validator's hot loop, then this whole approach is pretty dead in the water.

More-or-less this is what I saw. For example, this is one of the hottest methods in the validator:

    fn visit_local_set(&mut self, local_index: u32) -> Self::Output {
        let ty = self.local(local_index)?;
        self.pop_operand(Some(ty))?;
        self.local_inits.set_init(local_index);
        Ok(())
    }

Here self.local is reading from some storage of ValTypePacked. Later, pop_operand reads a MaybeTypePacked representation. Here the ty isn't a "constant" like it is in other places (e.g. pop_operand(Some(ValType::I32))); it's genuinely a runtime variable. LLVM was unable to boil away all the conversions, so self.local, which returns ValType, would decode, and then it gets re-encoded in pop_operand to MaybeTypePacked.

We could fix this by changing self.local to return ValTypePacked, I think, but that starts bringing up the question of just wholesale replacing ValType with ValTypePacked everywhere in the validator. Once we do that, there's the question of everything else, and... then I reached the conclusion we shouldn't do this.
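The decode/re-encode churn described above can be sketched with toy types (names and encodings invented here, not wasmparser's actual internals): storage holds packed values, the accessor unpacks them into the ergonomic enum, and the very next call re-packs them to compare against packed storage.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValType {
    I32,
    I64,
    F32,
    F64,
}

fn unpack(bits: u32) -> ValType {
    match bits {
        0 => ValType::I32,
        1 => ValType::I64,
        2 => ValType::F32,
        _ => ValType::F64,
    }
}

fn pack(ty: ValType) -> u32 {
    ty as u32
}

struct Validator {
    locals: Vec<u32>,   // locals stored packed
    operands: Vec<u32>, // operand stack stored packed
}

impl Validator {
    // Decodes on the way out: callers get the ergonomic enum.
    fn local(&self, idx: u32) -> ValType {
        unpack(self.locals[idx as usize])
    }

    fn pop_operand(&mut self, expected: Option<ValType>) -> bool {
        let actual = self.operands.pop().unwrap();
        match expected {
            // `ty` is a runtime variable here, so this pack() is a real
            // conversion that the optimizer may fail to fuse with the
            // unpack() performed in `local()`.
            Some(ty) => actual == pack(ty),
            None => true,
        }
    }

    fn visit_local_set(&mut self, local_index: u32) -> bool {
        let ty = self.local(local_index); // unpack...
        self.pop_operand(Some(ty))        // ...then immediately re-pack
    }
}

fn main() {
    let mut v = Validator {
        locals: vec![pack(ValType::I64)],
        operands: vec![pack(ValType::I64)],
    };
    assert!(v.visit_local_set(0));
    println!("ok");
}
```

When the two conversions sit on opposite sides of a function boundary with a runtime value in between, fusing them into a direct packed comparison depends entirely on inlining, which is what failed to happen consistently here.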

@fitzgen (Member) commented Nov 7, 2025

We could fix this by changing self.local to return ValTypePacked, I think, but that starts bringing up the question of just wholesale replacing ValType with ValTypePacked everywhere in the validator. Once we do that, there's the question of everything else, and... then I reached the conclusion we shouldn't do this.

Yeah, makes sense to me.

Really wish Rust didn't make us choose between packing and ergonomics!

@alexcrichton (Member)

@yurydelendik alright, if you're ok with it, the ball's now back in your court. I pushed up two commits here (feel free to rebase/squash them away as you see fit) as well as a merge with main. Notably, in 162dff0 I raised the assertion limit for MaybeType to unblock this PR. In the next commit I updated the spec test suite submodule, which has a lot more custom-descriptors tests. Would you be ok adjusting text/validation/etc as necessary to get the spec test suite passing?

@yurydelendik (Contributor, Author)

@yurydelendik alright, if you're ok with it, the ball's now back in your court. I pushed up two commits here (feel free to rebase/squash them away as you see fit) as well as a merge with main. Notably, in 162dff0 I raised the assertion limit for MaybeType to unblock this PR. In the next commit I updated the spec test suite submodule, which has a lot more custom-descriptors tests. Would you be ok adjusting text/validation/etc as necessary to get the spec test suite passing?

Thanks. I rebased and adjusted most of the syntax. Changed logic around ref.get_desc and ref.cast_desc too.

Not implemented yet is parsing of some 'exact' cases, e.g. in imports -- that requires some deep diving into data structures. Will be looking at that next.

@alexcrichton (Member)

Happy to help out if you get stuck! I just pushed up a commit that re-blesses (BLESS=1) the CLI testsuite, which cuts things down to just a few failures.

@alexcrichton (Member)

Also, if you'd like, these tests are pretty minor so I think it would be reasonable to land what you have here and address the missing functionality in a follow-up too. Whichever you'd prefer.

@yurydelendik (Contributor, Author)

Also, if you'd like, these tests are pretty minor so I think it would be reasonable to land what you have here and address the missing functionality in a follow-up too. Whichever you'd prefer.

That sounds good. We can land what is here so far, at any moment. I'll continue working on exact types in imports.

@yurydelendik yurydelendik marked this pull request as ready for review November 14, 2025 17:38
@yurydelendik yurydelendik requested a review from a team as a code owner November 14, 2025 17:38
@yurydelendik yurydelendik requested review from pchickey and removed request for a team November 14, 2025 17:38
@alexcrichton (Member)

Ok, sounds good. I'll do one final pass of review, frob CI so the test is expected to fail, and then land this.

@alexcrichton alexcrichton requested review from alexcrichton and removed request for pchickey November 14, 2025 18:30
@alexcrichton alexcrichton added this pull request to the merge queue Nov 14, 2025
Merged via the queue into bytecodealliance:main with commit ec6b54a Nov 14, 2025
34 checks passed