-
Notifications
You must be signed in to change notification settings - Fork 694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify display conventions for wasm locations #1053
Conversation
Web.md
Outdated
To achive the same goal of a common representations for WebAssembly constructs, the | ||
following conventions are adopted. | ||
|
||
A wasm location is a reference to a particular instruction in the binary, and may be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/wasm/WebAssembly/g
everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Web.md
Outdated
Where | ||
* `${url}` is the URL associated with the module (e.g. via a response object), if any. | ||
* `${funcIndex}` is an index the [function index space](https://github.com/WebAssembly/design/blob/master/Modules.md#function-index-space). | ||
* `${pcOffset}` is the offset in the module binary of the first byte of the instruction, printed in hexadecimal with lower-case digits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0x
prefix or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but in this formulation the 0x
is part of the template and not part of the substituted value. Do you think we should switch that? It would mean that this line would say something like "${pcOffset}
is the offset ... printed in hexadecimal with lower-case digits and a leading 0x
prefix" which seems a little more awkward, but I don't have a strong opinion on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno, all I'm saying is this isn't clear. Whichever way works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to more-or-less that wording. I agree it's clearer.
Web.md
Outdated
Names of functions may also be displayed if the module contains a `"name"` | ||
section; these can be used in the same contexts as JavaScript functions. | ||
If there are no names provided, then engines should somehow indicate this; | ||
(it may be sufficient to simply use e.g. an empty string if the name is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function number instead of empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a stack trace this would be kind of redundant. e.g. in SpiderMonkey, the full stacktrace entry would be
${name}@${location}
where ${location} would include the wasm function number for wasm, and ${name} is already empty for top-level JS code not in a function.
For V8 the format is currently
at ${name} (${location})
for wasm and when there is a JS function, and just at ${location}
for for top-level JS.
So the point is that there is already precedent for empty JS function names that browsers might want to reuse. OTOH I'm not opposed to a little redundancy in the name of making things clearer either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I guess this wording is unclear to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded to clarify.
Web.md
Outdated
It has the following format: | ||
`${url}:wasm-function[${funcIndex}]:0x${pcOffset}` | ||
Where | ||
* `${url}` is the URL associated with the module (e.g. via a response object), if any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the format if there is no URL? Is the field just empty? Is the colon still included?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should be something, rather than empty. The note below addresses that, but I agree it's not clear at this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise if it's empty there would be no way to tell different modules apart if they didn't have URLs. Obviously it's still possible to have collisions if the instantiation location is used, but at least it would allow a developer to avoid them if they cared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded to clarify.
Web.md
Outdated
`${url}:wasm-function[${funcIndex}]:0x${pcOffset}` | ||
Where | ||
* `${url}` is the URL associated with the module (e.g. via a response object), if any. | ||
* `${funcIndex}` is an index the [function index space](https://github.com/WebAssembly/design/blob/master/Modules.md#function-index-space). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: index into the ...
Also, the URL can be relative (just "Modules.md#...").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
I am still a little fuzzy about the Edit: I just noticed the part about the |
Web.md
Outdated
* `${pcOffset}` is the offset in the module binary of the first byte of the instruction, printed in hexadecimal with lower-case digits. | ||
* `${url}` is the URL associated with the module (e.g. via a response | ||
object), or other module identifier (see notes). | ||
* `${funcIndex}` is an index the [function index space](Modules.md#function-index-space). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: missing "in"
Web.md
Outdated
|
||
Notes: | ||
* The URL field may be interpreted differently depending on the context. For | ||
example offline tools may use a file name; or when the ArrayBuffer-based | ||
`WebAssembly.instantiate` API is used in a browser, it may display the | ||
location of the API call instead. | ||
location of the API call instead. It should not be empty however; a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API calls may not have useful source locations either, e.g. when performed as part of an eval
call.
Web.md
Outdated
|
||
Notes: | ||
* The URL field may be interpreted differently depending on the context. For | ||
example offline tools may use a file name; or when the ArrayBuffer-based | ||
`WebAssembly.instantiate` API is used in a browser, it may display the | ||
location of the API call instead. | ||
location of the API call instead. It should not be empty however; a | ||
developer should be able to write their code such that modules from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting that programmers should be able to rely on unambiguous location URLs? If so, I don't think that can work in general, e.g. for the aforementioned reason, but also because of URL ambiguities in general. I would rather drop that half-sentence.
@Cellule @rossberg-chromium re:URLs So the obvious case is for the response APIs where there's a real non-data URL, which we hope will be the most common.
|
I don't really like option 2 since it seems likely that with e.g. dynamic loading we'll have several linked modules all instantiated from the same API call (especially if we have an |
One thing to note is that currently for eval browsers have all sorts of heuristics that generate "URLs" for stack traces. (E.g. I've seen "<eval code> in https://example.com/page.html" as a "URL".) I'm not sure whether browsers would want to either reuse that logic for wasm, or if maybe the community wants to crack down and only allow interoperable well-specified URLs for this new technology. (Possibly including such generated "URLs" in the future, but only if we get them specified so they can be implemented interoperably.) |
If modules can have names inside them, it makes sense to prefer that over the location of an API call. And maybe even over the URL (although then it would be asymmetric with JS locations which do use URLs instead of other names). So I could go either way on whether a URL or module name should be preferred. But in any case, having the name available will at least ensure that a developer can always specify something of their choosing. |
(And I've not attempted to specify here what anyone should do if in the presence of |
Thanks for writing this up! Sorry for taking so long to get back to it (it always takes some time to page everything in). So the module name is a new (but good) twist to consider. I think, even if a module name is supplied, you'd still always want to include the URL (since it's not redundant info). So what if instead:
What I like is that this keeps all the names from the name section to the left of the |
@lukewagner I like that idea; however:
|
|
To be clear, my point in bringing up eval was that the from-ArrayBuffer APIs are analogous in terms of the kind of "source URLs" they might generate, in response to
eval() of code that calls the from-ArrayBuffer APIs is yet another level of complication (similar to eval() of code that calls eval()), but I wasn't intending to discuss that. |
@domenic I think I agree with what you're saying, but I'm not sure if you're disagreeing with my more-recent comments :) To be clear, I'm suggesting that if, e.g., you're SM and already have URLs like I like the idea of trying to be even more compatible, but this seems hard if there's already a diverging precedent for eval and wasm from-ArrayBuffer APIs can be called from within eval. |
No disagreement; that sounds right! And yeah, it's not clear what the right answer is here, besides just speccing something like "engines should treat these APIs like they do eval for purposes of source locations". We can then hope that in the future someone takes on the heroic task of nailing down the stack trace format, including what happens with eval, for ES, and then wasm can just copy that work. |
Good suggestions; I've tried to capture it, PTAL |
Looks great, although there might be some mismatched parens in the example :) |
Web.md
Outdated
|
||
Names of functions may also be displayed if the module contains a | ||
["name" section](BinaryEncoding.md#name-section); | ||
these can be used in the same contexts as JavaScript functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "... same contexts as JavaScript function names".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I missed the reply in email; sorry for taking so long to get back and thanks for applying all the changes! lgtm with two small nits
Web.md
Outdated
not specify the full format of strings such as stack frame representations; | ||
this allows engines to continue using their existing formats for JavaScript | ||
(which existing code may already be depending on) while still printing | ||
WebAssembly frames in a format consistent with JavaScript. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add a Note saying somewhere that these conventions do not describe the value of the .name
property of exported WebAssembly functions which is precisely [defined](JS.md#exported-function-exotic-objects) to be ToString(function-index)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha good point. Would we want a way to map one to the other as a standalone function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You you mean like some new JS API for producing the module_name.func_name
? That seems possible, but it also makes the names section (more) semantically visible (than before), so I guess it depends on what our use case is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. My thinking it: we already let developers access the name section so they don't have to parse their own module... but then they need to parse the name section to get that information! Cut the middle-person. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense if client code would otherwise be doing their own binary parsing that we've already done. With the other Module reflection methods, our motivating use case was module loaders (and specific experience with incorporating wasm into SystemJS). It'd be nice to have some specific user who is wanting to programmatically access these function names.
Anyhow, this probably belongs in a different issue.
I think @dschuff is out for a bit, so I took the liberty of applying my review requests to the PR. I also turned the naming paragraph into nested bullets so it was a bit easier to see the if/else structure. |
Any last comments before merging? |
This has probably been discussed elsewhere, but... That would have the advantage of being independent of the current representation of a module (binary, text or some other data structure). As it is written, it seems one has to keep the particular byte stream around just to properly format a stack trace. In principle an absolute binary offset can be used without having to really understand the binary encoding, but it isn't immediately clear that this is an advantage here, because tooling that has access to the byte stream and wants to do anything useful with that information (like displaying the trapping instructions) needs to be able to parse function bodies and possibly convert them into textual representation in any case. |
The 'function n' part is already explicitly present in this PR via From my experience, and from the previous discussion in #990, I think the bytecode index is simpler for everyone. For the engine compiling a trapping instruction, it's quite easy to just save (compile into the fail path, save in trap metadata, etc) the "current" bytecode offset for later trap reporting; no need to save bytecode. If we had to report instruction index, we'd have to maintain an additional instruction-counter that was incremented after decoding each instruction and this could both be a source of rare corner-case bugs and a mild source of decoding slowdown. For wasm producers, tools like wabt have a Also, wasm source maps are currently being proposed to map bytecode offsets to source via bytecode offset and this would provide what developers really want which is errors in terms of source code location. |
@lukewagner Thanks for helping push this along! I do think that using some abstraction other than byte offset is an interesting idea worth considering; we currently have some discussion in #1064 and here, so maybe we should merge this and file a separate issue or PR for that question specifically. If we were to switch, it would just replace the |
Based on the discussion in #990