Display in browsers and tools for addresses of functions and instructions #990
Comments
I agree 2 and 4 are basic things that would be great to agree on. Regarding 2, offsets relative to the entire binary file seem most natural and simple - there is no ambiguity, no offsets to add when looking in the binary, etc. And regarding hex vs. dec, no strong feelings but JS line numbers are dec, so maybe nice to be consistent with that. |
And now, in which I suggest a color for the bikeshed and tradeoffs: I think I'd prefer to see function index (and name if present) as well as an absolute offset (instead of an offset from the function start). Using a section offset would be more conventional (for the tools) compared to existing tools like objdump, but I agree that in the browser it would require printing yet another number or name, which wouldn't be great. I don't have a strong opinion on that. Also wrt base: hex is more common for tools, but decimal is more common for browsers. Naturally I prefer everything to look as much like unix tools from the 60s as possible 😁 |
I'm usually very opinionated on things, but in this particular issue I don't have strong feelings. I'd like to give the browser vendors some time to experiment--maybe try things we're not thinking of--in a competitive effort to provide a great debugging solution. |
We wanted to take a pass over all the error messages that V8 generates and make them a bit nicer. Happy to converge with others here. My thoughts:
I think the intent is for this agreement to cover those stacktraces, as well as the compile-time errors. Was there some particular reason you chose function-offset rather than file-offset for that? I think my preference would be for file or section offset but honestly that preference is based more on what it would look like in tooling rather than browsers. |
The function offset is preferable for stack traces, since it is more stable. E.g. I am thinking of tools that use stack traces of exceptions to track different kinds of software defects (like my previous project). It's nicer to have a more stable function-relative offset for those use cases. That said, it is also good to see the byte offset in the module when debugging producers. That's why the V8 --trace-wasm-decoder flag outputs both.
I'm not sure I buy that function offsets are more stable. If you recompile a binary, both function offsets and file offsets are subject to arbitrary changes (not to mention that function index values or names may change or functions may disappear entirely). If you want to make any sense of a stack trace you need a copy of the exact binary in any case, no? |
It depends on the optimization level, of course. But for a given small change to a C++ function, unless that function is inlined everywhere, or somehow otherwise triggers different inlining or optimization decisions in other functions, those other functions' bytecode should be stable. |
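To make the stability argument concrete, here is a minimal sketch in Python. The function sizes and the starting offset are made-up numbers, and it assumes exactly the case described above: one function changes size while the bytes of the other functions stay the same.

```python
def body_starts(first_start, sizes):
    """Return the module offset at which each function body starts,
    given the offset of the first body and each body's size in bytes."""
    starts = []
    pos = first_start
    for size in sizes:
        starts.append(pos)
        pos += size
    return starts


# Hypothetical module: three function bodies laid out back to back.
before = body_starts(0x40, [26, 55, 159])
# Same module after a small change made function 1 five bytes longer.
after = body_starts(0x40, [26, 60, 159])

# A trap 20 bytes into function 2, reported both ways.
FUNC_OFFSET = 20
module_offset_before = before[2] + FUNC_OFFSET
module_offset_after = after[2] + FUNC_OFFSET
```

The function-relative offset is 20 in both builds, while the module offset moves from 165 to 170. If the change also alters inlining or optimization in function 2, neither number survives, which is the counterargument raised earlier.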
I like hex over dec. I also like file offsets; you can use the same offset-space for any error. IIRC, @sbc100 modeled wasmdump after objdump, which appears to use file offsets, not section offsets:
It looks like label+offset is used as well:
OK, let's not consider section offsets then. I still have a preference for file/module offsets over function offsets, but I could be convinced. Do engine implementors think that one or the other of those would carry extra cost (e.g. memory or bookkeeping) in the engine? It's pretty easy for the offline tools to just do whatever. WRT hex vs dec split it might make sense to use hex if we do file offsets (they are like "addresses" and correspond more directly to how addresses are used elsewhere) and decimal if we do function offsets, just to signal that this is not exactly an address. |
For the --trace-wasm-decoder output, I think it makes sense to do what @binji has done and follow the objdump output format. For decoding error messages, function index + function name + function offset + module offset would probably be the most useful. For stacktraces, I think function index + function name + function offset is sufficient, as I don't see how the module offset would be particularly useful for runtime errors. WDYT? |
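A rough sketch of what those two field sets could look like when rendered. The `WasmLocation` class and the exact string layout are hypothetical, not something any engine emits. It follows the suggestion earlier in the thread of decimal for function offsets and hex for module offsets, and borrows the `<?>` placeholder from the V8 message for a missing name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WasmLocation:
    """Hypothetical container for the fields discussed in this thread."""
    func_index: int
    func_offset: int                     # bytes from the start of the function body
    func_name: Optional[str] = None      # None when the module has no name for it
    module_offset: Optional[int] = None  # bytes from the start of the module

    def format(self) -> str:
        name = self.func_name if self.func_name is not None else "<?>"
        # Decimal for the function-relative offset: it is not an address.
        text = f"func[{self.func_index}] {name} +{self.func_offset}"
        # Hex for the module offset, so it lines up with a hex dump.
        if self.module_offset is not None:
            text += f" @0x{self.module_offset:x}"
        return text


# Stack trace entry: index + name + function offset.
stack_entry = WasmLocation(func_index=3, func_offset=156).format()
# Decode error: the same fields plus the module offset.
decode_error = WasmLocation(26, 58, "main", 0x1076).format()
```

Here `stack_entry` is `func[3] <?> +156` and `decode_error` is `func[26] main +58 @0x1076`.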
Module offsets in hex are the only thing useful if you are looking at hex dumps of the binary. At least that's what I found when debugging the spec interpreter (which initially used decimal). Having function offsets or decimal is fine, too, but only in addition to the other.
@titzer: I don't see the distinction between error messages and runtime stacktraces in that context, the module offset seems very useful in both. (I personally don't see much value in the function offset, but am not opposed to having it as well.) |
I am against 6 (mandating precise exception reporting). Mostly because this makes bce much harder on 32 bit platforms. I also think whatever we do should fall short of normative text on error.stack output. I wonder too if it's even desirable to fully converge. Inevitably that means people parsing the text. And then for compat, the suggested non-normative text may become de facto standard (and unchangeable). |
I definitely like the idea of standardizing on a set of fields (that get embedded in the string) and their meaning. So for a given callstack entry, I can see these fields all being meaningful:
For Given the existing JS For the remaining three fields, it seems like we have the opportunity to define a shared string format to replace
? Some reasons:
Thoughts? |
@MikeHolman Agreed on precise offsets. I think we actually already agreed to this in the middle of #814. But w.r.t best-effort stack strings, I think server-side crash reporting/triage is a valid use case for |
Fair point. Error bucketing (e.g. Watson) proves very helpful for me, so it would be bad to neglect it. How do people do this in js? |
IIUC, they just grab an |
OK, time to push this forward a little more. I like @lukewagner's suggestion, so carrying that out a little more.
In browsers, text representations of locations would be something like
For wasm,
For tools such as wasmdump (and in some browser use cases such as dev tools) it won't always make sense to have that whole location string together. For example if you are dumping a file, the filename would replace the URL but it wouldn't make sense to print it over and over again. You might have a format that includes a subset, formatted in the same format and order that it appears elsewhere. Likewise in devtools UI.
This covers the goals:
A few open questions:
Currently if you have a trap or validation error in your wasm binary, V8 gives you error output like
Compiling WASM function #26:<?> failed:Result = expected 1 elements on the stack for fallthru to @3 @+58
or
at (<WASM>[3]+156)
SpiderMonkey says something like
CompileError: wasm validation error: at offset 4214: type mismatch: expression has type void but expected i32
Meanwhile if you want to look at your binary with WABT's wasmdump tool, you will see something like
WABT shows functions as indices in the (import+function) function index space, and all the "addresses" as hexadecimal file offsets.
If I understand correctly, V8 is giving you function index (in the same function index space) + decimal function offset in the stack representation and... something else... in the validation error text.
SpiderMonkey gives you no function name or index, and a (file?) offset.
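As an illustration of how differently the two engines report a location, here is a minimal Python sketch that extracts the numbers from the sample messages quoted above. The regular expressions are written against those two samples only and are an assumption, since actual engine output may vary between versions. Treating SpiderMonkey's number as a module (file) offset is likewise a guess.

```python
import re

# Matches the V8 stack-frame sample: at (<WASM>[3]+156)
V8_FRAME = re.compile(r"<WASM>\[(\d+)\]\+(\d+)")
# Matches the SpiderMonkey sample: ... at offset 4214: ...
SM_OFFSET = re.compile(r"at offset (\d+)")


def parse_location(message):
    """Return the location fields found in a message, or None if neither pattern matches."""
    m = V8_FRAME.search(message)
    if m:
        # Function index plus decimal function-relative offset.
        return {"func_index": int(m.group(1)), "func_offset": int(m.group(2))}
    m = SM_OFFSET.search(message)
    if m:
        # Assumed to be a decimal offset from the start of the module.
        return {"module_offset": int(m.group(1))}
    return None


v8 = parse_location("at (<WASM>[3]+156)")
sm = parse_location(
    "CompileError: wasm validation error: at offset 4214: "
    "type mismatch: expression has type void but expected i32"
)
```

The two results have no fields in common, so they cannot be compared without knowing where function 3 starts in the binary. In the hex notation WABT uses, the SpiderMonkey offset 4214 is 0x1076.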
So to debug anything you might have to look at the source code of your favorite browser engine to determine what information you are getting, and then use your favorite programmers' calculator to convert between file and function offset and number bases. Needless to say this is a suboptimal user experience.
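The calculator step looks roughly like the sketch below. The table of function body start offsets is made up for illustration; in practice it would have to come from parsing the module's code section, which is not shown. The reverse lookup also assumes the offset does not fall past the end of the last function body.

```python
from bisect import bisect_right

# Hypothetical module offsets at which each function body starts,
# in code-section order. These numbers are invented for the example.
FUNC_BODY_STARTS = [0x40, 0x5A, 0x91, 0x130]


def to_module_offset(func_pos, func_offset):
    """Convert a function-relative offset to an absolute module offset."""
    return FUNC_BODY_STARTS[func_pos] + func_offset


def to_function_offset(module_offset):
    """Convert an absolute module offset to (function position, function offset)."""
    # Find the last function body that starts at or before the offset.
    pos = bisect_right(FUNC_BODY_STARTS, module_offset) - 1
    if pos < 0:
        raise ValueError("offset precedes the first function body")
    return pos, module_offset - FUNC_BODY_STARTS[pos]


# The "[3]+156" form from above, turned into a hex module offset.
absolute = to_module_offset(3, 156)
as_hex = f"0x{absolute:x}"
round_trip = to_function_offset(absolute)
```

With this table, function 3 at decimal offset 156 corresponds to module offset 0x1cc, and converting back yields `(3, 156)`. The position here counts defined functions only, so any imported functions would have to be subtracted from an index in the combined (import+function) index space first.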
#814 had some discussion of normalizing the content of Error.stack, and I understand that's somewhat fraught. My goal here is simultaneously more ambitious (in that we should include tools as well as browsers) and less ambitious (see below).
Here are possible things we might agree on (where the definition of "agree" is also negotiable).
LinkError and CompileError text), and even dev tools.
I think getting 2 and 4 is the absolute bare minimum, so a user can dump their binary and find (an approximation of) the faulting address or (hopefully the exact) invalid address, without having to do base conversions and arithmetic. 1 and 3 would be nice too, and seem like they ought to be pretty easy to agree on and implement. 5 and 6 would in principle be nice too, but as long as the other conditions are met, the incremental benefit is fairly small and the cost might be higher. I'm fine if we want to punt those.
As to the definition of "agree", there is:
Honestly even 4 would be fine. Whatever we do, other tools will probably follow for their own convenience, and users (who aren't going to look for specs unless they are confused) will be happy.
Thoughts?