First draft of DebugInfomationFormat.md. #1
section. In the binary format, this will be an unknown section inserted into
the wasm file. Because this new info will make the current
[name section](BinaryEncoding.md#name-section) redundant, we propose deprecating
the name section.
The name section is aimed at debugging wasm at the wasm level, which is a related but independent concern that is not redundant with debugging at the high-level-language level.
Ah, I see. Definitely not redundant, then. I'll drop this sentence.
(Line notes tend to disappear on GH, so I'll put my comment here, even though it's extending the convo in line notes above.) First of all, thanks for posting this and kicking off a discussion! I think @sunfishcode raises an important "meta" topic that would help set the context for discussing this particular proposal. In general, I see there being a spectrum of expressiveness/user experience/complexity here:
I see the value of 1 (just enough to provide useful backtraces in both devtools and Error.stack). 2 has potential in that it could leverage existing devtool work supporting sourcemaps. 3 vs. 4 has been a question for a while and I lean towards 4 simply because specifying and implementing 3 seems like a huge undertaking (universal debugging format sounds even harder than universal VM :). IIUC, this proposal is starting closer to 2 but is talking about extending toward 3. Is that accurate? If so, I'd be interested to talk about 3 vs. 4 since, if we think 4 is the better long-term strategy, then that could set the context for this proposal as being more squarely focused at 2.
Thanks, @lukewagner. The way I (very imperfectly) understand things, 1 and 2 are much better suited to debugging compiled code than source code. For instance, I don't know how these would support showing a source variable value. I'm warming up to 4 because of its flexibility and extensibility, but I need to understand more details. For starters, how do we express arbitrary debug info in the text format? And the browser API will have to be bidirectional: dev tools UI would tell the debug library, say, that the user wants to set a breakpoint in some source location, and, conversely, the debug library would have to ask the browser for current wasm memory contents. We'll have to spell out those details and evaluate how future-proof they are. Interestingly, this proposal could be interpreted as one specific instance of the general strategy suggested by 4. If we adopted 4, this could be the first of multiple schemes for encoding and interpreting debug info.
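To make the bidirectional flow concrete, here is a minimal Python sketch. Every name in it (`FakeHost`, `DebugLibrary`, `read_memory`, `set_wasm_breakpoint`, the line table, the variable map) is hypothetical and comes from neither a browser API nor this proposal. It only shows requests travelling in both directions: the UI asks the debug library for a source-level breakpoint, and the debug library asks the host for wasm memory.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHost:
    """Stand-in for the browser side (hypothetical interface)."""
    memory: bytearray
    breakpoints: set = field(default_factory=set)

    def read_memory(self, address, size):
        # Debug library -> browser: fetch raw wasm linear memory.
        return bytes(self.memory[address:address + size])

    def set_wasm_breakpoint(self, bytecode_offset):
        # Debug library -> browser: break at a wasm bytecode offset.
        self.breakpoints.add(bytecode_offset)


class DebugLibrary:
    """Stand-in for a user-space debugger that owns the debug info."""

    def __init__(self, host, line_table, variables):
        self.host = host
        self.line_table = line_table  # (file, line) -> bytecode offset
        self.variables = variables    # variable name -> memory address

    def set_source_breakpoint(self, file, line):
        # UI -> debug library: translate a source location to an offset.
        offset = self.line_table[(file, line)]
        self.host.set_wasm_breakpoint(offset)
        return offset

    def read_i32_variable(self, name):
        # UI -> debug library -> browser: show a source variable's value.
        raw = self.host.read_memory(self.variables[name], 4)
        return int.from_bytes(raw, "little", signed=True)
```

In this sketch the debug info never leaves the debug library; the host only ever sees bytecode offsets and memory addresses, which is the property that would let the format stay opaque to the browser.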
@dekimir You're absolutely right that, with 1, you're basically just adding names to assembly code (just like adding a symbols section to native binary), and 2 breaks down pretty quickly, so we can't stop there. That being said, for many quick-and-dirty tasks, 1 and 2 may be sufficient and their low(er) size overhead may make them attractive for some use cases.
I've been thinking as a black box: a
Right, I've been thinking of these as two completely separate interfaces:
Agreed that, given 4, your proposal could be fully implemented in "user space". (It'd probably be useful to have it as a leading prototypical use case, too, since it'd be much easier to get up and running than, say, full DWARF.)
@lukewagner my question was how would .wast represent that big
Oh, I see, sorry. So I guess it depends what you're doing. If you're writing the tool that generates the debug info, then I think we want something like what @yurydelendik asked for a while ago where you can tag various nodes in the .wast with a label and a .wast-to-.wasm tool would also spit out (somehow, separate file or new section) a tag-to-bytecode-offset map. Then you could use this with whatever debug info you extracted from LLVM or whatever tool you used to generate the initial .wast to produce the binary debug info section of the final .wasm. There is the separate question of how to render a .wasm that contains a user-space debug info section; I don't think there's a lot we can do here since it's all opaque by definition; I'd suggest just some string encoding like we currently have for data segments. Particular toolchains with support for a particular debug format could render their debug info to a particular text format, I suppose, but this would be separate from the wasm-spec-defined text format.
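A rough Python sketch of both ideas, under stated assumptions: `assemble` is a toy stand-in for a .wast-to-.wasm tool that takes already-encoded nodes (the byte values are placeholders, not a claim about the real encoding) and records the offset of every tagged node, and `escape_bytes` renders an opaque section as a backslash-hex string in the style used for data segments. Both function names and the tag names are hypothetical.

```python
def assemble(nodes):
    """Concatenate encoded nodes and record where each tagged node starts.

    nodes: list of (tag_or_None, encoded_bytes) pairs, in emission order.
    Returns (bytecode, tag_offsets).
    """
    out = bytearray()
    tag_offsets = {}
    for tag, encoded in nodes:
        if tag is not None:
            tag_offsets[tag] = len(out)  # offset of this node's first byte
        out += encoded
    return bytes(out), tag_offsets


def escape_bytes(data):
    """Render opaque bytes as a quoted string of backslash-hex escapes."""
    return '"' + "".join("\\{:02x}".format(b) for b in data) + '"'


# Three tagged nodes with placeholder encodings.
code, offsets = assemble([
    ("a", b"\x41\x01"),
    ("b", b"\x41\x02"),
    ("sum", b"\x6a"),
])
```

A debug-info generator could then join `offsets` against whatever source locations it extracted from the compiler, without the tags ever being part of the official text language.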
@lukewagner: sounds like we're talking about .wast not being fully equivalent to the corresponding .wasm? IOW, the front end could output a .wasm containing debug info in a dedicated section, but the corresponding .wast wouldn't have it? And if the .wast contains a string encoding of the debug info, then don't the @ tags lose their purpose? Presumably the encoded info would remain valid even if the AST had no tags, right?
Right, if we have a "black box" debug-info section, I'm not sure there's a lot the .wast can do other than say "here are the bytes". If we just did what I described above, then the @tags wouldn't be part of the official text language, they'd just be a tool to extract byte offsets from an assembler.
META: We need an evolution strategy that allows new front-end/debugger pairs to
use the format in the future to transfer information currently unanticipated.
Examples: a) dynamic scoping in Lisp; b) full DWARF 4 equivalence.
Full DWARF 4 seems pretty huge? Isn't that a Turing-complete language? :)
Would you have a specific DWARF feature to point at instead?
These examples are just to jog one's imagination for where future evolution may take us. They're not meant to suggest that we intend to go in that direction, just that we don't want to gratuitously prevent it. Another example could be "enough of DWARF to allow stock gdb to run on wasm modules".
Anyway, see the discussion below -- we are moving to a different, more flexible design that allows any debug-info format to be used. I'll mothball this PR soon and create another one with the new design description.
Here's another question that occurred to me: if the debug-info format is not fixed, and there are different debuggers out there (which I think is a fair assumption given the state of JS debugging today), how do we ensure that the user needs to compile their source only once? IOW, how do we avoid requiring something akin to
I think, in the general case, we want a wasm module to contain the URL of its debugger that runs portably on all the browsers using the two portable interfaces we described above. Maybe after some time there will be a de facto standard that browsers can build in debuggers for as a convenience/optimization, but I think we shouldn't start with that since it could lead to precisely the type of problems you're describing.
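One way a module could carry that URL is in a named, otherwise opaque section. The Python sketch below is illustrative only: the section name `debugger_url` and the example URL are hypothetical, and the layout it assumes (8-byte header, then sections of id byte, LEB128 size, and, for id 0, a length-prefixed UTF-8 name) is the one the binary format later standardized for custom sections, not something fixed in this thread.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_uleb128(data, pos):
    """Decode unsigned LEB128 at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def custom_section(name, payload):
    """Build a section with id 0, a name, and an opaque payload."""
    name_bytes = name.encode("utf-8")
    body = encode_uleb128(len(name_bytes)) + name_bytes + payload
    return b"\x00" + encode_uleb128(len(body)) + body


def find_custom_section(module, wanted):
    """Return the payload of the first id-0 section named `wanted`, or None."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a wasm module")
    pos = 8  # skip magic and version
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        end = pos + size
        if section_id == 0:
            name_len, name_start = read_uleb128(module, pos)
            name_end = name_start + name_len
            if module[name_start:name_end].decode("utf-8") == wanted:
                return module[name_end:end]
        pos = end  # sections we do not understand are skipped by size
    return None


# Hypothetical module: header plus one section naming the debugger.
url = b"https://example.com/dbg.js"
module = b"\x00asm\x01\x00\x00\x00" + custom_section("debugger_url", url)
```

Because unknown sections are skipped by size, an engine that knows nothing about the debugger can still load the module, while devtools can look up the URL and fetch the portable debugger on demand.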
Superseded by WebAssembly#708.
@yurydelendik @sunfishcode @jfbastien @dschuff: PTAL