-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: remove excess optimization for sourcemap #5547
Conversation
cd260b6
to
661fe5e
Compare
Hi Maintainer,
outputs:
outputs:
|
526aaed
to
c0b040d
Compare
c0b040d
to
50e1f1b
Compare
@@ -133,6 +133,7 @@ | |||
(i32.const 0) | |||
) | |||
) | |||
;;@ fib.c:3:0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand why this addition is correct. So I turned the sourcemap to json, and there are three entries for line 3:
{
"generatedLine": 1,
"generatedColumn": 643,
"lastGeneratedColumn": null,
"source": "fib.c",
"originalLine": 3,
"originalColumn": 0,
"name": null
},
{
"generatedLine": 1,
"generatedColumn": 686,
"lastGeneratedColumn": null,
"source": "fib.c",
"originalLine": 3,
"originalColumn": 0,
"name": null
},
{
"generatedLine": 1,
"generatedColumn": 693,
"lastGeneratedColumn": null,
"source": "fib.c",
"originalLine": 3,
"originalColumn": 0,
"name": null
},
Before this PR there are two locations where line 3 is mentioned in the .fromBinary
file here. So I'd expect one more to appear, to get to 3 appearances to match the source map. But this PR adds more than that. Am I looking at this wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken Hi Alon,
sourceMap
location is not one by one equal to the debug location of expression. it is a range concept.
to be short, if expression offset is between previous source map debug location offset and current source map debug location offset, this expression will take the previous source map debug location.
before this change, the debug location of siblings element just were not printed, but it existed in the functionRef->debugLocations
, so this change just print it and it is more clear to see which instruction is under which location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if remove the filter condition of printDebugLocation
of Print.cpp
.
it will print all locations to all expressions, and it is more clear to see the location information, and just for debug, print all debug location is not neccessary I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the range concept idea documented somewhere? I'm not familiar with that. I had assumed source maps are just individual locations, but I could be wrong...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Alon,
I didnot find the documentation of the solution, this may come from PR:
#1017
Here is the read and write flow:
wasm-opt
read:
wasm-s-parser.h
function:SExpressionParser::parseDebugLocation
it will updateloc
ofSExpressionParser
, so the after expression will take this position until theloc
will updated with another debug location.- binary write:
WasmBinaryWriter::writeSourceMapEpilog
writeBase64VLQ(*sourceMap, int32_t(offset - lastOffset));
writeBase64VLQ(*sourceMap, int32_t(loc->fileIndex - lastLoc.fileIndex));
writeBase64VLQ(*sourceMap, int32_t(loc->lineNumber - lastLoc.lineNumber));
writeBase64VLQ(*sourceMap,
int32_t(loc->columnNumber - lastLoc.columnNumber));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken
Hi Alon,
I suppose the primary design for source map is shrink the size of debug info like DWARF, but source map is lack the control instructions, so do you think we should back to point to point
solution for source map debug info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JesseCodeBones , sorry for the late response. I don't know that much about this and was hoping someone else would know...
I think you might be right, that there are ranges here, since we keep using the same value until it is updated. I was not aware of that before. But since that is how it is, we shouldn't change it, I guess.
To move forward here, getting back to the start of this thread, I think it would be helpful to explain the diffs in this PR. That is, I'm still not sure how to read the diff here and check that it is correct. Something like "this annotation appeared here because of this part of the sourcemap" would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken
Hi Alon,
the changes contains two parts:
part 1: during printing, if parent ancestors node already printed the debug info, skip print
part 2: during source map generation, only ancestors generate the source map debug info
I am not sure if it is clear, if you have any question, please feel free to ask me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand that, and the code sort of makes sense. But I am having specific trouble in reading the diffs in the testcase specifically. That is where I need help. I'll add a comment on a specific part of the test diff now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, this is already on a test diff 😄 Then, yeah: how can I see that this addition on line 136 is a correct one to add? I'm not sure how to read the sourcemap itself and see that it says there should be an annotation here.
@@ -2547,6 +2548,10 @@ struct PrintSExpression : public UnifiedExpressionVisitor<PrintSExpression> { | |||
Module* currModule = nullptr; | |||
Function* currFunction = nullptr; | |||
Function::DebugLocation lastPrintedLocation; | |||
uint32_t lastPrintedIndent = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will cache last print indent as the depth of last printed instruction
@kripken
if (func != nullptr) { | ||
const auto& debugLocationIterator = func->debugLocations.find(curr); | ||
if (debugLocationIterator != func->debugLocations.cend()) { | ||
if (lastDebugLocation != debugLocationIterator->second || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the function debug info emit, we add the depth logic, if the parent or siblings instruction already inserted the debuginfo, the instruction will not emit the debug info in source map generation, it will not impact the debug info reload from source map (children and siblings instructions will also get the debug info when load them from source map to memory structures), just optimize the generation of source map.
@kripken
// ) | ||
// ;; debug location | ||
// (siblings) | ||
if (lastPrintedLocation == location && indent > lastPrintedIndent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken
Hi Alon,
the easiest way to check the debug info is remove this filter for debugging, here is my step:
1, remove this condition return
2, build binaryen
3, run cmd: ./build/bin/wasm-opt ./test/fib-dbg.wasm -ism ./test/fib-dbg.wasm.map --print
then wasm-opt will print all debug info.
so you can check if the new added annotation of debug info is reasonable or not.
if any other question, please feel free to ask, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken
Hi Alon,
here is the wat file(fib-dbg.wasm.fromBinary) will full debug info printed:
(module
(type $i32_=>_i32 (func (param i32) (result i32)))
(type $i32_i32_=>_none (func (param i32 i32)))
(type $none_=>_i32 (func (result i32)))
(type $i32_=>_none (func (param i32)))
(type $none_=>_none (func))
(import "env" "memory" (memory $mimport$0 256 256))
(import "env" "table" (table $timport$0 0 0 funcref))
(import "env" "DYNAMICTOP_PTR" (global $gimport$0 i32))
(import "env" "tempDoublePtr" (global $gimport$1 i32))
(import "env" "ABORT" (global $gimport$2 i32))
(import "env" "STACKTOP" (global $gimport$3 i32))
(import "env" "STACK_MAX" (global $gimport$4 i32))
(import "env" "gb" (global $gimport$5 i32))
(import "env" "fb" (global $gimport$6 i32))
(import "global" "NaN" (global $gimport$7 f64))
(import "global" "Infinity" (global $gimport$8 f64))
(import "env" "memoryBase" (global $gimport$9 i32))
(import "env" "tableBase" (global $gimport$10 i32))
(global $global$0 (mut i32) (global.get $gimport$0))
(global $global$1 (mut i32) (global.get $gimport$1))
(global $global$2 (mut i32) (global.get $gimport$2))
(global $global$3 (mut i32) (global.get $gimport$3))
(global $global$4 (mut i32) (global.get $gimport$4))
(global $global$5 (mut i32) (global.get $gimport$5))
(global $global$6 (mut i32) (global.get $gimport$6))
(global $global$7 (mut i32) (i32.const 0))
(global $global$8 (mut i32) (i32.const 0))
(global $global$9 (mut i32) (i32.const 0))
(global $global$10 (mut i32) (i32.const 0))
(global $global$11 (mut f64) (global.get $gimport$7))
(global $global$12 (mut f64) (global.get $gimport$8))
(global $global$13 (mut i32) (i32.const 0))
(global $global$14 (mut i32) (i32.const 0))
(global $global$15 (mut i32) (i32.const 0))
(global $global$16 (mut i32) (i32.const 0))
(global $global$17 (mut f64) (f64.const 0))
(global $global$18 (mut i32) (i32.const 0))
(global $global$19 (mut i32) (i32.const 0))
(global $global$20 (mut i32) (i32.const 0))
(global $global$21 (mut f64) (f64.const 0))
(global $global$22 (mut i32) (i32.const 0))
(global $global$23 (mut f64) (f64.const 0))
(export "setThrew" (func $setThrew))
(export "runPostSets" (func $runPostSets))
(export "establishStackSpace" (func $establishStackSpace))
(export "stackSave" (func $stackSave))
(export "stackRestore" (func $stackRestore))
(export "_fib" (func $_fib))
(export "stackAlloc" (func $stackAlloc))
(func $stackAlloc (param $0 i32) (result i32)
(local $1 i32)
(block $label$1
(local.set $1
(global.get $global$3)
)
(global.set $global$3
(i32.add
(global.get $global$3)
(local.get $0)
)
)
(global.set $global$3
(i32.and
(i32.add
(global.get $global$3)
(i32.const 15)
)
(i32.const -16)
)
)
(return
(local.get $1)
)
)
)
(func $stackSave (result i32)
(return
(global.get $global$3)
)
)
(func $stackRestore (param $0 i32)
(global.set $global$3
(local.get $0)
)
)
(func $establishStackSpace (param $0 i32) (param $1 i32)
(block $label$1
(global.set $global$3
(local.get $0)
)
(global.set $global$4
(local.get $1)
)
)
)
(func $setThrew (param $0 i32) (param $1 i32)
(if
(i32.eq
(global.get $global$7)
(i32.const 0)
)
(block
(global.set $global$7
(local.get $0)
)
(global.set $global$8
(local.get $1)
)
)
)
)
(func $_fib (param $0 i32) (result i32)
(local $1 i32)
(local $2 i32)
(local $3 i32)
(local $4 i32)
(local $5 i32)
(local $6 i32)
(local $7 i32)
(local $8 i32)
(local $9 i32)
(local $10 i32)
(local $11 i32)
(block $label$1
(local.set $11
(global.get $global$3)
)
;;@ fib.c:3:0
(local.set $6
;;@ fib.c:3:0
(i32.gt_s
;;@ fib.c:3:0
(local.get $0)
;;@ fib.c:3:0
(i32.const 0)
)
)
;;@ fib.c:3:0
(if
;;@ fib.c:3:0
(local.get $6)
(block
;;@ fib.c:3:0
(local.set $1
;;@ fib.c:3:0
(i32.const 0)
)
;;@ fib.c:3:0
(local.set $5
;;@ fib.c:3:0
(i32.const 1)
)
;;@ fib.c:3:0
(local.set $8
;;@ fib.c:3:0
(i32.const 0)
)
)
(block
;;@ fib.c:3:0
(local.set $4
;;@ fib.c:3:0
(i32.const 1)
)
;;@ fib.c:8:0
(return
;;@ fib.c:8:0
(local.get $4)
)
)
)
;;@ fib.c:8:0
(loop $label$4
;;@ fib.c:8:0
(block $label$5
;;@ fib.c:4:0
(local.set $3
;;@ fib.c:4:0
(i32.add
;;@ fib.c:4:0
(local.get $5)
;;@ fib.c:4:0
(local.get $1)
)
)
;;@ fib.c:3:0
(local.set $9
;;@ fib.c:3:0
(i32.add
;;@ fib.c:3:0
(local.get $8)
;;@ fib.c:3:0
(i32.const 1)
)
)
;;@ fib.c:3:0
(local.set $7
;;@ fib.c:3:0
(i32.eq
;;@ fib.c:3:0
(local.get $9)
;;@ fib.c:3:0
(local.get $0)
)
)
;;@ fib.c:3:0
(if
;;@ fib.c:3:0
(local.get $7)
(block
;;@ fib.c:3:0
(local.set $4
;;@ fib.c:3:0
(local.get $3)
)
;;@ fib.c:3:0
(br $label$5)
)
(block
;;@ fib.c:3:0
(local.set $2
;;@ fib.c:3:0
(local.get $5)
)
;;@ fib.c:3:0
(local.set $5
;;@ fib.c:3:0
(local.get $3)
)
;;@ fib.c:3:0
(local.set $8
;;@ fib.c:3:0
(local.get $9)
)
;;@ fib.c:3:0
(local.set $1
;;@ fib.c:3:0
(local.get $2)
)
)
)
;;@ fib.c:3:0
(br $label$4)
)
)
(return
;;@ fib.c:8:0
(local.get $4)
)
)
)
(func $runPostSets
(local $0 i32)
(nop)
)
;; custom section "sourceMappingURL", size 35, contents: "\"http://localhost:8000/fib.wasm.map"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks, and sorry for the delay.
I'll look at this again soon, but first, I have recently been debugging another source maps issue (a user reported the wrong annotation somewhere), and so I read the source maps spec again and then our code. I have to say I am still a little confused on it. It seems like the spec says nothing about ranges or how to infer debug locations for things without an annotation, but Binaryen does do that - it will keep using a debug location after it sees one, until the next. I am not sure if that is correct or not, and I'm not sure if that is related to this PR or not, but the binary format parts seem like they might be? I am quite worried about changing it, because of how it might affect other users...
For this PR specifically, since I would like to move forward on it: Is it possible to split the Print parts from the binary format parts? I think there is a low bar to landing a Print improvement, but a high bar for a binary format change (again, because users could be affected by the latter). So if we can split it, that would make review simpler, and also allow the Print part to land quickly I hope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken
yep, I will splite the PR and suppose the print PR will comming soon
Removing the excess optimization for source map generation and print.
Followup to: 5504 and 5524