From 7048eb7d7cc2739077e81965ae91ffd5699d5c9e Mon Sep 17 00:00:00 2001
From: Harry Altman
Date: Thu, 3 Aug 2023 21:14:25 -0400
Subject: [PATCH 1/3] Add mapping key identification to format

---
 docs/source/format.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/source/format.md b/docs/source/format.md
index 1d2e3d76..2f0c3384 100644
--- a/docs/source/format.md
+++ b/docs/source/format.md
@@ -54,6 +54,11 @@ For example, the Solidity compiler will in some cases perform a "tail-call" opti
 the compiler will push the entry point of `f` as the return address for the call to `g`.
 The format should help explicitly identify the targets of internal function calls and what arguments are being passed on the stack.
 
+### Mapping key identification
+
+EVM languages commonly include non-enumerable mappings. As such, it is useful to be able to dynamically identify any mapping keys that may appear
+while analyzing a transaction trace or debugging.
+
 ## The Format
 
 The format will be JSON so that it may be included in the standard input/output APIs that the Vyper and Solidity compilers support.
@@ -229,6 +234,7 @@ is itself a dictionary that (optionally) includes some of the following:
 * The AST ID(s) that "correspond" to the opcode
 * The layout of the stack, including type information and local variable names (if available)
 * Jump target information (if available/applicable)
+* Identification of mapping key information
 
 In the above "correspond" roughly means "what source code caused the generation of this opcode".
 
@@ -238,6 +244,7 @@ that contributed to the generation of this opcode.
 * `ast`: A list of AST ids for the "closest" AST node that contributed to the generation of this opcode.
 * `stack` A layout of the stack as understood by the compiler, represented as a list.
 * `jumps`: If present, provides hints about the location being jumped to by a jumping command (JUMP or JUMPI)
+* `mappings`: If preent, contains information about how the opcode relates to mapping keys.
 
 #### Source Locations
 
@@ -321,3 +328,10 @@ If the value of `sort` is `"return"`, then the dictionary has the following fiel
 * `returns`: A list of dictionaries with the same format of as the `arguments` array of `call`, but without any `return_address` entries.
 
 **Discussion**: The above proposal doesn't really handle the case of "tail-calls" identified at the beginning of this document, where multiple return addresses can be pushed onto the stack. Is that something debug format must explicitly model?
+
+#### Mapping key identification
+
+The value of this field (when present) is a dictionary with (some of) the following fields:
+* `isMappingHash`: A boolean that identifies whether the opcode is computing a hash for a mapping.
+* `isMappingPreHash`: For mappings that use two hashes, this boolean can identify whether the opcode is computing the first of the two hashes. Possibly this field should be combined with a previous one into some sort of enum?
+* `mappingHashFormat`: An enumeration; specifies the format of what gets hashed for the mapping. Formats could include "prefix" (for Solidity), "postfix" (for Vyper value types), and "postfix-prehashed" (for Vyper strings and bytestrings). Possibly "prefix" could be split further into "prefix-padded" (for Solidity value types) and "prefix-unpadded" (for Solidity strings and bytestrings). This could be expanded in the future if necessary.
From 6ca1e3ca6546357eb33a18109e8b3256e30ddc67 Mon Sep 17 00:00:00 2001
From: Harry Altman
Date: Fri, 4 Aug 2023 13:29:38 -0400
Subject: [PATCH 2/3] Add note about padding types

---
 docs/source/format.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/format.md b/docs/source/format.md
index 2f0c3384..c2f0aa46 100644
--- a/docs/source/format.md
+++ b/docs/source/format.md
@@ -334,4 +334,4 @@ If the value of `sort` is `"return"`, then the dictionary has the following fiel
 The value of this field (when present) is a dictionary with (some of) the following fields:
 * `isMappingHash`: A boolean that identifies whether the opcode is computing a hash for a mapping.
 * `isMappingPreHash`: For mappings that use two hashes, this boolean can identify whether the opcode is computing the first of the two hashes. Possibly this field should be combined with a previous one into some sort of enum?
-* `mappingHashFormat`: An enumeration; specifies the format of what gets hashed for the mapping. Formats could include "prefix" (for Solidity), "postfix" (for Vyper value types), and "postfix-prehashed" (for Vyper strings and bytestrings). Possibly "prefix" could be split further into "prefix-padded" (for Solidity value types) and "prefix-unpadded" (for Solidity strings and bytestrings). This could be expanded in the future if necessary.
+* `mappingHashFormat`: An enumeration; specifies the format of what gets hashed for the mapping. Formats could include "prefix" (for Solidity), "postfix" (for Vyper value types), and "postfix-prehashed" (for Vyper strings and bytestrings). Possibly "prefix" could be split further into "prefix-padded" (for Solidity value types) and "prefix-unpadded" (for Solidity strings and bytestrings). This could be expanded in the future if necessary. (Also, potentially `"prefix-padded"`, if split out, could be broken down even further, by padding type -- zero padding (left) vs sign-padding vs zero-padding (right)...)
From af7c98f929c9929504e38594d03a8b5102a78c42 Mon Sep 17 00:00:00 2001
From: Harry Altman <35589221+haltman-at@users.noreply.github.com>
Date: Wed, 9 Aug 2023 20:15:50 -0400
Subject: [PATCH 3/3] Fix typo

Co-authored-by: Daniel
---
 docs/source/format.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/format.md b/docs/source/format.md
index c2f0aa46..993bb86f 100644
--- a/docs/source/format.md
+++ b/docs/source/format.md
@@ -244,7 +244,7 @@ that contributed to the generation of this opcode.
 * `ast`: A list of AST ids for the "closest" AST node that contributed to the generation of this opcode.
 * `stack` A layout of the stack as understood by the compiler, represented as a list.
 * `jumps`: If present, provides hints about the location being jumped to by a jumping command (JUMP or JUMPI)
-* `mappings`: If preent, contains information about how the opcode relates to mapping keys.
+* `mappings`: If present, contains information about how the opcode relates to mapping keys.
 
 #### Source Locations
 
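
Review note: to make the proposed `mappings` field more concrete, here is a minimal sketch of how a debugger might consume it. When it reaches a hashing opcode flagged with `isMappingHash`, it reads the bytes being hashed and splits them into key and slot according to `mappingHashFormat`. Several things here are assumptions rather than part of the proposal: the helper name `split_preimage`, the reading of "prefix" as "key bytes first, slot last" and of "postfix" as "slot first, key bytes after", and the slot occupying exactly one 32-byte word.

```python
from typing import Optional, Tuple

# Assumption: the mapping's storage slot occupies one 32-byte word in the preimage.
SLOT_SIZE = 32


def split_preimage(mappings: dict, preimage: bytes) -> Optional[Tuple[bytes, int, bool]]:
    """Split the bytes hashed by an opcode into (key_part, slot, key_is_prehashed).

    `mappings` is the per-opcode dictionary sketched in the patch.
    Returns None when the opcode is not flagged as computing a mapping hash.
    """
    if not mappings.get("isMappingHash", False):
        return None
    if len(preimage) < SLOT_SIZE:
        raise ValueError("preimage is shorter than one slot word")

    fmt = mappings.get("mappingHashFormat")
    if fmt in ("prefix", "prefix-padded", "prefix-unpadded"):
        # Assumed reading of "prefix": key bytes come first, the slot word last.
        key_part, slot_word = preimage[:-SLOT_SIZE], preimage[-SLOT_SIZE:]
        prehashed = False
    elif fmt in ("postfix", "postfix-prehashed"):
        # Assumed reading of "postfix": the slot word comes first, key bytes after it.
        slot_word, key_part = preimage[:SLOT_SIZE], preimage[SLOT_SIZE:]
        # For "postfix-prehashed", key_part is a hash of the key, not the key itself.
        prehashed = fmt == "postfix-prehashed"
    else:
        raise ValueError(f"unknown mappingHashFormat: {fmt!r}")

    return key_part, int.from_bytes(slot_word, "big"), prehashed
```

When `key_is_prehashed` is true, the returned bytes are only the hash of the key. Under this reading, a debugger would recover the original key from the earlier opcode flagged `isMappingPreHash`, which may be an argument for merging the two booleans into a single enum as the patch suggests.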