Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding for "shortstring" typed data (SNIP-12) #1039

Open
penovicp opened this issue Mar 25, 2024 · 4 comments
Open

Encoding for "shortstring" typed data (SNIP-12) #1039

penovicp opened this issue Mar 25, 2024 · 4 comments

Comments

@penovicp
Copy link
Collaborator

I want to draw attention to the approach taken with the shortstring encoding. It was implemented by reusing the existing felt encoding.

As a basic example, a provided value "hello" will be encoded as 0x68656c6c6f, this I believe to not be contentious. The part where I believe there might be issues with is the encoding of numerical strings, the existing felt encoding treats both numbers and numerical strings as the same value and such behaviour might be undesirable/unexpected for the shortstring type. As an example "2" will be encoded as 0x2 while someone coming from a Cairo background will probably assume it to be encoded as 0x32.

Added context: This issue is a copy of the issue noted in this PR comment.

@xJonathanLEI
Copy link
Contributor

Hi, I'm trying to implement SNIP-12 in starknet-rs. Given the lack of test vectors from the spec I decided to use starknet.js for generating test data, since it's pretty much the de facto impl in the wild. This is when I noticed a discrepancy between the spec and the starknet.js impl, which is exactly what this issue is about.

Based on the current impl:

case 'felt':
case 'shortstring': {
// TODO: should 'shortstring' diverge into directly using encodeShortString()?
if (revision === Revision.ACTIVE) {
assertRange(getHex(data as string), type, RANGE_FELT);
} // else fall through to default

a type of shortstring gets properly encoded as shortstring iif it's a non-numerical string. This violates the spec and can be highly problematic especially in the context of a contract where it can be very expensive to tell if a certain string is numerical. I would definitely call this a bug.

However, fixing this bug has some pretty devastating implications.

First of all, the bug conveniently accommodates an exception outlined in the spec:

Revision 1: Will be the initial version of the specification. Note that for this revision the value in this field should be the integer 1 and not the shortstring "1" despite being defined as shortstring. This exception is made to support an inconsistency in the Braavos wallet implementation.

with this bug the revision field correctly uses 0x1 for revision. Naively fixing this bug would cause this field to use 0x31 instead, which would violate the exception noted from the quote above.

This is a minor issue though. Some refactoring can be done to actually implement the exception specifically for that domain field.

The bigger issue is just the fact that fixing this bug itself is a breaking change. Many applications might already depend on the current behaviour. Making matters worse, this bug affects the domain hash, which is usually hard-coded in contracts for verification purposes. Upgrading them would either need to be a huge coordinated effort or simply impossible.

So I guess whichever direction we take, at least one of these need to be done:

  1. Obviously just fix this and ask apps/contracts to upgrade if they're affected (they most probably are). Here I'm assuming that wallets & apps use starknet.js for SNIP-12.
  2. Update SNIP-12 to officially specify that shortstring only uses short string encoding if the string in question is not ^\d+$ (good luck implementing this in a contract tho).

@xJonathanLEI
Copy link
Contributor

Hey @thiagodeev I saw you committed to SNIP-12 so pinging you here. Any thoughts on this? Thanks!

@thiagodeev
Copy link

Hi @xJonathanLEI.
I believe the confusion comes from the SNIP-12 statement that the shortstring type is encoded "... the way cairo transforms the value into a felt" (ref); it doesn't specify if it should be the ASCII representation. I don't know how a felt is created in Cairo under the root, I just used the methods available to implement the feature in Starknet.go and they matched the results from starknet.js.
The SNIP is unclear in various aspects.

I recommend you to keep implementing the feature by comparing it with starknet.js, because as you said, it's the most widely used. In the meantime, you can keep analyzing it until finding an answer. Please let me know the result of your research.

Thanks

@penovicp
Copy link
Collaborator Author

Considering this issue and the faulty enum encoding noted within #1292, the option of introducing a new revision to the SNIP-12 spec was brought up as a possible solution.

The avenue would be to modify the revision 1 spec in line with the current implementation, and the corrected implementation to be represented as revision 2.

Tagging various people that have been involved with the topic, feel free to bring anyone else into the discussion as you deem appropriate: @xJonathanLEI @thiagodeev @sgc-code @gaetbout @DelevoXDG @yoga-braavos @PhilippeR26 @amanusk @tabaktoni

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants