Use Into<Message> in signing api #755

Merged
merged 1 commit into from
Oct 23, 2024

Conversation

liamaharon
Contributor

Closes #700 (also see rust-bitcoin/rust-bitcoin#2821).

Unrelated question while I have the authors' attention: the schnorr APIs (e.g. https://github.com/liamaharon/rust-secp256k1/blob/9afbf5111113ce84ff6f3b52f37c60554af2c283/secp256k1-sys/src/lib.rs#L81-L82) accept a param named msg32, directly followed by msg_len. I find this confusing, since the name msg32 suggests the message must be 32 bytes. Should those instances be renamed to msg if the length is indeed variable?

@liamaharon liamaharon force-pushed the into-message branch 4 times, most recently from 87aee8c to 50df818 Compare October 19, 2024 19:13
@liamaharon
Contributor Author

Any tips for debugging WASM and ASAN?

@tcharding
Member

WASM fail is unrelated. I don't know what is causing the ASAN fail, from looking at the changeset I'd hazard a guess it is also unrelated.

tcharding
tcharding previously approved these changes Oct 21, 2024
Member

@tcharding tcharding left a comment

ACK 50df818

@apoelstra
Member

In 50df818:

I am seeing test failures with --release, i.e. when running cargo test --release:

test tests::test_low_r ... FAILED
test tests::test_low_s ... ok
test tests::test_noncedata ... ok
test tests::test_grind_r ... FAILED

failures:

---- tests::test_low_r stdout ----
thread 'tests::test_low_r' panicked at src/lib.rs:937:9:
assertion `left == right` failed
  left: 30440220047dd4d049db02b430d24c41c7925b2725bcd5a85393513bdec04b4dc363632b02201054d0180094122b380f4cfa391e6296244da773173e78fc745c1b9c79f7b713
 right: 304402207d1056c8864e4e942fe728d9764e1e5e09e92748ce65c41e999cdb276b2e3b6102207d3e3aff89a0cc0a84ec49983dd9ee975b8e66d64e8b9e3a24bd87aa7cbf2489
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_grind_r stdout ----
thread 'tests::test_grind_r' panicked at src/lib.rs:954:9:
assertion `left == right` failed
  left: 304302202ffc447100d518c8ba643d11f3e6a83a8640488e7d2537b1954b942408be6ea3021f26e1248dd1e52160c3a38af9769d91a1a806cab5f9d508c103464d3c02d6e1
 right: 3043021f74262540ad2375f9e2742660769a2f9f706c222a1c0ff6500768c7dcdb739e022010f3c87a33b1a6f9f0d969bc8e3d9e8d2b7dcc3feacad1cf1092fa9ea5edc4ad


failures:
    tests::test_grind_r
    tests::test_low_r

test result: FAILED. 46 passed; 2 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.02s

This is the first time I'm trying to test with --release on this crate, so my inclination is that it's not this PR's fault... but it appears that master runs clean and this does not. So I think it's a real bug.

But code-review ACK.

@liamaharon
Contributor Author

liamaharon commented Oct 21, 2024

Wow, good catch @apoelstra. Managed to fix it with this change: c0937e3

Not sure what was wrong with the previous code; FFI wasn't happy reusing the same msg pointer. Curious if you have any ideas?

Seems like --release testing is also needed in CI?

edit: this also fixed ASAN.

@apoelstra
Member

> Seems like --release testing is also needed in CI?

I think we just got lucky and what we really need is Miri testing -- this fix does not look like a normal release/debug issue but rather a symptom of UB in our code. Sadly we will probably never be able to Miri-test this crate because Miri can't handle FFI calls.

Maybe we can try running tests in valgrind. A long time ago that didn't work because of bugs in rustc but maybe those are resolved.

If none of that works I suppose we can add tests with --release ... but really the reason that this caught the bug is that a different combination of optimizer flags led to the bug being exposed.

BTW can you squash the two commits together? I would prefer not to have any commits in history with UB, to the extent possible :).

    sk: &SecretKey,
    check: impl Fn(&ffi::Signature) -> bool,
) -> Signature {
    let mut entropy_p: *const ffi::types::c_void = ptr::null();
    let mut counter: u32 = 0;
    let mut extra_entropy = [0u8; 32];
    let msg = msg.into();
Member

@tcharding tcharding Oct 21, 2024

I don't get why we have a local var here but on line 262 we just chain the calls? Did you mean to change both?

Member

We need the local variable to ensure that the object lives through the entire time that we are using the pointer.

If we're still chaining the calls on line 262 then that also needs to be changed. Thanks for beating me to the review!

Member

Thanks, I thought as much.

Out of interest, how can the value not exist the entire time we are using it when it's passed by value and so is part of the stack frame? Calling Into is just borrow checker stuff and as_c_ptr is type checker stuff, neither of which affects the actual data (I think). Then the value is used in a single function call (the ffi function call).

Member

@apoelstra apoelstra Oct 21, 2024

I'm not sure what you mean by Into being "just borrow checker stuff". Into::into takes an abstract message by value and returns a Message by value. Then as_c_ptr borrows this object and returns a raw pointer whose lifetime must not exceed the lifetime of the Message. The borrow checker is barely involved with any of this, and even if it was, it can only do sanity checks; it never affects the semantics of code.

But if we don't give the Message a variable binding, its lifetime will only consist of one line of code. If we give it a binding it'll live until the end of the function.
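The difference between the two lifetimes can be observed without touching FFI at all. The following is a minimal sketch, not code from this PR: `Msg`, `make_msg`, `record` and `take_log` are hypothetical names, and `Msg` only stands in for `Message` so that its drop can be logged. It shows that a value created in a chained call is dropped at the end of that statement, while a bound value is dropped at the end of its scope.

```rust
use std::cell::RefCell;

thread_local! {
    // Per-thread event log so the order of drops and uses can be inspected.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(event: &'static str) {
    LOG.with(|log| log.borrow_mut().push(event));
}

/// Returns the recorded events and leaves the log empty.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

/// Hypothetical stand-in for `Message`; it only exists to log its own drop.
struct Msg([u8; 32]);

impl Msg {
    fn as_c_ptr(&self) -> *const u8 {
        self.0.as_ptr()
    }
}

impl Drop for Msg {
    fn drop(&mut self) {
        record("drop");
    }
}

fn make_msg() -> Msg {
    Msg([0u8; 32])
}

/// Chained call: the temporary `Msg` is dropped at the end of the `let` statement.
fn chained() -> Vec<&'static str> {
    let _ = take_log(); // start from an empty log
    let _ptr = make_msg().as_c_ptr();
    // `_ptr` is already dangling here; it must never be dereferenced.
    record("use pointer");
    take_log()
}

/// Bound value: `msg` lives until the end of the inner block.
fn bound() -> Vec<&'static str> {
    let _ = take_log();
    {
        let msg = make_msg();
        let _ptr = msg.as_c_ptr();
        record("use pointer"); // `msg` is still alive, so `_ptr` is valid
    }
    take_log()
}
```

In this sketch `chained()` logs the drop before the use, and `bound()` logs the use before the drop. Passing the dangling pointer from the first pattern to C code would be undefined behaviour, which is why it may appear to work under one set of optimizer flags and fail under another.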

Member

@tcharding note that this is the same underlying issue as this lint: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#temporary-cstring-as-ptr

(one day we can write an attribute to add these lints for user types)
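For reference, the case that lint targets looks roughly like the sketch below. This is an illustration only: `c_strlen` is a hypothetical stand-in for a C function (it is implemented in Rust here so the example is self-contained), and `safe_len` is a made-up wrapper name.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that reads a NUL-terminated string.
///
/// Safety: `p` must point to a valid NUL-terminated string that stays alive
/// for the duration of the call.
unsafe fn c_strlen(p: *const c_char) -> usize {
    unsafe { CStr::from_ptr(p) }.to_bytes().len()
}

fn safe_len(s: &str) -> usize {
    // The lint fires on the chained form, where the `CString` is a temporary
    // that is freed at the end of the statement:
    //
    //     let p = CString::new(s).unwrap().as_ptr();
    //
    // Binding the `CString` first keeps its buffer alive while `p` is in use.
    let owned = CString::new(s).expect("input must not contain interior NUL bytes");
    let p = owned.as_ptr();
    unsafe { c_strlen(p) }
}
```

The fix has the same shape as the one applied to `Message` in this PR: give the owning value a binding whose scope covers every use of the raw pointer.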

Member

This is just musings and for my education, so please only respond if it amuses you to do so.

When sign_ecdsa_with_noncedata_pointer is compiled, is the parameter msg: impl Into<Message> just 32 bytes in the stack frame?

> I'm not sure what you mean by Into being "just borrow checker stuff".

I misspoke; FTR I don't know the exact correct terminology for all the passes of the compiler.

I read the link above but that is different in that a str has to have memory backing it but in our case I thought the memory backing it would be the 32 bytes in the stack frame that was passed in as msg (in function sign_ecdsa_with_noncedata_pointer) - so I can't understand why creating a local variable is fixing the problem.

Said another way, I get that at the end of msg.into().as_c_ptr() that there are no guarantees that the pointer is valid, but I don't get why we cannot tell it is valid because we know where the value is on the stack already because it was passed in with the function.

Member

(I think I can map C functions to opcodes but I do not know exactly how to map Rust functions to opcodes.)

Contributor Author

@liamaharon liamaharon Oct 22, 2024

> I don't get why we have a local var here but on line 262 we just chain the calls? Did you mean to change both?

@tcharding it's because here it's used in a loop. impl Into<Message> doesn't implement Copy, so if we .into it inside the loop we lose ownership and cannot proceed with more iterations.


By calling .into outside of the loop, we get the Message type which implements Copy so it can be used inside the loop as many times as needed.
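A minimal sketch of that pattern, using hypothetical stand-in types rather than the crate's real ones: `Message` here is a local `Copy` wrapper, `Digest` is an invented non-`Copy` input type, and `attempt` and `grind` are toy functions that only mimic the shape of a retry loop.

```rust
/// Hypothetical stand-in for the crate's `Message`: a `Copy` wrapper around 32 bytes.
#[derive(Clone, Copy)]
struct Message([u8; 32]);

/// Hypothetical non-`Copy` input type that converts into a `Message`.
struct Digest(Vec<u8>);

impl From<Digest> for Message {
    fn from(d: Digest) -> Self {
        // Copy up to 32 bytes; any remaining bytes stay zero.
        let mut out = [0u8; 32];
        let n = d.0.len().min(32);
        out[..n].copy_from_slice(&d.0[..n]);
        Message(out)
    }
}

/// Toy "signing attempt" that takes the message by value.
fn attempt(msg: Message, counter: u32) -> u32 {
    u32::from(msg.0[0]) + counter
}

/// Retries `attempt` with an increasing counter until `accept` returns true.
fn grind(msg: impl Into<Message>, accept: impl Fn(u32) -> bool) -> u32 {
    // Convert once, outside the loop. Calling `msg.into()` inside the loop
    // would move `msg` on the first iteration and fail to compile.
    let msg = msg.into();
    let mut counter = 0u32;
    loop {
        // `Message` is `Copy`, so passing it by value copies it each time.
        let candidate = attempt(msg, counter);
        if accept(candidate) {
            return candidate;
        }
        counter += 1;
    }
}
```

For example, `grind(Digest(vec![7]), |c| c % 5 == 0)` tries 7, 8, 9 and then accepts 10. Binding the converted value also gives it a scope that covers the whole loop, which matters once a raw pointer to it is handed to FFI.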


> We need the local variable to ensure that the object lives through the entire time that we are using the pointer.
>
> If we're still chaining the calls on line 262 then that also needs to be changed. Thanks for beating me to the review!

@apoelstra interesting. So once Rust passes .into().as_c_ptr() to the ffi, it marks the memory as unused and so that memory could be recycled before the C code finishes doing what it needs with that memory slice? I wonder if that is what was causing the UB in the release build.

I've updated all instances of chaining to have the .into() in a higher scope so the memory is held onto.

Member

> I read the link above but that is different in that a str has to have memory backing it but in our case I thought the memory backing it would be the 32 bytes in the stack frame

How could it be? The 32 bytes in the stack frame belong to an object of a completely different type (an opaque Into<Message> vs a Message). And we are telling the compiler that we don't need the Into<Message> anymore and it can reclaim the memory.
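To make the "completely different type" point concrete, here is a small sketch with invented types. `TaggedHash` is a hypothetical caller-side type and `Message` is a local stand-in; neither is taken from the crate. The argument passed in need not have the same size or layout as a `Message`, and the conversion produces a new value.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the crate's `Message`.
struct Message([u8; 32]);

/// Hypothetical caller-side type: a tag byte plus the 32 message bytes.
#[allow(dead_code)]
struct TaggedHash {
    tag: u8,
    bytes: [u8; 32],
}

impl From<TaggedHash> for Message {
    fn from(t: TaggedHash) -> Self {
        // Builds a new `Message`; the `TaggedHash` is consumed by the conversion.
        Message(t.bytes)
    }
}

/// Sizes of the input type and of `Message`, in that order.
fn layouts() -> (usize, usize) {
    (size_of::<TaggedHash>(), size_of::<Message>())
}

/// Accepts anything convertible into a `Message` and returns its bytes.
fn convert(input: impl Into<Message>) -> [u8; 32] {
    // `input` is moved into `into()`; only the resulting `Message` remains.
    let msg: Message = input.into();
    msg.0
}
```

Inside a generic function the compiler only knows that the argument implements `Into<Message>`, so a pointer obtained from the converted `Message` cannot be assumed to refer to the caller's original bytes.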

Member

I need to read a book on how the Rust compiler works. Thanks for your patience.

Member

@tcharding tcharding left a comment

ACK ec0a69f

Member

@apoelstra apoelstra left a comment

ACK ec0a69f; successfully ran local tests; I wonder if we should do the same thing with SecretKey

@apoelstra apoelstra merged commit ac7c74a into rust-bitcoin:master Oct 23, 2024
29 of 30 checks passed
Development

Successfully merging this pull request may close these issues.

Make the signing API use Into<Message>
4 participants