
Why is global-context-less-secure a feature? #713

Open
Kixunil opened this issue Jul 25, 2024 · 18 comments
Labels
1.0 Issues and PRs required or helping to stabilize the API

Comments

@Kixunil
Collaborator

Kixunil commented Jul 25, 2024

It's supposedly used to handle the situation where rand panics. However, IIUC this panic really comes from getrandom always returning a failure. That should never happen unless the OS is broken (in which case panicking might be appropriate) or someone used the custom feature of getrandom and implemented it incorrectly.

But even if it's justifiable to have this on some super exotic architectures (which ones?), it looks like it should be a cfg rather than a feature. Being a feature causes cargo test --all-features to not test rerandomization, which is weird. We could also use RngCore::try_fill_bytes and just ignore errors during rerandomization, so the cfg wouldn't be needed (unless people need it for performance).
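A sketch of the `try_fill_bytes` idea (the real code would call `RngCore::try_fill_bytes` from `rand_core` on the user's RNG; here the fallible entropy source is a std-only stub that always fails, so the "silently skip" path is the one exercised):

```rust
/// Stand-in for a fallible entropy source; real code would call
/// rand_core::RngCore::try_fill_bytes on the user's RNG. This stub
/// simulates a target where the entropy source is unavailable.
fn try_fill_bytes(_buf: &mut [u8]) -> Result<(), &'static str> {
    Err("entropy source unavailable")
}

/// Best-effort rerandomization: on error, keep the old context seed
/// instead of panicking, as suggested above.
fn rerandomize(ctx_seed: &mut [u8; 32]) {
    let mut fresh = [0u8; 32];
    if try_fill_bytes(&mut fresh).is_ok() {
        // Mix the fresh bytes in rather than overwrite, so a broken RNG
        // that returns zeros cannot erase existing state.
        for (s, f) in ctx_seed.iter_mut().zip(fresh.iter()) {
            *s ^= f;
        }
    }
    // On Err: silently skip this rerandomization.
}
```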

@Kixunil Kixunil added the 1.0 Issues and PRs required or helping to stabilize the API label Jul 25, 2024
@apoelstra
Member

Agreed, let's drop the feature. There are a few solutions:

  • Always use randomness and make people manually turn on the js feature of getrandom or else get panics (which in browser JS appear as the page just not loading)
  • Always use randomness and make us turn on the js feature of getrandom (but we want this to be the case only when targeting wasm, which I'm unsure if it's possible, and maybe users wouldn't want the feature for some reason?)
  • Use conditional compilation to refuse to compile without the feature, so the user has to enable it themselves.
  • Use try_fill_bytes and just silently not rerandomize without the feature. (Thanks for the suggestion -- I think this wasn't a function back in 2021, or I wasn't aware of it anyway).

I am strongly leaning toward the last solution. After the first rerandomization we don't actually need any randomness, because we can pilfer some from secret key data. So the loss of security (which, to reiterate from other issues about this, is purely defense-in-depth and has no practical value that we are aware of) would only affect users who were constantly restarting the software and then constantly doing exactly the same operations after restart.

@Kixunil
Collaborator Author

Kixunil commented Jul 25, 2024

  • manually turn on the js feature of getrandom or else get panics (which in browser JS appear as the page just not loading)

I happened to look at the code of getrandom and found that it actually raises a compile_error!, so I guess it's just some custom implementations returning Err.

  • make us turn on the js feature of getrandom

This goes directly against the recommendation of the getrandom docs, so I don't want to do it, precisely because someone might need the feature to be off. However, I was considering forwarding the feature IFF we want to stabilize before getrandom is stable. It'll take a long time before that, so I wouldn't jump to it now.

  • Use conditional compilation to refuse to compile without the feature

It's not possible to detect the state of a feature in a foreign crate.

After the first rerandomization we don't actually need any randomness because we can pilfer some from secret key data.

Oh, I didn't know that. But doesn't that mean we can always use secret key data? Why would it be a different scenario?

Anyway, given your argument with restarting, I think silently skipping is fine.

@apoelstra
Member

Oh, I didn't know that. But doesn't it mean we can always use secret key data? Why would it be different scenario?

The premise behind the randomization API is that you start from a fresh random state on startup, and from that point onward you can "reseed" the RNG using secret key data (since this data is conveniently available on exactly the set of operations that would necessitate rerandomizing).

If you want to be thorough, you should also rerandomize on forks.

If we exclusively use secret key data (and we should also use message hashes and anything else available), as I'm proposing, then we have a situation where the first signature always uses the same context state. For applications like HWWs, which are likely to be powered on before each signature and powered off after, that means we gain nothing from the randomization.

Even if we rerandomize before signing, this doesn't help us much if the rerandomization is always the same. The exact random state isn't that important to an attacker; what's important is whether the random state stays the same across very many signatures. (Upstream, we have been planning to change the signing code to rerandomize the context using only a single bit of the secret key, since one bit per signature is more than enough drift to thwart any timing attack, but is cheap enough that we could do it unconditionally... if only we could figure out the mutability/multithreading issues with mutating the context at all.)
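The single-bit idea above could be sketched like this (a hypothetical helper, not upstream's actual code; it just selects a different key bit per signature):

```rust
/// Pick one bit of the secret key per signature, indexed by a running
/// counter, to cheaply drift the context state. Per the discussion above,
/// one bit of drift per signature is more than enough to thwart timing
/// attacks while being essentially free.
fn drift_bit(secret_key: &[u8; 32], signature_counter: u64) -> u8 {
    // Cycle through the 256 key bits as the counter advances.
    let bit_index = (signature_counter % 256) as usize;
    (secret_key[bit_index / 8] >> (bit_index % 8)) & 1
}
```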

This is a somewhat practical concern. An attacker who had a working sidechannel attack and physical access to a HWW could power it on, sign a fixed message, power off, and repeat many thousands of times. So we really do want some initial randomness, if we want randomness at all.

@apoelstra
Member

For something like this I'm kinda tempted to try using the clock time or CPU jitter or something. We really only need a few bits of entropy to ruin the day of an attacker who can't learn those bits.
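A rough std-only sketch of scraping a few bits from the clock and CPU jitter (this is NOT a CSPRNG; the point, as above, is only a few bits an attacker can't learn):

```rust
use std::time::{Instant, SystemTime, UNIX_EPOCH};

/// Scrape a few bits of entropy from the wall clock plus timing jitter
/// from a short busy loop. Illustration of the idea only; a real
/// implementation would feed these sources into a proper hash.
fn clock_jitter_seed() -> u64 {
    let wall = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos() as u64)
        .unwrap_or(0);
    // Spin for ~1µs; the exact iteration count depends on scheduling
    // noise, which is the jitter we harvest.
    let start = Instant::now();
    let mut spin = 0u64;
    while start.elapsed().as_nanos() < 1_000 {
        spin = spin.wrapping_add(1);
    }
    // Cheap xor/rotate mix of the two sources.
    wall.rotate_left(17) ^ spin
}
```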

@Kixunil
Collaborator Author

Kixunil commented Jul 25, 2024

OK, so if I understand correctly, we actually don't need to depend on an external RNG at all if we can figure out how to seed the context on startup, e.g. by having a method on GlobalContext to randomize it using external entropy. So we wouldn't really even need rand to do that.

One thing I'm unsure about is how the global context really works. It's a static, so it's immutable, but if there's no rerandomization isn't it always the same, thus allowing an attacker to make us sign the same thing multiple times with the same context? Or does it internally use atomics to update some random state?

@apoelstra
Member

Right now the global context just never gets rerandomized. We have this enormous issue #388 about this.

@apoelstra
Member

And you are correct that we don't even need an external RNG -- what we need is some seed on startup that an attacker can't control.

@Kixunil
Collaborator Author

Kixunil commented Jul 26, 2024

OK, it looks like I misunderstood the whole issue when suggesting the solution of rerandomizing on each operation, and I probably overcomplicated it a lot. Do we have any API from upstream to use secret key data for rerandomization, or do we need to write it on our own?

@Kixunil
Collaborator Author

Kixunil commented Jul 26, 2024

I did a bunch of digging, and given that rerandomization is only required at the beginning and for each public key, and that the cost of rerandomization is larger than the cost of context creation, I think we can do the following:

  • If std is available have an external function documented to be called from main() to provide entropy explicitly.
  • If both std and rand are available do with the global context what we currently do. (The function still overrides it.)
  • If std is not available the global context must be created by the consumer once around the beginning of main. We enforce it by an atomic and panic if the consumer attempts an operation that requires the context if it was not set.
  • We don't rerandomize the global context again.
  • Per Protecting ecmult_gen against side-channel attacks: the big picture bitcoin-core/secp256k1#1141 when a public key is computed we create an ad-hoc context on the stack and rerandomize it using rng(seed, RERAND_ATOMIC_COUNTER.fetch_add(1, Ordering::Relaxed)) where rng is a keyed hash function using seed as the key. seed is generated together with context and stored in memory.

Of course we need the stack allocation helper from upstream to do this. It could also be helpful if upstream exposed symbols for the context size and alignment, which we could read in the build script to create a true static (without heap allocation) for all cases except dynamically linking a system library, which reasonably requires an allocation (and there we don't even need the alloc feature, since we can call malloc).
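The `rng(seed, RERAND_ATOMIC_COUNTER.fetch_add(1, Ordering::Relaxed))` step from the proposal might look roughly like this (the keyed "hash" here is a non-cryptographic FNV-style stand-in for illustration; a real implementation would use a proper keyed hash such as SipHash or HMAC-SHA256, and the names are hypothetical):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static RERAND_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Placeholder keyed "hash" with `seed` as the key: FNV-style mixing,
/// NOT cryptographic -- a real implementation would use SipHash/HMAC.
fn keyed_mix(seed: u64, counter: u64) -> u64 {
    let mut h = seed ^ 0xcbf29ce484222325;
    for byte in counter.to_le_bytes() {
        h = (h ^ byte as u64).wrapping_mul(0x100000001b3);
    }
    h
}

/// Derive a fresh per-operation rerandomization value: each call bumps
/// the shared counter (Relaxed is fine -- we only need uniqueness).
fn next_rerand_value(seed: u64) -> u64 {
    let n = RERAND_COUNTER.fetch_add(1, Ordering::Relaxed);
    keyed_mix(seed, n)
}
```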

@apoelstra
Member

That all sounds good to me except for the panic in the no-std-no-rand case. If it weren't for rerandomization then it'd be possible and easy to use a static global context. So it sucks to require initialization/construction.

I am tempted to instead go for a model where we

  • Get the "stack allocation helper" merged upstream (I have it implemented but I haven't added tests and PR'd it).
  • Have a static random seed consisting of an array of atomic u8s which we best-effort update on all operations using secret key data...
  • ...as well as the output of rand if it is available.
  • And by default, for all signing ops, we use the stack allocation helper rather than the global static context
  • And for verification/non-signing ops, we just use the global verification context provided by upstream (and this code can be identical with or without std).

And then finally we have an explicit method to re-seed with new randomness, which we strongly recommend no-std users call. But in practice I'm not optimistic that they'll actually do so, no matter what API we decide on. Even if we require them to call it, somehow, they may just call it with a fixed string or something, which isn't much good.
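The best-effort atomic seed from the bullets above could be sketched like this (std-only, assumed names; Relaxed ordering throughout, since torn or racing mixes are harmless for an entropy pool):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Global best-effort entropy pool: an array of atomic bytes that any
/// operation may fold secret-dependent data into, without locking.
static SEED: [AtomicU8; 32] = {
    const ZERO: AtomicU8 = AtomicU8::new(0); // const trick: AtomicU8 is not Copy
    [ZERO; 32]
};

/// Xor `data` into the pool, cycling it across all 32 slots.
fn mix_in(data: &[u8]) {
    for (slot, byte) in SEED.iter().zip(data.iter().cycle()) {
        slot.fetch_xor(*byte, Ordering::Relaxed);
    }
}

/// Snapshot the current pool contents (possibly mid-update; that's fine).
fn snapshot() -> [u8; 32] {
    let mut out = [0u8; 32];
    for (o, s) in out.iter_mut().zip(SEED.iter()) {
        *o = s.load(Ordering::Relaxed);
    }
    out
}
```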

@Kixunil
Collaborator Author

Kixunil commented Jul 27, 2024

That all sounds good to me except for the panic in the no-std-no-rand case.

I'm not happy about it either. I was once bitten by effectively the same thing, but that code was C++ and just called abort() without any message (:man_facepalming:). Still, with a good panic message it'll be pretty clear what's wrong, and I don't really see a significantly better option. We could do an extern hack like getrandom does, converting the panic into a (probably more confusing) link-time error. (But then we could just use getrandom, I guess? Unless we want to stabilize before it. Though we would have to depend on it unconditionally or introduce the same crap we had with core2.) We could also silently not randomize, but do we really want to? I was also considering abusing #[deprecated] to warn if rerandomization is disabled and any randomization-using method is called.

Have a static random seed consisting of an array of atomic u8s which we best-effort update on all operations using secret key data...

I hope we don't have to do that, since doing so many atomic operations each time we need the seed may affect performance. Two u64 loads maybe aren't that bad, but I don't think all targets support u64 atomics. Also, IIRC WASM doesn't support atomics at all unless you're compiling for the experimental target with multithreading. (OK, this is also a problem with my proposal. I need to figure out what to do about it.)

But we could also have two seeds and an atomic with four states: Seed1Active, UpdatingSeed2, Seed2Active, UpdatingSeed1. The Updating* states are basically a lock on the respective seed, with no waiting: if a thread has some entropy and fails to update the seed, it's not a big deal; some other thread has entropy and is already doing it.
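That four-state scheme could be sketched as follows (assumed layout; a thread that loses the compare_exchange race simply drops its entropy, as described):

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

const SEED1_ACTIVE: usize = 0;
const UPDATING_SEED2: usize = 1;
const SEED2_ACTIVE: usize = 2;
const UPDATING_SEED1: usize = 3;

static STATE: AtomicUsize = AtomicUsize::new(SEED1_ACTIVE);
static SEEDS: [AtomicU64; 2] = [AtomicU64::new(0), AtomicU64::new(0)];

/// Read whichever seed is currently active (while seed 1 is being
/// updated, seed 2 is still the valid one to read, and vice versa).
fn active_seed() -> u64 {
    match STATE.load(Ordering::Acquire) {
        SEED2_ACTIVE | UPDATING_SEED1 => SEEDS[1].load(Ordering::Relaxed),
        _ => SEEDS[0].load(Ordering::Relaxed),
    }
}

/// Try to fold `entropy` into the inactive seed. Returns false if another
/// thread already holds the "lock"; losing entropy now and then is fine.
fn try_update(entropy: u64) -> bool {
    let (from, updating, to, idx) = match STATE.load(Ordering::Acquire) {
        SEED1_ACTIVE => (SEED1_ACTIVE, UPDATING_SEED2, SEED2_ACTIVE, 1),
        SEED2_ACTIVE => (SEED2_ACTIVE, UPDATING_SEED1, SEED1_ACTIVE, 0),
        _ => return false, // someone else is already updating
    };
    if STATE
        .compare_exchange(from, updating, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        return false; // lost the race; drop our entropy
    }
    let old = SEEDS[idx].load(Ordering::Relaxed);
    SEEDS[idx].store(old ^ entropy, Ordering::Relaxed);
    STATE.store(to, Ordering::Release); // flip which seed is active
    true
}
```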

  • as well as the output of rand if it is available.

If we have a global rand then we also have std, in which case I'd exclusively use that, since it's simple and performant.

  • by default, for all signing ops, we use the stack allocation helper

Isn't it super inefficient without providing much additional benefit? IIRC 2x slowdown.

  • And for verification/non-signing ops, just we use the global verification context provided by upstream (and this code can be identical with or without std).

I consider this painfully obvious and implied.

no matter what API we decide on, that they'll actually do so. Even if we require them to call it, somehow, they may just call it with a fixed string or something, which isn't much good

There's a big difference between forgetting or not noticing that there's such a thing and intentionally bypassing a security feature. If someone wants to do that, we might as well keep the less-secure cfg.

@apoelstra
Member

apoelstra commented Jul 27, 2024

I hope we don't have to do that, since doing so many atomic operations each time we need the seed may affect performance. Two u64 loads maybe aren't that bad, but I don't think all targets support u64 atomics. Also, IIRC WASM doesn't support atomics at all unless you're compiling for the experimental target with multithreading. (OK, this is also a problem with my proposal. I need to figure out what to do about it.)

I'm pretty sure atomics are trivial in WASM: it has only one thread, so atomics can be implemented identically to non-atomic types. I don't know why they wouldn't be supported.

But we could also have two seeds and atomic with four states: Seed1Active, UpdatingSeed2, Seed2Active, UpdatingSeed1 the Updating* ones are basically a lock for the respective seed with no waiting - if a thread has some entropy and fails to update the seed it's not a big deal - some other thread has entropy and is already doing it.

Sure, we could try to do some cross-thread coordination this way to be a bit more efficient.

  • by default, for all signing ops, we use the stack allocation helper

Isn't it super inefficient without providing much additional benefit? IIRC 2x slowdown.

(a) The performance should be identical between rerandomizing an existing context and randomizing a fresh one, and (b) even if it weren't, who cares? If users are randomizing at all, they are already willing to take a perf hit for defense-in-depth.

We should provide non-randomizing versions of the signing methods for users who need the performance.

no matter what API we decide on, that they'll actually do so. Even if we require them to call it, somehow, they may just call it with a fixed string or something, which isn't much good

There's a big difference between forgetting or not noticing that there's such a thing and intentionally bypassing a security feature. If someone wants to do that we might as well keep less-secure cfg.

Sure. But then let's turn it on by default if the alternative is a huge ergonomic hit for users in exchange for an intangible, hypothetical benefit which can only be realized by non-std, non-rand users who nonetheless have access to an entropy source at application startup.

@Kixunil
Collaborator Author

Kixunil commented Jul 28, 2024

I'm pretty sure atomics are trivial in WASM. It has only one thread so atomics can be implemented identically to non-atomic types. I don't know why this wouldn't be supported.

Of course, but then you have to deal with the variant of WASM that actually needs real atomics, whose spec is not finalized. (And if you don't deal with it and someone compiles your code using it, you get UB.) Anyway, I still haven't had time to check what the exact issue was.

even if it wasn't, who cares? Already if users are randomizing then they are willing to take a perf hit for defense-in-depth.

In bitcoin-core/secp256k1#1141 @real-or-random argued that rerandomization between signing doesn't help so it's just wasted performance. Do you think his analysis is incorrect?

which can only be realized by non-std-non-rand-users who nonetheless have access to an entropy source at application startup

That's literally all HWWs. Who is building a no_std/no-rand thing that is not a HWW? How many people are doing it?

@apoelstra
Member

apoelstra commented Jul 28, 2024

Of course, but then you have to deal with the variant of WASM that actually needs real atomics, whose spec is not finalized. (And if you don't deal with it and someone compiles your code using it, you get UB.) Anyway, I still haven't had time to check what the exact issue was.

This seems like a general Rust problem. We have AtomicU8 etc in core. What happens if you use those and then somebody decides to compile for multithreaded-wasm? I don't think we should avoid atomics just because of the existence of this architecture. (Especially in the model I'm currently thinking of where our only atomic ops are on a random seed, which can use completely unordered atomic operations, which surely are/can be implemented in a non-UB way?)

In bitcoin-core/secp256k1#1141 @real-or-random argued that rerandomization between signing doesn't help so it's just wasted performance. Do you think his analysis is incorrect?

His argument is that synthetic nonces "do the same thing and are simpler". I believe he is correct. Though API-wise I think it doesn't change things much for us: synthetic nonces involve us taking some extra entropy and feeding it into the signing function, versus context rerandomization, which would involve us taking extra entropy and feeding it into the context-rerandomize function.

I guess, the benefit of using synthetic nonces is that there's zero perf hit relative to not using them, so we don't need to provide non-rerandomizing variants.

But I would still like to update our global/thread-local random state on signing operations, since these are operations in which we have access to secret random data so we ought to take advantage of that.

That's literally all HWWs. Who is building a no_std/no-rand thing that is not a HWW? How many people are doing it?

Do we actually need std here or are we only arguing about rand? Is the issue that we need std to get thread_rng? Can we directly use getrandom?

I think many HWWs actually do have std but of those that don't, I'd be surprised to learn that they can access randomness but only through a crappy "manual" API.

@apoelstra
Member

Let me make another proposal: in the non-std, non-rand case, we make it impossible at compile time to use the keygen, ECDH, and signing APIs without initializing the internal RNG, by gating the normal methods on the required features and making HWW users use different methods (which would take some sort of "rng context" object by mutable ref and update it, evading all concerns with globals etc.).

@Kixunil
Collaborator Author

Kixunil commented Jul 28, 2024

By gating the normal methods on the required features, and making HWW users use different methods (which would take some sort of "rng context" object by mutable ref, and update that, evading all concerns with globals etc).

That sounds like a cool way to move forward! We can always add the API back if we figure out a solution.

@Kixunil
Collaborator Author

Kixunil commented Jul 28, 2024

His argument is that synthetic nonces "do the same thing and are simpler". I believe he is correct.

Oh, true, I misunderstood it previously.

FYI I have this weird use case for deterministic nonces: generate them using a different technique based on whether the signed transaction is some kind of advanced contract (LN or PayJoin) to prevent accidentally RBF-ing from a restored wallet.

@apoelstra
Member

FYI I have this weird use case for deterministic nonces

I think you can do this while still getting "synthetic nonces" which are sufficient to rerandomize the computation. Basically, you'd generate a normal RFC6979 or whatever deterministic nonce.

Then you would xor in a 256-bit value in which half the bits (say the even-indexed ones) are uniformly random and the other half encode a 128-bit message of your choice.
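A sketch of that interleaving (the even/odd bit layout is an arbitrary choice for illustration, and the helper names are hypothetical):

```rust
/// Build a 256-bit mask whose even-indexed bits come from `random`
/// (128 bits of fresh entropy) and whose odd-indexed bits encode a
/// 128-bit `message`. Xoring this into a deterministic (e.g. RFC6979)
/// nonce rerandomizes it while still smuggling the message through.
fn interleave_mask(random: u128, message: u128) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..128 {
        let r = ((random >> i) & 1) as u8;
        let m = ((message >> i) & 1) as u8;
        let even = 2 * i;     // destination bit for the random bit
        let odd = 2 * i + 1;  // destination bit for the message bit
        out[even / 8] |= r << (even % 8);
        out[odd / 8] |= m << (odd % 8);
    }
    out
}

/// Xor the mask into a 32-byte deterministic nonce.
fn xor_into_nonce(nonce: &mut [u8; 32], mask: &[u8; 32]) {
    for (n, m) in nonce.iter_mut().zip(mask) {
        *n ^= *m;
    }
}
```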
