-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: SMT Proofs (Inclusion and Exclusion) #648
Conversation
fuel-merkle/src/tests/sparse.rs
Outdated
#[test] | ||
fn generate_proof_and_verify_with_valid_key_value_returns_true((key_values, tree) in random_tree()) { | ||
let mut rng = StdRng::seed_from_u64(0xBAADF00D); | ||
if let Some((key, value)) = key_values.choose(&mut rng).cloned() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will always choose the same index. You could provide the random index in the test declaration.
The simplest way to do that would probably be just add a arb_num: usize
and then in the code: let index = arb_num % key_values.len()
, or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fuel-merkle/src/sparse/proof.rs
Outdated
leaf, | ||
} = self; | ||
let key = key.into(); | ||
let mut current = *leaf.hash(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost fixed=D The problem is that getter leaf_key
has an internal assert. So if someone provides a node,
it will fail.
Added a test case with a panic=)
I think if the node is not a placeholder or a leaf, we need to return false. Overwise the proof provider may try to trick us=)
fuel-merkle/src/sparse/proof.rs
Outdated
pub struct ExclusionProof { | ||
pub root: Bytes32, | ||
pub proof_set: ProofSet, | ||
pub(crate) leaf: Node, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be public and Node
type as well. Otherwise, it is not possible to transfer this object between applications. We need to also implement serialization and deserialization somehow=)
Also, it would be nice if we implement serde
on top of that
Yep, that's a good point. The I moved the test case you wrote into the
Maybe it makes sense to store a subset of node data in the proof, rather than the whole node? E.g., just This way, we:
edit:
I think this you are suggesting something similar to what I am describing: a "subset type" of Node that can be used for just the |
Re: serde for proofs: #694 |
This PR is ready for the next round of reviews. I'm leaving |
fuel-merkle/src/sparse/node.rs
Outdated
pub fn leaf_key(&self) -> &Bytes32 { | ||
assert!(self.is_leaf()); | ||
debug_assert!(self.is_leaf()); | ||
self.bytes_lo() | ||
} | ||
|
||
pub fn leaf_data(&self) -> &Bytes32 { | ||
assert!(self.is_leaf()); | ||
debug_assert!(self.is_leaf()); | ||
self.bytes_hi() | ||
} | ||
|
||
pub fn left_child_key(&self) -> &Bytes32 { | ||
assert!(self.is_node()); | ||
debug_assert!(self.is_node()); | ||
self.bytes_lo() | ||
} | ||
|
||
pub fn right_child_key(&self) -> &Bytes32 { | ||
assert!(self.is_node()); | ||
debug_assert!(self.is_node()); | ||
self.bytes_hi() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think public getters should return Option
instead.
For internal usage, you can create a private version of this function. Maybe even it is better to mark them unsafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node
is public only to the fuel-merkle
crate, i.e. pub(crate) enum Node
. The only place Node
should ever be used is in the context of the MerkleTree
, where we already know what kind of node we are using. There is no reason for it to be used in a context where we do not know the kind of node, such as user space or another crate, which is why it is marked pub(crate)
. Therefore, I do not agree that these should return Option
- we always know what data we are accessing, i.e, we could always unwrap
the Option
; we would never need to branch based on the variant.
The debug_assert
s are there for fuel-merkle
developers who might modify the update/delete algorithms, not for consumers of the fuel-merkle
crate. But if you prefer, I can mark these functions as pub(crate)
as well to make that explicit.
I also disagree that these functions are unsafe, because it is always "safe" to call them, even in the wrong context. You cannot create an unsafe/incoherent state using these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are used only within the crate, let's mark them pub(crate)
to avoid confusion in the future=)
They are unsafe from a business perspective. I only brought attention to it because this commit shows that it is not always true. And from the API perspective, it would be right to have a separate bunch of functions only for internal MerkleTree
work. And pub
or pub(crate)
functions that may fail outside of the MerkleTree
implementation.
The unsafe
for internal functions will show that it is dangerous to use. But the comment may describe why it is safe. Or we can always return an option and write expect
and the reason why it is expected to be valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are used only within the crate, let's mark them pub(crate) to avoid confusion in the future=)
We can also use pub(in crate::sparse)
to restrict this scope further. We could even rearrange the file structure to put Node
under the merkle_tree
module and do pub(in crate::sparse::merkle_tree)
or really just pub(super)
to make this very explicit. Any part of the interface marked pub
is therefore also pub(in crate::sparse::merkle_tree)
because of Rust's visibility rules.
They are unsafe from a business perspective. I only brought attention to it because this commit shows that it is not always true. And from the API perspective, it would be right to have a separate bunch of functions only for internal MerkleTree work. And pub or pub(crate) functions that may fail outside of the MerkleTree implementation.
I agree that we should differentiate between external and internal APIs, and that they can have different safety guarantees. At the moment, there is really only an internal API for Node
. Putting Node
in Proof
violated that agreement by putting Node
in user-space and mixing business logic with internal logic. That was bad! Node
should only be concerned with internal logic, never business logic. Enforcing a more narrow visibility on Node
will help reduce errors like that in the future.
To summarize: The interface was unsafe from a business perspective, but it was wrong to use it in business logic. I think isolating Node
to internal logic is the first fix. If you still think unsafe
should be used in this context, then I will add unsafe
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look at #696. I think this is a bit nicer: Here, Node
is completely restricted to the merkle_tree
module using pub(super)
, making it very explicit that the only consumer is merkle_tree
, and that there are no consumers external to merkle_tree
. This means that the API is strictly internal, and we can always analyze it from an internal perspective.
// divergence equates to this hash. | ||
// | ||
let leaf = actual_leaf.clone().into(); | ||
let exclusion_proof = ExclusionProof { proof_set, leaf }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your idea of returning the key and value in the case of the leaf. You can return an enum, which is either a placeholder or (key, value). And in the proof function, we can always create a leaf node.
fuel-merkle/src/sparse/proof.rs
Outdated
} | ||
|
||
#[derive(Clone, Eq, PartialEq)] | ||
pub struct ExclusionLeaf { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be an enum
fuel-merkle/src/sparse/proof.rs
Outdated
} | ||
|
||
fn is_placeholder(&self) -> bool { | ||
self.leaf_value() == zero_sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed it before, user may use any value, even zero sum. We need to use a separate variant for the case of the placeholder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but in this context, we are looking at the "exclusion leaf" (the nearest included leaf to the excluded key). The value
of the exclusion leaf is either the hash of the user data for this leaf (not the user data itself), or placeholder hash. This is how this version allows us to use the zero sum for user data.
But, either way, we can use an enum here to simplify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is true that, at this point, we have a hash of the user's value. But if the hash is zero, we will treat it as a placeholder, while it is not true. Enum fixes that case.
fuel-merkle/src/sparse/proof.rs
Outdated
if !leaf.is_placeholder() && *leaf.leaf_key() == key.as_ref() { | ||
return false; | ||
} | ||
let mut current = leaf.hash(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to process enum here. If the leaf
is a (key, value)
, then we want to use Node::create_leaf(key, value)
. Otherwise we want to use Node::create_placeholder()
.
After you can call node.hash()
.
* Refactor * Refactor * Minor * Fix import * Revert file move * Fixes + comments * Use core deref instead of std * Group public and private functions
fuel-merkle/src/sparse/proof.rs
Outdated
@@ -58,9 +58,15 @@ pub struct InclusionProof { | |||
impl InclusionProof { | |||
pub fn verify(&self, root: &Bytes32, key: &MerkleTreeKey, value: &[u8]) -> bool { | |||
let Self { proof_set } = self; | |||
|
|||
if proof_set.len() > u32::MAX as usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! We can actually make this
if proof_set.len() > 256 {
since a valid proof set cannot have more than log(n) side nodes.
Related issues:
This PR:
Given an SMT, we can generate a proof for any key within the 32 byte key space. If the key is included in the tree, i.e., has been used in an update operation and not deleted, then we generate an inclusion proof. If the key has not been included in the tree, we generate an exclusion proof. Inclusion proofs verify that the given key-value is, in fact, in the tree. If the value undergoing verification does correspond to the given key, verification fails. Exclusion proofs verify that the key is a placeholder. If the given key is not a placeholder in the tree, verification fails.
Example
Consider the following example:
We have a key-value data set, upon which we are building a sparse Merkle tree.
The data set is structured as such:
The data set contains 3 key values. Because the key space supports up to 8 keys (000 - 111), we know that there are 5 unused slots in the key space.
For each unused slot, we know that the value is unset. Therefore, the complete key-value data set looks like this:
The corresponding sparse Merkle tree is constructed on the key-value data set:
Example Use Case
We can make statements about the data set. For example:
We can categorize these statements into one of two categories:
Assertion 1 is an assertion of knowledge: I assert that key K has value V. 1. For assertion 2, an equivalent assertion is: I assert that key K is not in the table.
For these assertions, we can generate cryptographic proofs. Respectively, these are proofs of inclusion and proofs of exclusion. Verification is the process of confirming a given statement using the proof. Verifying a statement and proof returns a boolean (true or false) output that can confirm the corresponding statement. If verification of the statement returns true, the statement has been proven. If verification of the statement returns false, the statement has not been proven.
Verification failure does not imply the opposite statement. For example, we cannot assert that a key is in the table without providing a specific value, i.e. "key K is included in the table (with some unknown value V)" using a failed exclusion assertion. If verification of the statement "key K has a zero value" returns false, it is possible that K has a non-zero value and is included in the tree; however, a failed proof of exclusion does not necessarily imply inclusion. This is because the proof provided to the verification can be somehow invalid or maliciously altered.
More concretely:
where:
Proofs that fail to validate do not imply the opposite statement.