-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for TAP 15 #1948
Add support for TAP 15 #1948
Conversation
This pr is a draft because of warnings given by I am not sure what is the problem for python3.7 https://github.com/theupdateframework/python-tuf/runs/5888251078?check_suite_focus=true |
Pull Request Test Coverage Report for Build 2271063967
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work.
- I left a complaint about using dict to "hide" API, that one I fee strongly about, let's figure out a solution for it.
- changing API in incompatible ways (like arg order in
DelegatedRole.__init__()
) may be inevitable, but let's at least try to avoid it - other comments are not strictly change requests.
Is the current implementation using None as the key in Delegations.roles
dictionary? I'm quite surprised mypy does not complain about it in Delegations.from_dict() ...
tuf/api/metadata.py
Outdated
paths: Optional[List[str]] = None, | ||
path_hash_prefixes: Optional[List[str]] = None, | ||
succinct_hash_info: Optional[Dict[str, Any]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason for using a different name than the TAP does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it boils down to that it's just shorter.
succinct_hash_delegations
is 25 characters and because of our characters per line restriction, I had to often split lines.
succinct_hash_info
is 18 characters which save a good amount of characters while still hinting at the purpose of this argument.
I have made multiple changes:
@jku it's ready for another review. |
1bf36b5
to
fac307e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think the direction in this version can work. It's a new class just to hold two variables: this feels like overkill at first but there's some specific problems (the bin math plus the handling unrecognized fields in the new dict) that we can contain within that class so I think it works out.
tuf/api/metadata.py
Outdated
@@ -1384,6 +1465,58 @@ def _is_target_in_pathpattern(targetpath: str, pathpattern: str) -> bool: | |||
|
|||
return True | |||
|
|||
def _calculate_delegation(self, hash_bits_representation: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method now seems like it should be part of the new class (HashDelegations.find_delegation() maybe)?
I also think the argument to the function should be targetpath
: this way the arguments and output make sense to a reader: you feed in the targetpath and get the real delegated role name as output -- The fact that this specific implementation uses a string of ones and zeros as an intermediate value is not important and you only need to move a few lines of code to get this better abstraction.
Then DelegatedRole.find_delegation()
ends up being quite clean, something like
if self.succinct_hash_info:
return self.succinct_hash_info.find_delegation(target_filepath)
elif self.is_delegated_path(target_filepath):
return self.name
return None
Opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this a little more complex: we don't have an API for someone who
- uses non-succinct hash delegations
- wants to find out which bin might provide (or shoudl provide) a targetpath
This API clearly needs to be above the DelegatedRole level -- at Delegations maybe, but internally could use some of the same code
This question is not really TAP 15 related... but makes sense to solve (or at least discuss) at same time (and is related to #1952 and warehouse work)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I don't pass targetpath
as an argument is the testing.
I want to test the actual bin calculation math and if I pass the targetpath
this means that the hash calculation has to be moved inside _calculate_delegation
and I won't be able to pass the binary representation in a string as do now.
That way the test inside test_api.py
won't be clear for the reader if we do that.
Given that this is an internal API I think it's allowed to require a little more specific argument for easier testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think I'll leave this to you and @lukpueh as I think you've been talking about this part of the code.
Personally I'm not fond of making production code harder to read based on unit testing needs, or the basic idea of building a string of ones and zeros just to turn it into an int later ... but I can live with those so I'll leave these for you
tuf/api/metadata.py
Outdated
succinct_hash_info: Optional[Dict[str, Any]] = None, | ||
unrecognized_fields: Optional[Mapping[str, Any]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strictly speaking this is still breaking API.
DelegatedRole("a", [], 1, False, [], None, {})
used to be valid code. Now it will fail I assume.
New arguments should generally speaking be added in the end of the list... I'm not saying we can't break API here: it's not a major breakage by any means but maybe I'd still vote for adding the new arg in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add it to the end as in my mind unrecognized_fields
are at the end of each class, but maybe that's not so important.
@jku I updated the pr once again making the following changes (not in order):
|
Made an update to one of the commits to remove the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- SuccinctHashDelegations works better as a name I think: it's long but not used v commonly.
- moving
_calculate_delegation()
without modifying it otherwise probably does not make sense: having to disable pylints is a good hint that something is not right. my opinion is still that it would make sense for SuccinctHashDelegations to expose an actual delegation lookup method that accepts target path as argument but I'll just leave that as opinion - I think it mostly looks good: our naming is now not perfect as DelegatedRole can now represent multiple roles and as result we need
DelegatedRole.find_delegation()
which really does not make sense as a name ... but it seems best we can do without breaking API: I'm fine with that solution
@@ -1258,17 +1259,99 @@ def to_dict(self) -> Dict[str, Any]: | |||
return snapshot_dict | |||
|
|||
|
|||
class SuccinctHashDelegations: | |||
"""A container with information about a succinct hash delegation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this still needs an explanation: "succinct hash delegation" is probably word salad for anyone except about 20 people on the planet.
It does not need to explain every detail but the core idea: that it desribes delegations of all possible target paths to variable number of bins, the names of which are generated based on the two fields.
I addressed all of the @jku comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not looked at all of it but the naming scheme seems important enough to raise the issue now (see code comments)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work! I love how all of this can be implemented in the metadata classes and does not need any changes in ngclient
.
I mostly reviewed the SuccinctHashDelegations
class in this round and left a couple of minor comments, let me know what you think.
Regarding the changes in Delegations
and DelegatedRole
, I lean towards revisiting the TAP 15 metadata definition. See my suggestion for an alternative format that might make the implementation a little less "hackish" in: #1943 (comment)
I haven't looked closely at the tests yet, but will follow up shortly. Do you plan to add a client test that validates a target delegated to a hash bin?
|
||
Args: | ||
hash_prefix_len: Number of bits between 1 and 32 used to separate the | ||
different bins. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a better explanation, either here or in the function description above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better explanation about what is a hash deletion or specifically about succinct hash bin delegation?
Can you give some reference points to what you mean?
if hash_prefix_len <= 0 or hash_prefix_len > 32: | ||
raise ValueError("hash_prefix_len must be between 1 and 32") | ||
if not isinstance(bin_name_prefix, str): | ||
raise ValueError("bin_name_prefix must be a string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask why you don't check if hash_prefix_len
is an int, but do check if bin_name_prefix
is a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If hash_prefix_len
is not a number a comparison like the one done on line 1283 will throw a TypeError
, for example if hash_prefix_len
is a string this comparison will throw TypeError: '<' not supported between instances of 'str' and 'int'
I am not convinced there is a point in checking that hash_prefix_len
is an int
, not a float
.
I have updated my pr and addressed all comments made by @jku and @lukpueh. |
Start supporting succinct_hash_delegations attribute inside DelegatedRole. This change is part of TAP 15 implementation. Unfortunatly, this change is not compatible with all previos python-tuf versions as the "name" attribute is described as optional in TAP 15, meaning it has to be replaced to become the fourth argument instead of being the first when creating a DelegatedRole instance. Additionally, serialization tests covering the new use cases as TAP 15 states that only one of "paths", "path_hash_prefixes" and "succinct_hash_delegations" is allowed in a DelegatedRole instance. Signed-off-by: Martin Vrachev <[email protected]>
Inside test_updater_delegation_graphs.py we define the TestDelegation class. There the "paths" attribute is defined as a "List[str]" and it has a default value of a "[*]". This is often used so that all targets are considered as delegated by the delegated role that will consequently be created with a paths attribute with a value of "[*]". Still, there are a couple of test cases where we want to give a "None" value to the "paths" attribute and mypy complains that we can't assign a "None" value when the expected type is a list of strings. That's why we have to make "paths" optional. Signed-off-by: Martin Vrachev <[email protected]>
As part of the implementation of TAP 15 I had to make sure that ngclient supports target metadata files using succinct hash delegations. Some things to keep in mind while reviewing: - first, it should be noted that when succinct hash info is added inside a particular delegated role this means that every possible target has a bin assigned to it - the "name" attribute for DelegatedRole objects has been changed to optional inside TAP 15, because when succinct hash delegations is used "name" becomes redundant - I have added an additional helper function `DelegatedRole._calculate_rolename" for easier testing - I intentially added more tests inside the "get_rolename_dataset" to showcase how succinct hash delegations is using binary prefix instead of hash prefix. Signed-off-by: Martin Vrachev <[email protected]>
Add SuccinctHashDelegations class containing the information from the succinct_hash_delegation dict described in TAP 15. This allows for easy mypy checks on the types, easy enforcement on TAP 15 restrictions (as for example that "delegation_hash_prefix_len" must be between 1 and 32) and support for unrecognized fields inside succinct_hash_delegation without much of a hassle. Signed-off-by: Martin Vrachev <[email protected]>
Add zero padding to bin names inside SuccinctHashDelegations. Zero padding ensures that the bin names always have the same length. This characteristic is implied in the example given by TAP 15 where the third bin is named "alice.hbd-03" For context read TAP 15: https://github.com/theupdateframework/taps/blob/master/tap15.md Signed-off-by: Martin Vrachev <[email protected]>
Change the test_find_delegation_with_succinct_hash_info() test, so it can test the whole calculation of bin name from a given target path. Signed-off-by: Martin Vrachev <[email protected]>
Add a SuccinctHashDelegations.get_bin_name() API call which returns the bin name of a given sequential bin number. This is useful to easily get the name of the first bin which can be used as a unique identifier for delegated roles using a succinct hash delegations. A unique identifier for such delegated roles with succinct hash is required by our implementation of Delegations.roles where the keys are DelegatedRole.name attributes and values are DelegatedRole objects and the fact that when succinct hash delegations are used DelegatedRole.name should be populated with "None". Solves: theupdateframework#1943 Signed-off-by: Martin Vrachev <[email protected]>
Add a public method to calculate and receive the bin name when succinct hash delegation is used. It's logical that all of the logic required to calculate the bin name should be inside the class responsible for that SuccinctHashDelegations. This will eliminate the need of SuccinctHashDelegation._find_bin_for_bits() and its test. Signed-off-by: Martin Vrachev <[email protected]>
Remove the need of calculation of all hash bits and then filtering those that aren't needed with simple bit shifting. That way it's clearer what we actually do without using complicated python tricks. Signed-off-by: Martin Vrachev <[email protected]>
Deprecate DelegatedRole.is_delegated_path() in favor of DelegatedRole.find_delegations() which will not only check if the given targetpath is one of the paths DelegatedRole is trusted to provide, but it will also return the role name. This is especially useful when using succinct hash delegations. Signed-off-by: Martin Vrachev <[email protected]>
Add get_all_bin_names and is_bin in SuccinctHashDelegations. Those methods proved useful in the testing code, but I believe they have a potential value for production code as well. Signed-off-by: Martin Vrachev <[email protected]>
Change Metadata.verify_delegate(), so it can handle delegated roles where succinct hash delegation is used. In the case of succinct hash delegation, when the delegator is "targets" we have to make sure that a given role "r" contains a succinct hash delegation information and that the "delegated_role" is a role name is a bin handled by "r". This complication comes from the idea that we could have two delegated roles with succinct hash delegation information and there could be other non-succinct roles as well. Signed-off-by: Martin Vrachev <[email protected]>
Change RepositorySimulator.add_delegations() to work with delegated role "role" that uses succinct hash delegations. This case is special as for that role we are not creating one target, but we are creating target metadata instances for all bins. Signed-off-by: Martin Vrachev <[email protected]>
Test traversing the delegation tree when there is a delegated role using succinct hash delegation. Signed-off-by: Martin Vrachev <[email protected]>
I think I have applied all suggestions by @jku and @lukpueh.
|
I have implemented @lukpueh suggestion from the comment here https://github.com/MVrachev/tuf/tree/new-suggestion. |
Together with @jku, @lukpueh, and the TAP author - @mnm678 we met to discuss if we should change the design on TAP 15 based on multiple suggestions. |
Related to issue: #1909
Description of the changes being introduced by the pull request:
TAP 15 was created on June 23-rd 2020 and was last modified in July 6-th 2020.
Since then the TAP has been put to
draft
status meaning it needs a prototype implementation before it's accepted as a future specification change.Given that this TAP underlines an efficient way of handling hash bin delegations, meaning it would be really useful when
python-tuf
is integrated intoWarehouse
, it's logical that we should not only create a prototype but directly work to integrate it inpython-tuf
.The outcome of the TAP 15 implementation is
succinct hashed bin delegations
.examples/
showcasing howsuccinct hashed bin delegations
can be utilized in practice.This pr covers points 1 and 2.
Point 3 will be done in another pr.
Please verify and check that the pull request fulfills the following
requirements: