Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relying on mandatory DelegatedRole.name attribute when TAP 15 describes it as optional #1943

Closed
2 tasks
MVrachev opened this issue Apr 6, 2022 · 34 comments
Closed
2 tasks
Assignees
Labels
discussion Discussions related to the design, implementation and operation of the project
Milestone

Comments

@MVrachev
Copy link
Collaborator

MVrachev commented Apr 6, 2022

Description of issue or feature request:
We are working to implement TAP 15 (see #1909).
There it's said that:

If the succinct_hash_delegations field is present in a delegation, the name field will not be used as a rolename, and so is not required.

meaning that the DelegatedRole.name attribute is not required.
That seems to be a problem for us as until now we are relying on its existence to uniquely identify roles:
(see Delegations.from_dict() implementation)

def from_dict(cls, delegations_dict: Dict[str, Any]) -> "Delegations":

One solution to this problem could be providing an additional ID number that will serve as a unique identifier for roles.
This approach is widely used by databases.

But then we have the next question what to do about DelegatedRole.to_dict() where we currently provide name:

"name": self.name,

In summary what we should do is:

  • decide what unique identifier to use for delegated roles inside Delegations
  • decide what to do with DelegatedRole.to_dict
@lukpueh
Copy link
Member

lukpueh commented Apr 6, 2022

Yes, we definitely still need distinct names for each "bin-n" role, otherwise the client can't download the targets metadata needed for a target in a bin. The challenge with TAP 15 is that the delegating targets metadata does not list all its delegatees anymore, but rather a number (delegation_hash_prefix_len) that allows to compute how many delegatees (bins) there are.

Currently, the idea is to compute the name for each bin using a configured prefix bin_name_prefix together with the bin index. See theupdateframework/taps#132 for some remaining discussions about the name.

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 6, 2022

Currently, the idea is to compute the name for each bin using a configured prefix bin_name_prefix together with the bin index. See theupdateframework/taps#132 for some remaining discussions about the name.

The problem is that it's possible to have DelegatedRoles objects without a name attribute set and the calculation of the bin index requires a target path...

@lukpueh
Copy link
Member

lukpueh commented Apr 6, 2022

I think the problem is that DelegatedRole so far was used for exactly one delegated role, but with TAP 15 there would only be one DelegatedRole object for all bin-n roles.

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 6, 2022

I think the problem is that DelegatedRole so far was used for exactly one delegated role, but with TAP 15 there would only be one DelegatedRole object for all bin-n roles.

Yes, you are correct!
I think the solution then is clear:

  1. when we don't use succinct_hash_delegations role names are provided and everything works the same way as it does now - Delegations.roles attribute is a Dict[str, DelegatedRole].
  2. if we use succinct_hash_delegations then Delegations.roles attribute becomes only DelegatedRole.

This means that Delegation.roles will be of type Union[Dict[str, DelegatedRole], DelegatedRole] and we would have to add a couple of more if checks here and there to comply with mypy.

@MVrachev MVrachev self-assigned this Apr 6, 2022
@jku
Copy link
Member

jku commented Apr 7, 2022

if we use succinct_hash_delegations then Delegations.roles attribute becomes only DelegatedRole.

The idea that a delegations dictionary that contains a succinct delegation can only contain that delegation and no others seems reasonable but I don't think it comes from TAP 15? Note that I'm not opposed to making implementation restrictions, just want to be clear when we do those.

A quick unfinished alternative idea: Internally use special DelegatedRole.name = "" for succinct delegation (and prevent that being otherwise used). This would change the requirement to "delegations dictionary can contain a maximum of one succinct delegation but could contain other normal delegations: these can be ordered by normal means".

@jku
Copy link
Member

jku commented Apr 7, 2022

Yet more alternatives: use the first bin-n-name as the roles dictionary key

  • this allows even multiple succinct delegations in the same delegations dictionary
  • using the bin-name seems safe since we already implictly require the repository to not have conflicts with bin-names

This dictionary key would obviously be just an implementation detail: the serialized format would be as described in tap15

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 7, 2022

if we use succinct_hash_delegations then Delegations.roles attribute becomes only DelegatedRole.

The idea that a delegations dictionary that contains a succinct delegation can only contain that delegation and no others seems reasonable but I don't think it comes from TAP 15? Note that I'm not opposed to making implementation restrictions, just want to be clear when we do those.

You are correct! I assumed only DelegatedRole could exist when succinct_hash_delegations are used, but when you look closer to the proposed delegations scheme:

{ "keys" : {
       KEYID : KEY,
       ... },
   "roles" : [{
       ("name": ROLENAME,)
       "keyids" : [ KEYID, ... ] ,
       "threshold" : THRESHOLD,
       ("path_hash_prefixes" : [ HEX_DIGEST, ... ], |
       "paths" : [ PATHPATTERN, ... ], |
       "succinct_hash_delegations" : {
         "delegation_hash_prefix_len" : BIT_LENGTH,
         "bin_name_prefix" : NAME_PREFIX
       })
       "terminating": TERMINATING,
   }, ... ]
 }

you see that after the delegated role describing the succinct hash information there is ... meaning there could be more than one.

A quick unfinished alternative idea: Internally use special DelegatedRole.name = "" for succinct delegation (and prevent that being otherwise used). This would change the requirement to "delegations dictionary can contain a maximum of one succinct delegation but could contain other normal delegations: these can be ordered by normal means".

I think I prefer this idea more than the one with the usage of bin-n-name prefix as the rolename.
Given that TAP 15 hinted that DelegatedRole.name is unused in the case of existing succinct hash delegatations I think it makes more sense to have it that way.

Yet more alternatives: use the first bin-n-name as the roles dictionary key

  • this allows even multiple succinct delegations in the same delegations dictionary

I am not sure you will ever need to use multiple succinct hash delegations.
I only see it as a possible complication with no real benefit.
Given that you can have 2^32 - 1 bins there are more than enough and even then you can continue delegating more levels downwards.

  • using the bin-name seems safe since we already implictly require the repository to not have conflicts with bin-names

This dictionary key would obviously be just an implementation detail: the serialized format would be as described in tap15

Still, if we decide to use bin-name how do you imagine this to happen?
Probably the same way we did with normal hash bin delegations?

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 7, 2022

@lukpueh what do you think about Jussi's suggestions?
@lukpueh and @jku I hope we can soon have a consensus on how to solve this issue as it's blocking my progress on TAP 15 as mypy warnings appeared,

@jku
Copy link
Member

jku commented Apr 8, 2022

I am not sure you will ever need to use multiple succinct hash delegations.
I only see it as a possible complication with no real benefit.

That is indeed valid feedback for the TAP itself... even combining normal delegations and succinct ones feels quite weird.

But strictly speaking the format in the TAP is currently designed to allow multiple succinct delegations so we'll have to at least document this diversion if we implement it that way.

@mnm678 maybe has an opinion on combining a succinct delegation with either another succinct delegation or normal delegations -- what was the TAP intention here?

@jku
Copy link
Member

jku commented Apr 8, 2022

I wanted to make one thing clear:

  • requiring a role name is just an implementation detail in python-tuf -- we can change that if needed, and that's not a huge issue
  • however, I also don't understand why it was made optional: this makes implementations more difficult to code just to save a few bytes

so one option is that python-tuf keeps requiring a role name and we suggest a modification to the TAP.

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 8, 2022

I think it makes sense that name is missing when succinct_hash_delegations is used.
If it does exist when succinct_hash_delegations is used It's won't bring any semantical value and it could be lead to confusion about which to use Delegated.name or Delegated_get_rolename(the new API call I am adding for TAP 15) when you want to get the name of the delegated role responsible for a target.

I would love to hear @mnm678 about that.

@MVrachev MVrachev mentioned this issue Apr 8, 2022
3 tasks
@mnm678
Copy link
Contributor

mnm678 commented Apr 8, 2022

This is a great discussion!

@mnm678 maybe has an opinion on combining a succinct delegation with either another succinct delegation or normal delegations -- what was the TAP intention here?

The TAP purposefully left the option for multiple succinct delegations, as at the time I didn't see the need to restrict this. There may be a case when a repository would have multiple succinct delegations signed by different keys (especially a large repository like DockerHub). But if this is a sticking point for the implementation, I don't think it's very important and it hasn't specifically been requested. @trishankatdatadog has some of the real-world use cases for this TAP and so might have an opinion there.

name was left out for succinct hash delegations as it's not necessary for determining the next metadata file to download. However, they way this is worded "the name field will not be used as a rolename, and so is not required" does leave it open for implementations to use this field internally. So I think identifying these either through a rolename of "" or a rolename of the first bin-n-name would be in line with the TAP (though I have a personal preference of "" as it seems harder to confuse).

In general, I think it's fine to implement TAP 15 with a couple of additional restrictions if that leads to simpler code. We can even suggest these adaptions in the TAP.

@trishankatdatadog
Copy link
Member

Sorry, haven't had time to fully follow everything, but why can't implementations like python-tuf do the following?

Inside Delegations.from_dict(), you would have internal helper methods that would know how to handle the different types of delegations. If regular hashed-bin/targets delegations, you get the name for free. Otherwise, the helper for succinct hashed delegations knows how to do it.

Am I missing something?

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 8, 2022

Sorry, haven't had time to fully follow everything, but why can't implementations like python-tuf do the following?

Inside Delegations.from_dict(), you would have internal helper methods that would know how to handle the different types of delegations. If regular hashed-bin/targets delegations, you get the name for free. Otherwise, the helper for succinct hashed delegations knows how to do it.

Am I missing something?

The problem comes from the fact that we store delegated roles as a dictionary attribute inside the Delegations class.
In this dictionary, the keys are the delegated role names and values are DelegatedRole objects or Dict[str, DelegatedRole]
According to TAP 15 if succinct hash delegations are used then name becomes an optional attribute meaning we no longer have a key for that dictionary.

Summarized, the solutions proposed what to do when succinct hash delegation is used are as follows:

  1. make roles a Union or Union[Dict[str, DelegatedRole], DelegatedRole] meaning that in the case of succinct hash delegations we will no longer allow any other roles (which is a constrain not required from TAP 15)
  2. use the special delegated role name "" denoting the delegated role with the succinct hash delegations info. The other roles are forbidden from using the "" as a role name, but they can coexist with the succinct hash delegation role
  3. using the first bin-n-name or (if I understood correctly) constructing a name from the succinct_hash_delegations["bin_name_prefix"] and the first bin name. Looking into the TAP 15 example this would be alice.hbd-00.

I think I am for option 2. It doesn't limit anything. There is a reasonable special value that can be used and there is no reason why anyone will explicitly want a delegated role with the name "".

@jku
Copy link
Member

jku commented Apr 11, 2022

I think I am for option 2. It doesn't limit anything.

Let's be clear about the decisions we make: option 2 prevents multiple succinct delegations, a feature the TAP author just confirmed was intentional (even if possibly not important).


There are two discussions going on at the same time here, I'll try to separate them:

1. what should the TAP say

  • TAP should explicitly state what it wants WRT succinct delegations combined with other normal delegations, and succinct delegations with other succinct delegations: are these supported? Currently it strongly implies both are supported: that a succinct delegation is just another delegation among possibly many others. If there is a practical use case for this, it would be good to mention that in the TAP.
  • If possible, TAP should recognise that the delegated role structure was already the trickiest structure to handle... succinct delegations makes it very tricky: there are so many combinations of optional values and sometimes optional values that it's a clear source of future implementation bugs. Maybe there is no design that makes implementation easier but this should be a concern
  • TAP could consider keeping rolename for the succinct delegation for that reason -- the spec constantly changing what is required and what is not allowed is difficult to get right in an implementation: just look at the amount of checks Martin has to do in this PR. I admit that there's no obvious value for rolename here (since we are describing multiple roles) so probably this is not a good idea...

2. How we implement the TAP, do we implement it exactly or with modifications?

  • An implementation that isn't fully TAP-compliant is possible but should only be done with very good reasons. The default choice should be implement the TAP exactly.
  • The fact that Metadata API uses rolename to uniquely identify delegations is an implementation detail, they're just keys to dictionary: we can change that if needed (it's an implementation detail that is useful as it enforces rolename uniqueness for us with 0 lines of code). This has nothing to do with whether the wire format contains a rolename.
  • In fact, if we want to keep using unique identifiers as keys to the roles dictionary we can do so even if we don't set a rolename for succinct delegations. We'll just have to invent a unique key as we insert the role...

@MVrachev
Copy link
Collaborator Author

An implementation that isn't fully TAP-compliant is possible but should only be done with very good reasons. The default choice should be implement the TAP exactly.

Agree with this and all of the above notes are great thoughts.

More and more I agree it's important to try to be fully compliant with the current version of the TAP.
If somewhere we see a big obstacle we can discuss it and change the TAP if needed.
After a conversation with @jku more and more I see the usage of the name of the first bin as a viable solution.
I will experiment with that.

We will have more discussions with @jku and hopefully some of the participants in this thread in my pr #1948 and if there are some conclusions from it will write here.

@ivanayov ivanayov added the discussion Discussions related to the design, implementation and operation of the project label Apr 20, 2022
@ivanayov ivanayov added this to the sprint 22 milestone Apr 20, 2022
MVrachev added a commit to MVrachev/tuf that referenced this issue Apr 20, 2022
Add a unique identifier for delegated roles using a succinct hash
delegations. This functionality is required by our implementation of
Delegations.roles where the keys are DelegatedRole.name attributes and
values are DelegatedRole objects and the fact that when succinct hash
delegations are used DelegatedRole.name should be populated with "None".

Mypy complained that it's possible to have DelegatedRole.name with a
value of "None" which is not a string. That's why I had to add special
handling for this special case when succinct hash delegations are used
and return the name of the first bin as the supposed DelegatedRole.name.

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
@lukpueh
Copy link
Member

lukpueh commented Apr 22, 2022

Why don't we take a literal step back here? Maybe, and Jussi suggests this in his last comment, the reason why implementing this is tricky is because the metadata format in TAP 15 is tricky. Maybe (putting on my ER-modeler hat) the "succinct hash delegation object" is just too different from "delegation role" objects? What about this alternative format:

{
  "keys" : {
      KEYID : KEY,
      ...
  },
  "roles" : [
    {
      "name": ROLENAME,
      "keyids" : [ KEYID, ... ] ,
      "threshold" : THRESHOLD,
      ("path_hash_prefixes" : [ HEX_DIGEST, ... ] |
      "paths" : [ PATHPATTERN, ... ]),
      "terminating": TERMINATING,
    }, ...
  ],
  "succinct_hash_delegations": [ # better name pending
    {
      "keyids" : [ KEYID, ... ],
      "threshold" : THRESHOLD,
      "terminating": TERMINATING,
      "delegation_hash_prefix_len" : BIT_LENGTH, # better name pending
      "bin_name_prefix" : NAME_PREFIX # better name pending
    }, ... 
  ]
}

I am aware that the objects share keyids, threshold,terminating fields, but a non-normalized data definition seems preferable to mutual exclusive relationships between multiple fields. And in code we could summarize them in a common parent class for DelegatedRole and DelegatedBins (better name pending).

@mnm678
Copy link
Contributor

mnm678 commented Apr 26, 2022

Why don't we take a literal step back here? Maybe, and Jussi suggests this in his last comment, the reason why implementing this is tricky is because the metadata format in TAP 15 is tricky. Maybe (putting on my ER-modeler hat) the "succinct hash delegation object" is just too different from "delegation role" objects? What about this alternative format:

I like this idea. It maintains backwards compatibility by leaving the existing roles the same, and it allows us to use explicit names for everything in succinct_hash_delegations. We could maybe just call it hash_delegations as a step towards deprecating the old method of hashed bin delegations.

Does this solve the issue with the way metadata api identifies delegations? I suppose with the new object it makes a clear place for a new codepath to identify hashed bin delegations without a name for each delegation.

@lukpueh
Copy link
Member

lukpueh commented Apr 27, 2022

We could maybe just call it hash_delegations as a step towards deprecating the old method of hashed bin delegations.

Yes, or maybe even shorter: hash_bins, as in the role that defines the delegations object can delegate to individual roles and/or to one or more collections of hash_bins

Does this solve the issue with the way metadata api identifies delegations?

IMO it makes it more obvious. We no longer need a dictionary whose keys can be either a role name or something else, identifying either an individual role or a collection of roles. We would just use two different dictionaries/datastructures.

@lukpueh
Copy link
Member

lukpueh commented Apr 27, 2022

It maintains backwards compatibility by leaving the existing roles the same

That is true, but we might want to consider making some backwards incompatible decisions in the DELEGATIONS object.

{
  "keys" : {},
  "roles" : [] 
  "hash_bins": [] # name TBD
}

The backwards compatible way would be to make hash_bins optional but leave roles mandatory, which seems inconsistent. I think we should make them either both optional or both mandatory. And I lean towards making them mandatory, because optional fields make canonicalization a bit harder, which we currently require for signatures.

@lukpueh
Copy link
Member

lukpueh commented Apr 27, 2022

@awwad just pointed out that the order in roles is important when traversing the delegation tree. Now if we have two fields, we cannot interleave "normal" delegations from the roles list and succinct hash bin delegation from the hash_bins field, which is something that was possible in the original TAP 15 proposal. Although, I doubt that there is actually a use case for it.

At any rate, we need to define if delegations in roles are prioritized over delegations in hash_bins, or the other way around.

@JustinCappos
Copy link
Member

JustinCappos commented Apr 28, 2022 via email

@jku
Copy link
Member

jku commented Apr 28, 2022

It maintains backwards compatibility by leaving the existing roles the same,

I already commented elsewhere but this has some very tricky edge cases: it may look compatible but is actually a potential security hole. If both succinct_hash_delegations and roles are allowed in delegations then old clients that see new metadata would consider succinct_hash_delegations just unidentified extra data, something that the spec explicitly allows. So old clients would happily use new metadata without actually understanding the delegations defined in it. This is likely not what we want.

@trishankatdatadog
Copy link
Member

So old clients would happily use new metadata without actually understanding the delegations defined in it. This is likely not what we want.

This is always already a risk with accepting and ignoring unknown fields (except only to check if any signatures cover them), correct?

@jku
Copy link
Member

jku commented Apr 29, 2022

This is always already a risk with accepting and ignoring unknown fields (except only to check if any signatures cover them), correct?

I don't think so. In this case we are changing the way delegations are supposed to be understood, while leaving the format seemingly compatible. The old client will think it has understood the delegation correctly when it has only understood a part of it.

It's possible that there are no security issues here but I don't like the ambiguity: I would rather choose a design where old client fails to process the whole delegations object if the delegations object contains a delegation client does not understand.

@mnm678
Copy link
Contributor

mnm678 commented Apr 29, 2022

A solution to this would be to make this TAP part of a major version change. Clients check the version when downloading, so would know that the metadata shouldn't be parsed by an older client. If we do that, we could then require that metadata includes either, but not both of delegations and succinct_hash_delegations.

@MVrachev
Copy link
Collaborator Author

If we do that, we could then require that metadata includes either, but not both of delegations and succinct_hash_delegations.

I think here you meant roles and succinct_hash_delegations inside the delegtions objects as from what I understand those two are in conflict.

@jku
Copy link
Member

jku commented May 2, 2022

A solution to this would be to make this TAP part of a major version change.

This certainly does it but I don't think it's necessary: If succinct delegations are not used, the old clients would still work fine with any of the formats we've discussed. Bumping major version would mean they would be made artificially incompatible which feels quite drastic.

I think it would be acceptable if the spec major version stays the same, but we just document that old clients will not work if repository uses succinct delegations (and then we really make sure that the format we design can't accidentally half-work with old clients like I described earlier).

MVrachev added a commit to MVrachev/tuf that referenced this issue May 2, 2022
Add a SuccinctHashDelegations.get_bin_name() API call which returns the
bin name of a given sequential bin number.
This is useful to easily get the name of the first bin which can be
used as a unique identifier for delegated roles using a succinct hash
delegations.

A unique identifier for such delegated roles with succinct hash
is required by our implementation of Delegations.roles where the keys
are DelegatedRole.name attributes and values are DelegatedRole objects
and the fact that when succinct hash delegations are used
DelegatedRole.name should be populated with "None".

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue May 4, 2022
Add a SuccinctHashDelegations.get_bin_name() API call which returns the
bin name of a given sequential bin number.
This is useful to easily get the name of the first bin which can be
used as a unique identifier for delegated roles using a succinct hash
delegations.

A unique identifier for such delegated roles with succinct hash
is required by our implementation of Delegations.roles where the keys
are DelegatedRole.name attributes and values are DelegatedRole objects
and the fact that when succinct hash delegations are used
DelegatedRole.name should be populated with "None".

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue May 4, 2022
Add a SuccinctHashDelegations.get_bin_name() API call which returns the
bin name of a given sequential bin number.
This is useful to easily get the name of the first bin which can be
used as a unique identifier for delegated roles using a succinct hash
delegations.

A unique identifier for such delegated roles with succinct hash
is required by our implementation of Delegations.roles where the keys
are DelegatedRole.name attributes and values are DelegatedRole objects
and the fact that when succinct hash delegations are used
DelegatedRole.name should be populated with "None".

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue May 9, 2022
Change the current pr and experiment with Luka's suggestion outlined
here: theupdateframework#1943 (comment)

For the ease of implementation, I have additionally decided that:
- "roles" and "succinct_hash_delegations" attributes in the Delegations
class are mutually exclusive
- there could be only one "succinct_hash_delegations"
- I didn't create a new common class between "DelegatedRole" and
"SuccinctHashDelegation" classes as I didn't notice any meaningful
improvements
- I have decided to leave Delegations.find_delegation() even though it's
not used by the client as that seemed like a logical API call to have

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue May 9, 2022
Add a SuccinctHashDelegations.get_bin_name() API call which returns the
bin name of a given sequential bin number.
This is useful to easily get the name of the first bin which can be
used as a unique identifier for delegated roles using a succinct hash
delegations.

A unique identifier for such delegated roles with succinct hash
is required by our implementation of Delegations.roles where the keys
are DelegatedRole.name attributes and values are DelegatedRole objects
and the fact that when succinct hash delegations are used
DelegatedRole.name should be populated with "None".

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue May 9, 2022
Change the current pr and experiment with Luka's suggestion outlined
here: theupdateframework#1943 (comment)

For the ease of implementation, I have additionally decided that:
- "roles" and "succinct_hash_delegations" attributes in the Delegations
class are mutually exclusive
- there could be only one "succinct_hash_delegations"
- I didn't create a new common class between "DelegatedRole" and
"SuccinctHashDelegation" classes as I didn't notice any meaningful
improvements
- I have decided to leave Delegations.find_delegation() even though it's
not used by the client as that seemed like a logical API call to have

Signed-off-by: Martin Vrachev <[email protected]>
@MVrachev
Copy link
Collaborator Author

MVrachev commented May 9, 2022

Why don't we take a literal step back here? Maybe, and Jussi suggests this in his last comment, the reason why implementing this is tricky is because the metadata format in TAP 15 is tricky. Maybe (putting on my ER-modeler hat) the "succinct hash delegation object" is just too different from "delegation role" objects? What about this alternative format:

{
  "keys" : {
      KEYID : KEY,
      ...
  },
  "roles" : [
    {
      "name": ROLENAME,
      "keyids" : [ KEYID, ... ] ,
      "threshold" : THRESHOLD,
      ("path_hash_prefixes" : [ HEX_DIGEST, ... ] |
      "paths" : [ PATHPATTERN, ... ]),
      "terminating": TERMINATING,
    }, ...
  ],
  "succinct_hash_delegations": [ # better name pending
    {
      "keyids" : [ KEYID, ... ],
      "threshold" : THRESHOLD,
      "terminating": TERMINATING,
      "delegation_hash_prefix_len" : BIT_LENGTH, # better name pending
      "bin_name_prefix" : NAME_PREFIX # better name pending
    }, ... 
  ]
}

I am aware that the objects share keyids, threshold,terminating fields, but a non-normalized data definition seems preferable to mutual exclusive relationships between multiple fields. And in code we could summarize them in a common parent class for DelegatedRole and DelegatedBins (better name pending).

I have implemented @lukpueh suggestion from the comment here https://github.com/MVrachev/tuf/tree/new-suggestion.

In my prototype, I have additionally decided that:

  • roles and succinct_hash_delegations attributes in the Delegations class are mutually exclusive
  • there could be only one succinct_hash_delegations
  • I didn't create a new common class between DelegatedRole and SuccinctHashDelegation classes as I didn't notice any meaningful improvements

@jku, @lukpueh, @mnm678 what do you think?

@jku
Copy link
Member

jku commented May 10, 2022

Some initial comments:

  • it's not obviously "less code", in fact it's 100 lines more (not necessarily bad, just pointing it out)
  • The core SW design decision seems to be to make Delegations more important: it has an API now
  • a lot of the complexity has moved to Delegations serialization, this seems natural and "easy" to understand
  • I think the Delegations API requires work but it seems like it is viable. Some comments on that
    • get_all_delegations internally finds roles to return the roles names -- this name is then used by calling code to find the same role... seems needlessly complex
    • find_delegated_role return value Optional[Union[DelegatedRole, SuccinctHashDelegations]] is definitely a code smell.
    • find_delegated_role runs a for-loop (via the iterator) potentially 4 billion times. The loop seems unnecessary
  • add_key() and remove_key() suddenly became quite complex, I did not expect that. The APIs now don't make complete sense either -- why does the caller supply a rolename if it's then not used for anything? I don't have great suggestions, just pointing out an issue

Any comments on how complex this felt to implement vs the other approach @MVrachev ? Assuming we manage to polish the API, I feel like this looks a little more manageable to me -- the complexity is now contained in places where I expect it (although the key management issues are an unfortunate side effect).

@MVrachev
Copy link
Collaborator Author

Some initial comments from me as well on the prototype based on Lukas suggestion:

it's not obviously "less code", in fact it's 100 lines more (not necessarily bad, just pointing it out)

Agree, I spent a good amount of time optimizing the code and making it smaller.
For example, I wanted to escape of the need of an additional class common for DelegatedRole and SuccinctHashDelegation. It seemed a waste given that Role already has keyid and threshold in it and the only missing common argument was terminating .

get_all_delegations internally finds roles to return the roles names -- this name is then used by calling code to find the same role... seems needlessly complex

Agree, I wanted to simplify it, but I wasn't sure how as if I just return succinct hash delegation instead of the name then I will add the first bin role name to be visited instead of the actual bin name reponsible for that target path.

find_delegated_role return value Optional[Union[DelegatedRole, SuccinctHashDelegations]] is definitely a code smell.

Yep, agree it looks ugly when I am not using a common class between those two. I can't say I just return Role as Role doesn't have terminating which we later use in preoreder_depth_first_search.

add_key() and remove_key() suddenly became quite complex, I did not expect that. The APIs now don't make complete sense either -- why does the caller supply a rolename if it's then not used for anything? I don't have great suggestions, just pointing out an issue

Yes, the suddenly added complexity of those two API calls probably shows that those functions tries to cover a more generic case than it should.

Any comments on how complex this felt to implement vs the other approach @MVrachev ?

It felt that in some places I had to add more code just to make sure I am handling all cases related that only one of roles and succinct_hash_delegations can be set. There are many examples of this like Targets.add_key, Targets.remove_key() and RepositorySimulator which now needs a new call add_succinct_hash_info_to_delegations.

Assuming we manage to polish the API, I feel like this looks a little more manageable to me -- the complexity is now contained in places where I expect it (although the key management issues are an unfortunate side effect).

I have a feeling that this is a more logical design to rely on Delegations to do the heavy lifting.
Using DelegatedRole for succinct_hash_delegation felt little unatural.

@jku
Copy link
Member

jku commented May 12, 2022

get_all_delegations internally finds roles to return the roles names -- this name is then used by calling code to find the same role... seems needlessly complex

Agree, I wanted to simplify it, but I wasn't sure how as if I just return succinct hash delegation instead of the name then I will add the first bin role name to be visited instead of the actual bin name reponsible for that target path.

find_delegated_role return value Optional[Union[DelegatedRole, SuccinctHashDelegations]] is definitely a code smell.

Yep, agree it looks ugly when I am not using a common class between those two. I can't say I just return Role as Role doesn't have terminating which we later use in preoreder_depth_first_search.

There could be some alternative design here...

Throwing suggestions on the wall: terminating is only needed once in updater so what if you return a tuple of optional rolename, terminating from get_all_delegations()? Then find_delegated_role() is no longer needed outside metadata.py so you could try refactoring the one remaining use case inside metadata.py...

@MVrachev
Copy link
Collaborator Author

MVrachev commented May 16, 2022

There could be some alternative design here...

Throwing suggestions on the wall: terminating is only needed once in updater so what if you return a tuple of optional rolename, terminating from get_all_delegations()? Then find_delegated_role() is no longer needed outside metadata.py so you could try refactoring the one remaining use case inside metadata.py...

There was a TAP 15 design meeting. What we have decided is summarized here: theupdateframework/taps#154 (review).

We have decided to drop the terminating field from succinct hash delegations.

MVrachev added a commit to MVrachev/tuf that referenced this issue May 17, 2022
Add a SuccinctRoles.get_bin_name() API call which returns the
bin name of a given sequential bin number.
This is useful to easily get the name of the first bin which can be
used as a unique identifier for delegated roles using a succinct_roles.

A unique identifier for such delegated roles with succinct_roles
is required by our implementation of Delegations.roles where the keys
are DelegatedRole.name attributes and values are DelegatedRole objects
and the fact that when succinct hash delegations are used
DelegatedRole.name should be populated with "None".

Solves: theupdateframework#1943

Signed-off-by: Martin Vrachev <[email protected]>
@MVrachev
Copy link
Collaborator Author

We have decided that succinct hash delegations will be part of the Delegations object and as such it's no longer connected to DelegatedRole and the problem described in this issue description.
Closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussions related to the design, implementation and operation of the project
Projects
None yet
Development

No branches or pull requests

7 participants