Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does it make sense to use the integrity attribute for this? #42

Open
bakkot opened this issue Jan 15, 2025 · 7 comments
Open

Does it make sense to use the integrity attribute for this? #42

bakkot opened this issue Jan 15, 2025 · 7 comments

Comments

@bakkot
Copy link
Contributor

bakkot commented Jan 15, 2025

It does work. But as a developer, it's quite confusing: integrity normally contains a digest (which changes with the content of the script), not a public key (which does not). Furthermore, there is a digest involved in this process, but it's provided elsewhere. And the digest is used to verify the contents of the script, whereas the key is used to verify its authorship, which is a different property. The only reason to re-use "integrity" is because it happens to be defined in the SRI spec, even though from a developer's point of view it works completely differently than the existing "integrity" attribute. That seems like a lousy reason to me.

This area tends to be pretty confusing for developers to begin with, so I think it would be a lot easier for developers if you'd avoid overloading this attribute and instead just introduce a new one: perhaps integrity-key.

Also, I'm very much hoping this proposal can grow to accommodate inline scripts, and inline scripts do not currently respect the integrity attribute at all. It would be confusing to start seeing it on inline scripts, especially if you couldn't put a sha256 digest there like you can for external scripts.

@mikewest
Copy link
Member

I've seen "integrity" as encompassing both integrity of the content (via digests) and integrity of the supply chain (via proof of ownership of a known key, and willingness to use it to sign a given resource). There's enough discussion of "supply chain integrity" today that I don't think there's a risk of confusion about the attribute's purposes. Of course, I would say that, since I came up with the current wording. :)

My feeling is that it's valuable to leverage the existing infrastructure, and that piping more metadata up and down through the stack as key rather than integrity doesn't produce much practical benefit. If folks disagree about the aesthetic or conceptual benefits, let's talk about it. 🤷

cc folks who have engaged on other issues with opinions: @annevk @ddworken @punkeel @yoavweiss

@bakkot
Copy link
Contributor Author

bakkot commented Jan 15, 2025

I don't think there's a risk of confusion about the attribute's purposes.

I'm less concerned about purpose and more about how it's used: a hash and a public key are different kinds of things and are used in different ways, even if they can both be used towards a broader notion of "integrity". And I do think it's especially confusing that with this new kind of "integrity" there would still be a hash, just elsewhere.

I've talked to a bunch of people of varying levels of sophistication about CSP and SRI and how to make it work in their particular contexts and I can assure you that there will be confusion if you overload this to contain both a hash and a public key. There's already plenty of confusion about CSP and SRI. Don't make it worse.

Relying on the folks who are going to have to fight with this sort of thing to have even heard of "supply chain integrity" is not a safe bet; you are assuming wildly more sophistication than is frequently present. Yes, the person who originally set it up is probably reasonably sophisticated (though not necessarily if they're setting CSP/SRI up in response to a request from their regulator or security auditor rather than because they themselves realized it was a good idea), but the people down the line who are then having to update the page when something changes, or who are trying to figure out how to integrate a vendor's code, are frequently not security people even to the level of having heard of "supply chain integrity".

(Since I work for a vendor whose code needs to be integrated into existing pages, I end up talking to a lot of the latter kind of people at a lot of different companies, usually when the existing teams have not been able to figure out what to do.)

My feeling is that it's valuable to leverage the existing infrastructure, and that piping more metadata up and down through the stack as key rather than integrity doesn't produce much practical benefit.

Do you just mean the spec infrastructure? That's supposed to be the lowest priority in the priority of constituencies. It's also not that complicated; if that's the only concern I'm happy to write the PRs.

@mikewest
Copy link
Member

Changing this proposal to use a different attribute name is not a huge deal; I'd just make the SRI monkey-patches in the draft spec not be monkey-patches anymore, and change the HTML and Fetch monkey-patches accordingly. We'd need to do a bit more work in CSP to wire things up, but it's also ~trivial. That said, it's not a change that I think is necessary, and it seems to me that there are some conceptual advantages to tying multiple checks against subresources' content into one concept rather than splitting them into more narrow chunks.

Still, if there's a general agreement that I'm wrong then I'll certainly defer to it. Naming is hard, I'm happy to grant that I'm pretty terrible at it. :)

I don't think there's a risk of confusion about the attribute's purposes.

I'm less concerned about purpose and more about how it's used: a hash and a public key are different kinds of things and are used in different ways, even if they can both be used towards a broader notion of "integrity". And I do think it's especially confusing that with this new kind of "integrity" there would still be a hash, just elsewhere.

I've talked to a bunch of people of varying levels of sophistication about CSP and SRI and how to make it work in their particular contexts and I can assure you that there will be confusion if you overload this to contain both a hash and a public key. There's already plenty of confusion about CSP and SRI. Don't make it worse.

Relying on the folks who are going to have to fight with this sort of thing to have even heard of "supply chain integrity" is not a safe bet; you are assuming wildly more sophistication than is frequently present. Yes, the person who originally set it up is probably reasonably sophisticated (though not necessarily if they're setting CSP/SRI up in response to a request from their regulator or security auditor rather than because they themselves realized it was a good idea), but the people down the line who are then having to update the page when something changes, or who are trying to figure out how to integrate a vendor's code, are frequently not security people even to the level of having heard of "supply chain integrity".

(Since I work for a vendor whose code needs to be integrated into existing pages, I end up talking to a lot of the latter kind of people at a lot of different companies, usually when the existing teams have not been able to figure out what to do.)

My feeling is that discussion of "supply chain integrity" (in those words) will become more common given the way I see governmental/regulatory entities discussing this. That might well be a misreading of the general direction, and I can certainly accept that talking about this differently might be helpful. I'd also suggest that we've fielded requests to "add signatures to SRI!" for a long time now, predating the current focus on supply chains. It simply doesn't seem like a stretch to me to consider provenance checks to be another kind of integrity. 🤷

That said, I think you're right that a lot of people want to use security functionality the platform provides, and that explaining that functionality without making overly-optimistic assumptions about folks' baseline understanding is often difficult. While we need to make sure that things have reasonable names, it seems to me that the deeper concern relates to the attention we pay to documentation, well-vetted recommendations, devtools integration, etc. Whatever we call this, we should do our best to help folks use it in a reasonable way (just as we ought to do for other things as well). I appreciate you pointing out integration points that you think are particularly rough!

Do you just mean the spec infrastructure? That's supposed to be the lowest priority in the priority of constituencies. It's also not that complicated; if that's the only concern I'm happy to write the PRs.

The spec infrastructure and the implementations. I think we'll end up doing the same work twice (e.g. validating both server-initiated content checks and client-initiated content checks requires generating a digest over the content: doing that twice when we could do it once is pure overhead) if we treat these as entirely distinct concepts.

I agree that if you're right about developer confusion, that's more important. If folks generally agree with you on that, I'm confident that we can structure the specification to allow the same kinds of optimization that the current sketch does (e.g. by generating the digest in Fetch and passing it to docs that rely on it? Would need to re-read those integration points.).

@annevk
Copy link
Contributor

annevk commented Jan 15, 2025

Can the signature and hash tokens overlap at some point?

Perhaps the signature tokens should be distinguished a bit more?

There's probably some value in being able to tell at a glance what an integrity attribute enforces, but I'm not convinced it warrants a completely different API. Everyone in the ecosystem will have to consider the union of hashes and signatures once we add signatures into the mix, separating them at the attribute level just raises different web developer questions (what if I do both?).

@mikewest
Copy link
Member

Can the signature and hash tokens overlap at some point?

My expectation is that they'll always be distinguished by (at least) the algorithm component: sha256-... vs ed25519-.... Length will also differ in many cases (e.g. while SHA-256 and Ed25519's public key are coincidentally the same length, SHA-512 is substantially longer).

Perhaps the signature tokens should be distinguished a bit more?

Do you have any additional distinguishers in mind? Beyond changing the separator (sha256-... vs ed25519+... or something?) I'm not coming up with much that would be useful.

There's probably some value in being able to tell at a glance what an integrity attribute enforces, but I'm not convinced it warrants a completely different API. Everyone in the ecosystem will have to consider the union of hashes and signatures once we add signatures into the mix, separating them at the attribute level just raises different web developer questions (what if I do both?).

I generally agree with this.

@annevk
Copy link
Contributor

annevk commented Jan 15, 2025

So it will never be reasonable for hashes and signatures to use the same algorithm? (Sorry, not entirely my area.)

I was thinking that it could be something like signature-ed25519-blah potentially, which would make it clear to a reader familiar with integrity they're not just looking at a new hash algorithm.

(Only softly suggesting this though because as I said above this isn't entirely my area.)

@mikewest
Copy link
Member

So it will never be reasonable for hashes and signatures to use the same algorithm? (Sorry, not entirely my area.)

"Never" is a long time, but I don't know of any hashing algorithms that can also be used to generate or validate signatures.

I was thinking that it could be something like signature-ed25519-blah potentially, which would make it clear to a reader familiar with integrity they're not just looking at a new hash algorithm.

Got it. That approach hadn't occurred to me, but it's certainly something we could do if we see this kind of confusion as a risk.

If we went this route, I wonder if we'd want to alias sha256-... as digest-sha25-...?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants