Discussion: Provider Removal of Content & Transparency #184
Comments
I think to solve this problem we need to take a step back and look at the centralization chokepoint caused by requiring the HTTPS scheme for URLs for Activity Content. This takes control of user-generated content out of the hands of the individual user and puts it into those of the entity (probably the provider) that controls the hostname in question. A way of addressing this is to enable consumers to attempt to retrieve content from a decentralized storage protocol such as IPFS. If the spec is changed to enable IPFS as a top-level scheme for content retrieval, then the user (or any other entity) has the ability to host content in any location they choose.
Any given provider/host may cease to make the content available at any time (in general, they should be encouraged not to, but services shut down, or they might find content that violates their terms of service, or they might receive legal notice requiring them to take down content, and so on). The only reliable safeguard from a user point of view is to keep a backup of their content so that it can be made available elsewhere (ideally redundantly).
In this scenario, provider removal of content would simply mean unpinning the content in question. This would likely be coupled with a blocklist that is either provider-specific or shared between providers. (nb. Attestation, via DSNP Content Attribute Set Announcements, provides an in-protocol means of flagging content publicly, but blocklists could also be private or out-of-band.) The blocklist would tell providers not to retrieve or show the content in question, even if it exists (i.e. is pinned) elsewhere in the network. An unfiltered view of the content stream would still show the content, provided it is still pinned somewhere. This brings up a second concern, which is a provider's ability to post Tombstone Announcements in such a scenario.
Because DSNP treats tombstoning as final (which is consistent with the intended semantics), there exists a situation where a provider, however trustworthy, might be compelled (i.e. by force of law) to publish a Tombstone Announcement for user content. Note that this is a step further than unpinning, as it would have the effect of disallowing any protocol-compliant application from accessing the content. (A non-compliant application could potentially find the content, but this would get rather confusing.) The only way to mitigate the threat of unsanctioned provider tombstoning of content is to not delegate permission to publish Tombstone Announcements. This is eminently possible with DSNP as is, but if we imagine an application that does not have this permission, and a user that wants to delete (tombstone) their content, we have now introduced a cost to the user for removal of content, even though (typically) the provider would bear the costs for content publishing. So we find that delegation of tombstoning is convenient but dangerous. To propose a (slightly ridiculous) solution, the user could perform the following sequence of actions when they want to remove content:
This is both a horrible user experience and (however briefly) delegates more power than needed to the provider (the latter is probably not a major concern). An alternative is to change the tombstone announcement structure to always require the user's signature. This is still a potentially interruptive user experience, but hopefully the ability to request signatures from a user's control key becomes more standardized over time. There may not be a solution which maximizes both ease of use and user control.
To summarize the above, the questions are:
I guess I viewed content as slightly different than the social graph. If you use an app to publish your content and they are paying to publish and host it, it is more of a partnership, in that either party should be able to decide they no longer want to have the content published. The user still owns their content from an IP point of view and could repost it somewhere else if they desired, or even host it themselves.
They would have to re-announce it though, and the new timestamp/block number of the content wouldn't match the original, which I see as a key trust mechanism when reading a thread. (It would also be weird to see the gaps caused by the original provider making the content unavailable.) But maybe I'm overthinking this, and very few services will go to the originally announced URL to get content, preferring to sync with a cache somewhere else. I just find it troubling that we can't guarantee content permanence.
To clarify for readers, it's not up to DSNP to guarantee content permanence**; only to provide a mechanism that allows a hosted URL to change out from under a decentralized indicator, to handle content that gets moved. I'm in favor of an option for IPFS or a Torrent or something like that. However, providers need the option to Tombstone user content without user permission, in the cases you mention, to more effectively combat abuse and illegal activity and potentially to be able to comply with federal laws. Perhaps certain other limits could be discussed, such as (brainstorming):
This could, first, be used to demonstrate compliance with the law and, second, partly address a frequent complaint from users about having content removed with no explanation and no recourse.
** as clearly such guarantees are completely undesirable for illegal content, such as CSAM
Yes, sorry if that wasn't clear. It should be possible (for example) for a user to switch their provider and take their content with them.
The nice thing about IPFS is that the
Which country's laws? If DSNP is to be used globally, it is probable that some content that is legal in one jurisdiction is illegal in another. I think "collaborative unpinning" should be used instead of tombstoning. If content is illegal, the feds (or whatever powers that be) can go after the hosting providers in their jurisdiction.
I think the second part is going to be true by definition (except in rare cases where no provider is used?). The first part is the current specification. Maybe "and" instead of "or" would be interesting to consider (provider must be both the publisher of the announcement in question and a current delegate), but I still think this gives too much control to the provider. It's the user's content; the provider has agreed to host it (possibly for a limited time); the provider can stop hosting it, but it doesn't become the provider's content.
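To make the "or" versus "and" distinction concrete, here is a minimal sketch. It assumes the two conditions under discussion are "the provider published the announcement in question" and "the provider is a current delegate of the user"; the `TombstoneRequest` type, its field names, and `may_tombstone` are hypothetical and not part of the DSNP spec.

```python
from dataclasses import dataclass
from typing import AbstractSet


@dataclass(frozen=True)
class TombstoneRequest:
    # Hypothetical shape: identifiers are plain strings for illustration only.
    requester: str           # provider asking to publish the tombstone
    original_publisher: str  # provider that published the target announcement


def may_tombstone(
    req: TombstoneRequest,
    current_delegates: AbstractSet[str],
    require_both: bool,
) -> bool:
    """Return True if the provider may tombstone under the chosen rule.

    require_both=False models the "or" rule (publisher OR current delegate).
    require_both=True models the stricter "and" rule floated above.
    """
    is_publisher = req.requester == req.original_publisher
    is_delegate = req.requester in current_delegates
    if require_both:
        return is_publisher and is_delegate
    return is_publisher or is_delegate
```

Under the "and" rule, a provider the user has since left loses the ability to tombstone what it published, which narrows provider control but does not remove it.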
I think attestations would be a way of providing this metadata (and suggesting to others who share the same jurisdiction or terms that the content should be unpinned and blocked). But anything that allows a provider to unilaterally tombstone content assumes that all providers will agree on their ruling. This might be true in certain circumstances (CSAM, perhaps, though even there, I'm not sure there is a single universal definition that all jurisdictions would agree on) but it is little more than an opinion in others. To take a different example, let's say discussion of gender-affirming care becomes illegal in Alice's provider's jurisdiction, but not in Bob's, but both Alice and Bob have posted extensively about it in the past. Both providers have chosen to pin both parties' content, perhaps fearing this day would come. Alice's provider must unpin all this content to be legally compliant. But should Alice's provider be able to (effectively) force Bob's provider to not show Alice's prior posts, even though they are legal in Bob's jurisdiction and hosted on Bob's provider's server?
A few adjacent notes specifically to separate out the two issues.
For Issue 2, if Alice wants to have their content hosted elsewhere, there are several levels required:
Now while something such as an IPFS setup can be suggested and built into tools, some delegate applications may feel they cannot participate in the posting of such complete content (as opposed to just metadata in the Announcements) to IPFS. I do not think that requiring IPFS (or other decentralized storage) top to bottom is a viable option at this time. So we have a few alternatives that have been discussed at various times:
I'm sure there are others, but this is a different issue than the delegate authority issue. For issue 1, I believe it is reasonable that a delegate who has published something to the world (on behalf of the user) can have it removed. Currently that is via a tombstone (if the user has delegated it) or via refusing to continue to pin/host it. I'm sure different legal jurisdictions have different requirements around this, but I ask if the permission to tombstone that a delegate might have is any greater than the permission to publish? The primary thing that is lost is that one cannot untombstone. This is necessary for purposes of recursion limitation. While the spec could have a "Republish" or "Untombstone" announcement, I ask if the added complexity is worth the benefit?
I think this points to several distinct “states” content could be in when it is “removed”:
I am sure there are many more variations. The most important elements from my perspective: This leads me to ask whether we should (perhaps) always have a content hash in the announcement, with an optional provider URL? If the provider URL is blank or invalid, the fallback would be to check IPFS, which would give us a robust “reposting” mechanism. Alternatively maybe the original user (directly or via delegation) can post a special “reply”/“repost” to the message that acts as a redirect to a new location (requiring content to pass the security checks of the original announcement)? If the user controls their content (as we claim), then being able to declare the (same) content is somewhere else (given we aren’t using content addressed storage) is the right approach? Feels like a good discussion for the next DSNP Spec meeting, maybe?
(emphasis mine) The new hash requirement isn't spelled out in the DSNP spec. Is it necessary? If updates with the same hash are allowed, I think it would be reasonable for applications to conclude that content had been moved and not edited, which is important for transparency in the UX. This is a reasonable solution but doesn't scale particularly well. I also think there's a substantive difference between a user proactively vs. reactively replicating their content (I would like to solve for the proactive case). If moving content requires an Update Announcement, there are scenarios that leave a lot of question marks. For example, if someone dies and their control key is not recovered by their estate, and at some later point their provider goes out of business, there is no protocol-native means of preserving their content within the DSNP content corpus. I know these are edge cases, but there are a lot of scenarios where user-driven reactive updates are a poor fit for continuity of access to "public square" communication.
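As a rough illustration of the moved-versus-edited point at the top of this comment: if an update were allowed to carry the same hash, an application could tell the two cases apart with a simple comparison. The `ContentRef` type and `classify_update` function are hypothetical names for this sketch, not DSNP structures, and the hash is treated as an opaque string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentRef:
    # Hypothetical pairing of a content location with its announced hash.
    url: str
    content_hash: str


def classify_update(original: ContentRef, updated: ContentRef) -> str:
    """Classify an update by comparing it with the original reference."""
    if updated.content_hash != original.content_hash:
        return "edited"   # the bytes changed, so the UX should say so
    if updated.url != original.url:
        return "moved"    # same bytes, new location
    return "unchanged"
```

A client could then label "moved" items differently from "edited" ones when rendering a thread.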
I like this conceptually, but from a consuming application's point of view, there should be a clear algorithm for (attempting to) access a content item, and this feels very hand-wavy—try the centralized URL, but if that doesn't work, use whatever knowledge and protocols you may or may not have at your disposal to try to locate the content by its hash. This seems likely to lead to inconsistent behavior, where content is visible within one application but not in another (which is allowed from a filtering/terms of service point of view, but that's not really the case here).
I get that IPFS is still a pretty exotic concept to providers in the Web 2.0 world, but is pinning on IPFS really that much different from hosting that same file on a public web server? I'm not suggesting that a provider be required to pin any other content (though of course they can if they want, and many will likely cache broad swathes of content as a matter of course). And as you point out, they're required to speak IPFS for batch files anyway (on Frequency at least). Is there a best of both worlds solution? What if we propose the following algorithm for content retrieval:
This approach requires batch consumers to be able to retrieve content over both protocols. But it enables a user to provide redundancy at any time by pinning the file in IPFS (most likely through a relationship with a storage service that needn't be DSNP-aware, such as a Filecoin-backed solution). This would then take us back to the need to define what constitutes a retrieval failure in HTTP. I would submit that any response other than a redirect (within a reasonable number of non-circular hops) or a successful hash match should be counted as a failure. One can easily imagine a case where a defunct provider's domain is still serving HTTP 200s for every URL but the actual content is now a domain parking page. I also think we should shift the IPFS requirement for batches to the DSNP spec (not just DSNP over Frequency). I agree that provider tombstoning is technically an orthogonal topic, so I'll make a separate discussion issue for that. Postscript: I wrote this in a parallel timeline to Harry's reply (i.e. on an airplane without an internet connection) and it looks like he made some of the same points, so apologies for the redundancy.
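To show how the failure definition above might look to a consuming application, here is a minimal sketch. It assumes HTTP is tried first and IPFS is the fallback, uses a SHA-256 hex digest as a stand-in for whatever hash format the announcement carries, and caps redirects at five hops as one reading of "a reasonable number". `HttpResponse`, `http_get`, and `ipfs_get` are hypothetical stand-ins for real network clients, not part of any spec or library.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

MAX_REDIRECTS = 5  # assumption: what counts as "a reasonable number" of hops
REDIRECT_CODES = (301, 302, 303, 307, 308)


@dataclass
class HttpResponse:
    # Hypothetical minimal response shape for this sketch.
    status: int
    body: bytes = b""
    location: Optional[str] = None  # redirect target, if any


def matches_hash(body: bytes, expected_sha256_hex: str) -> bool:
    return hashlib.sha256(body).hexdigest() == expected_sha256_hex


def fetch_over_http(
    url: str,
    expected_hash: str,
    http_get: Callable[[str], HttpResponse],
    max_redirects: int = MAX_REDIRECTS,
) -> Optional[bytes]:
    """Return the body only on a hash match; anything else is a failure."""
    seen = set()
    for _ in range(max_redirects + 1):  # first request plus allowed hops
        if url in seen:
            return None  # circular redirect
        seen.add(url)
        resp = http_get(url)
        if resp.status in REDIRECT_CODES and resp.location:
            url = resp.location
            continue
        if resp.status == 200 and matches_hash(resp.body, expected_hash):
            return resp.body
        return None  # e.g. a 404, or a 200 serving a parking page
    return None  # too many hops


def retrieve(
    url: str,
    expected_hash: str,
    http_get: Callable[[str], HttpResponse],
    ipfs_get: Callable[[str], Optional[bytes]],
) -> Optional[bytes]:
    """Try HTTP first, then fall back to a hash-keyed IPFS lookup."""
    body = fetch_over_http(url, expected_hash, http_get)
    if body is not None:
        return body
    body = ipfs_get(expected_hash)
    if body is not None and matches_hash(body, expected_hash):
        return body
    return None
```

Because success is defined by the hash match rather than the status code, a parked domain answering 200 for every URL falls through to the IPFS lookup instead of being shown as content.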
Situation: A delegated service has previously announced some content that it now determines it does not want to continue hosting. What happens?
Previous related discussions: #80 (more a user request than a service requirement)
Off-chain announced content has three responsible parties (these could be the same entity or split out, such as a host paid for by someone else):
HTTP Status codes work in some situations:
Difficult situation: What about just terms of service violations?
General Approaches
Some form of DSNP level announcement
Why?
Nothing but a normal 404
Summary from 2022-04-28: