Create xip-35-message-sender-signature.md #35

@nmalzieu nmalzieu commented Dec 6, 2023

Here is a first draft following the discussion in the XMTP-Converse Slack (private link: https://converseapp.slack.com/archives/C04FKRLV3EC/p1701789725715419).

@nmalzieu nmalzieu requested a review from jhaaaa as a code owner December 6, 2023 10:46

bwcDvorak commented Dec 11, 2023

Summary for those who do not have access to the original Slack thread:

iOS XMTP client apps face challenges in obtaining App Store approval to filter a user's push messages. To address this, the protocol should provide a mechanism to prevent push notifications for a user's own messages within a specific topic.

The current proposal is for the protocol to provide a way for the push server to know the sender ID (likely anonymized, should be synchronized across clients) and for each client app to only subscribe to messages in a topic that are not from the user's own sender ID.

Initial implementation thoughts:
• A client for a given user would always generate the same unadvertised key for a given topic.
• The private key would be used to generate a signature of the message payload, and this signature would be attached publicly next to the envelope.
• The client would then subscribe to a topic and provide the associated public key to the notification server.
• When the server sees an envelope, it uses the public key to verify the signature: if it matches, it doesn't send the notification for that envelope.

## Specification

The SDKs would generate a private/public key pair for each topic.
The private/public key pair should always be the same for a given user and a given topic - not dependent on installations / SDK language.
Contributor

My suggestion would be that each user would have a signing key that would be synced between their devices for v1/v2 and be per-installation and not synced for group chats/v3. We would generate a per-topic key by doing HKDF(root_key, salt=$topic).
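A minimal sketch of the `HKDF(root_key, salt=$topic)` derivation, assuming HKDF-SHA256 as defined in RFC 5869, a 32-byte output, and empty `info`. The helper names, the placeholder root key, and the topic string are hypothetical and not part of any XMTP SDK.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes = b"", length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256, built only on the standard library."""
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt or b"\x00" * 32, key_material, hashlib.sha256).digest()
    # Expand: stretch the pseudorandom key to the requested length.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_topic_key(root_key: bytes, topic: str) -> bytes:
    # The topic string is used as the salt, as in HKDF(root_key, salt=$topic).
    return hkdf_sha256(root_key, salt=topic.encode("utf-8"))


# Two installations holding the same synced root key derive the same topic key.
root_key = bytes(range(32))  # placeholder value, not a real key
key_on_app_a = derive_topic_key(root_key, "example-topic-1")
key_on_app_b = derive_topic_key(root_key, "example-topic-1")
```

Because the derivation is deterministic, nothing per-topic has to be synced between installations; only the root key does.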

Author

Not sure I get the difference between an "installation" and a "device"?
Goal being that I generate the same per-topic key on Coinbase Wallet & Converse

Contributor

> Not sure I get the difference between an "installation" and a "device"?

"Installation" is a little more specific than "device", because in the XMTP case there can be multiple different apps (Coinbase Wallet, Converse) installed on the same device. We sometimes forget and use the word "device" interchangeably, because a lot of the literature uses that term.

> Goal being that I generate the same per-topic key on Coinbase Wallet & Converse

@neekolas's suggestion would work for this - the signing key is synced to all installations, and all installations perform the same HKDF to get the same per-topic key.

> We would generate a per-topic key by doing HKDF(root_key, salt=$topic).

I'm not sure I understand this proposal for v3. The per-topic key needs to be the same among all installations, so what is "root_key" referring to here?

Before publishing an envelope on a topic, the SDK would use the topic private key to sign the message payload and attach it to the envelope.
Contributor

It's important to note that the public key of the user's signing key would be private between the user's devices and the notification server. Third parties would not be able to validate that the signatures originate from the actual sender and were not forged.

Contributor

@richardhuaaa richardhuaaa left a comment

This makes a lot of sense! One thing to note here, is that ECDSA signatures are recoverable (meaning, you can derive the public key from the signature). This might allow all external observers to link together messages coming from the same account within a conversation.

I'm wondering if we should either use a non-recoverable signature scheme, or use HMAC. The issue with HMAC is that the key is symmetric, and given the key, the server can forge the authentication code on other messages. Definitely out of my element here, would love @Bren2010's input

Another thing to consider here is that by giving the public key to the server we are essentially telling the server which messages out of a given conversation are our own, for perpetuity - and that multiple participants in the conversation may be volunteering this information to the same server over time. At some point we may want to add the ability for this feature to be disabled per-conversation, as well as automatic topic key rotation, but happy to leave this out for now. FYI @tsachiherman

@neekolas
Contributor

@richardhuaaa I like the idea of a non-recoverable signature scheme. The biggest drawback of this whole setup is the additional metadata it leaks. Before, an observer would only know that N messages were sent on a given topic, but would have no idea who the sender was. With this change, an observer would be able to tell how many messages each participant has sent.

@richardhuaaa
Contributor

I'm wondering if we can cheaply build in topic key rotation from the very beginning, to minimize the information the server has - surveillance via push notifications is a real issue. Instead of topic_key = HKDF(root_key, salt=$topic), we could do something like topic_key = HKDF(root_key, salt=$topic . $days_since_epoch). This means the topic key rotates every day. When subscribing to push notifs, clients would provide the topic keys for yesterday, today and tomorrow. It is okay for push notifications from earlier than this to be dropped.
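A rough sketch of the three-key subscription window, assuming HKDF-SHA256 with empty `info`, UTC day boundaries, and a "/" separator between the topic and the day counter (the comment only writes `$topic . $days_since_epoch`, so the exact encoding is an assumption). Function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def derive_rotating_key(root_key: bytes, topic: str, day: int) -> bytes:
    # Salt is "$topic . $days_since_epoch"; the "/" separator is an assumption.
    salt = f"{topic}/{day}".encode("utf-8")
    # HKDF-SHA256 with empty info and a 32-byte output needs one expand block.
    prk = hmac.new(salt, root_key, hashlib.sha256).digest()
    return hmac.new(prk, b"\x01", hashlib.sha256).digest()


def subscription_keys(root_key: bytes, topic: str, now: Optional[float] = None) -> List[bytes]:
    """Return the keys for yesterday, today and tomorrow, in that order."""
    timestamp = time.time() if now is None else now
    today = int(timestamp // SECONDS_PER_DAY)
    return [derive_rotating_key(root_key, topic, day) for day in (today - 1, today, today + 1)]
```

A client that re-subscribes at least once a day always keeps the server's window current; a server that stops receiving updates holds keys that stop matching after tomorrow.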

@tsachiherman

> This makes a lot of sense! One thing to note here, is that ECDSA signatures are recoverable (meaning, you can derive the public key from the signature). This might allow all external observers to link together messages coming from the same account within a conversation.
>
> I'm wondering if we should either use a non-recoverable signature scheme, or use HMAC. The issue with HMAC is that the key is symmetric, and given the key, the server can forge the authentication code on other messages. Definitely out of my element here, would love @Bren2010's input
>
> Another thing to consider here is that by giving the public key to the server we are essentially telling the server which messages out of a given conversation are our own, for perpetuity - and that multiple participants in the conversation may be volunteering this information to the same server over time. At some point we may want to add the ability for this feature to be disabled per-conversation, as well as automatic topic key rotation, but happy to leave this out for now. FYI @tsachiherman

Thanks for looping me in. I need to spend some more CPU cycles on that one.

@neekolas
Contributor

> we could do something like topic_key = HKDF(root_key, salt=$topic . $days_since_epoch).

That only kinda works. You can have multiple client apps sending messages, but each app is only going to update the keys on its own push notification server. To go with Noe's example above, imagine that the user has not opened the Converse app in a week and then sends a message via Coinbase Wallet. The Converse push notification server would not be aware of today's keys.

I like the idea of non-recoverable signatures more. Then we don't have to worry about varying the signer, since the signatures aren't linked to any identity and could only be verified by someone who already knows the public keys.

Clients could still rotate the keys periodically by uploading an updated EncryptedPrivateKeyBundle containing a new root key, forcing any other apps to update their respective notification servers. But even if we didn't have that rotation, the only thing you would be leaking to the notification server is which messages are yours in a channel filled with messages sent by one of two people. As far as metadata leaks go, I'd consider that pretty mild. There is already a level of trust with the notification server since it knows the list of topics each client is interested in. And since the notification server is controlled by the app developer, you are also trusting the app dev to not exfiltrate messages or metadata from the front-end.

@tsachiherman

I think I have an idea. It's not perfect, but it might work for MLS.
When constructing the topic_key, we would use the following:
topic_key = $topic root key + <set of bitmask of the participant, from left to right>
Then, when requesting a topic subscription, we would also specify a mask.
That wouldn't disclose the public key, but it would still allow the needed filtering.

@Bren2010

Bren2010 commented Dec 16, 2023

I'm not sure we need a signature scheme here, I'm more partial to @richardhuaaa's idea of using an HMAC instead. The fact that the key is symmetric doesn't strike me as an issue, since the key would only be shared between the account installations and the push server. If the push server wanted to not send a notification, it could just not do that -- it doesn't need to forge a token first.

There are a couple of problems specifically with using signatures that bother me:

  • It's not reasonable to assume that just because a signature scheme isn't designed to be recoverable (the way ECDSA is), it provides /anonymity/ (the inability to link signatures given many message-signature pairs). Whereas HMAC provides this very directly: seeing many message-HMAC pairs specifically does not reveal anything about the secret key.
  • It's not reasonable to assume that a signature only verifies successfully under one public key.

> Another thing to consider here is that by giving the public key to the server we are essentially telling the server which messages out of a given conversation are our own, for perpetuity

This is leakage, but it's intentional. The push server needs to learn which account sent which message to be able to filter.

> Before, an observer would only know that N messages were sent on a given topic, but would have no idea who the sender was. With this change, an observer would be able to tell how many messages each participant has sent.

With an HMAC over the message payload, the total number of senders / whether the same sender sent different messages would not be leaked to anyone except the push server (which needs to know this information, per above)
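A small sketch of the HMAC variant, assuming HMAC-SHA256 over the raw envelope payload and a server that holds the per-topic keys each subscriber uploaded. The function names and the key-list shape are hypothetical.

```python
import hashlib
import hmac
from typing import List


def sender_tag(topic_hmac_key: bytes, payload: bytes) -> bytes:
    # Tag published next to the envelope by the sending installation.
    return hmac.new(topic_hmac_key, payload, hashlib.sha256).digest()


def should_push(subscriber_keys: List[bytes], payload: bytes, tag: bytes) -> bool:
    """Return False when the tag verifies under one of the subscriber's own keys."""
    for key in subscriber_keys:
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if hmac.compare_digest(expected, tag):
            return False  # the subscriber sent this message
    return True
```

Without a subscriber's key, two tags from the same sender are not linkable, which is the property the signature-based variant could not guarantee.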

@nmalzieu
Author

Hi everyone, @nplasterer made me think that many other types of push notifications need to be "not displayed".
To list the current state at Converse, we actually drop notifications for these use cases:

  • when we consider a message spam
  • when someone adds a reaction on a message that is not from me
  • when we don’t support the content type
  • when it’s an empty message (including read receipts)

While we will not be able to be as precise as this without actually decoding the content of the message, which we definitely don't want to do, I think it would be useful to extend this XIP so that it also leaks the content type to the push notification server, not only the sender.

That way, the push notification server can at least not send read receipt notifications to iPhones which won't be able to drop them.

What do you think? How should we modify the XIP?

@Bren2010

To leak just the content type to the Notification Server, the best solution is probably just to throw in a symmetric encryption of the content type, alongside the HMAC discussed above.

If you're open to a more complicated construction, I think it's possible to handle all of the cases you listed. Instead of using an HMAC ( + symmetric encryption), we have the Notification Server have a public key. This would be a shared public key that everyone uses to encrypt notification-suppression data.

Installations would register with the notification server by telling it 1.) their account id, 2.) a list of supported content types.

An installation would compute the suppression payload as:

prefix = Hash(message)

h1 = Hash(prefix || account id)
if message is a reaction and account id != reacted message's account id:
    h2 = Hash(prefix || reacted message's account id)
else:
    h2 = random()

h1, h2 = randomlySwap(h1, h2)
ct = Message content type
skip = 1 if message is empty or otherwise shouldn't cause a notification, else 0

payload = Encrypt(notification server public key, h1 || h2 || ct || skip)

To determine if a message should have its push notification suppressed, the Notification Server decrypts payload and suppresses the notification if any of these are true:

  • Hash(prefix || subscriber's account id) == h1 or h2
  • ct is not one of subscriber's supported content types
  • skip is 1

Encryption prevents outsiders from being able to inspect the payload and see ct or skip. The Hash(prefix || account id) pattern for the sender's account id and reacted message's account id prevents the Notification Server from recognizing whether several messages are related to the same account, unless that account has registered with it.
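The construction above can be sketched as follows, mirroring the pseudocode as written. Assumptions: SHA-256 stands in for `Hash`, `prefix` is computed over whatever message bytes the server can also see, and the public-key `Encrypt` step is omitted (the Python standard library has none), so the sketch works on the decrypted fields. All names are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Optional, Set


def _hash(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


@dataclass
class SuppressionPayload:
    # Fields that would be encrypted to the notification server's public key.
    h1: bytes
    h2: bytes
    content_type: str
    skip: bool


def build_payload(message: bytes, sender_id: bytes, content_type: str,
                  skip: bool = False,
                  reacted_to_id: Optional[bytes] = None) -> SuppressionPayload:
    prefix = _hash(message)
    h1 = _hash(prefix, sender_id)
    if reacted_to_id is not None and reacted_to_id != sender_id:
        h2 = _hash(prefix, reacted_to_id)
    else:
        h2 = secrets.token_bytes(32)  # random filler when there is no second account
    if secrets.randbits(1):
        h1, h2 = h2, h1  # hide which slot holds the sender
    return SuppressionPayload(h1, h2, content_type, skip)


def should_suppress(payload: SuppressionPayload, message: bytes,
                    subscriber_id: bytes, supported_types: Set[str]) -> bool:
    own = _hash(_hash(message), subscriber_id)
    return (own in (payload.h1, payload.h2)
            or payload.content_type not in supported_types
            or payload.skip)
```

Since `prefix` changes with every message, the server can only test hashes against account ids that subscribers have registered with it.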

@nmalzieu
Author

nmalzieu commented Jan 2, 2024

@Bren2010 Thanks for this detailed solution!
The push server would be able to get back ct and skip from Encrypt(notification server public key, h1 || h2 || ct || skip) right?
What would be prefix?

Maybe this is a better solution as it would probably enable us to iterate in the future and add criteria

I still don't think it lets us handle spam, as we compute a spam score based on the content of the message (and we can't rely on the sender to compute this, because a scammer will just say it's not spam), but it handles all the other use cases for sure!

@neekolas
Contributor

neekolas commented Jan 2, 2024

I just want to zoom out on the issue a bit before we dig into specific solutions for the benefit of anyone else joining the discussion.

Notification servers live outside of the core XMTP protocol. Client apps communicate with their own notification servers through private channels, and can give their own notification servers whatever information is required to filter messages (although hopefully not enough information to read message contents).

For something like isSender, this works quite well because the additional metadata is only required on my own messages. I can generate new keys locally and share them with my trusted notification servers to verify my own messages.

With some of these additional use-cases for filtering (unsupported content types, reactions to other people's messages), we need to be able to filter out messages sent by other people. The solution to that likely involves creating a new public/private keypair used only for metadata encryption where it is safe for the notification server to have access to the private key. Somehow users will need to advertise which notification server public keys someone should use when sending messages to them. The public keys could be generated per user (we add a new public/private keypair to the contact bundle and users can choose to share the metadata private key with any push notification server they trust) or per application (Converse's push server has a single public/private keypair for all users).

To implement @Bren2010's suggestion, or any of the alternatives possible in this broad framing of the problem, we'll need some new infrastructure to communicate the correct public key someone should use when sending messages to me. We'll also need to think through cases where an account is connected to multiple client applications with different push notification servers.

@nmalzieu
Author

nmalzieu commented Jan 3, 2024

Thanks @neekolas
Indeed, leaking information to the Converse server about messages that are not sent by me is more tricky.

We do need a solution relatively fast as Converse will be losing its ability to drop notifications soon, so I don't think going the full "new infra" route works for us, even if we can still keep discussing it for the future!

Let’s summarize the issues:

  • isSender: we have a good privacy-preserving solution for that by adding the signature initially proposed in this XIP, and it fixes the issue of messages sent by me. Let's go!
  • scam / spam: even if we can't drop the notification, we can detect that it's spam and show a "New conversation request" or even a "New spam detected" notification (because we still have the ability to show whatever notification we want, we just can't drop them)
  • reaction added to a message that is not from me: we can totally accept having notifications for that kind of thing
  • unsupported content type: we could show a notification saying "you have a new message" (or show the fallback)
  • empty message notification, mostly for read receipts: these can absolutely destroy our users' experience

So let's really focus on how we can ship something with isSender + empty notifications for read receipts?

How would you feel about making the content type public in the envelope? This would be a great way for notification servers to just stop sending notifications for content types that are not supported by the client and for content types that we know should not trigger a notification (e.g. read receipts). It is not particularly sensitive information. Is that something that XMTP would consider?

If it's really a no-go, we could also probably just add a public boolean flag shouldNotify to the envelope.
When sending a message, the client could decide on the value of this flag, but there would be a default value according to the content type; for read receipts in particular, the flag would default to false.
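A minimal sketch of the `shouldNotify` idea, assuming the flag travels unencrypted next to the envelope and that read receipts default to not notifying. The envelope shape, the content type names, and the default table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical content type names; unknown types fall back to notifying.
DEFAULT_SHOULD_NOTIFY = {"text": True, "reaction": True, "readReceipt": False}


@dataclass
class Envelope:
    topic: str
    payload: bytes       # encrypted message, opaque to the push server
    should_notify: bool  # public flag the push server is allowed to read


def make_envelope(topic: str, payload: bytes, content_type: str,
                  should_notify: Optional[bool] = None) -> Envelope:
    # The sender may override the flag; otherwise the content type decides.
    if should_notify is None:
        should_notify = DEFAULT_SHOULD_NOTIFY.get(content_type, True)
    return Envelope(topic, payload, should_notify)


def server_should_push(envelope: Envelope) -> bool:
    # The push server never inspects the payload, only the public flag.
    return envelope.should_notify
```

The server learns one bit per envelope and nothing about the content type itself.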

Thanks a lot for helping us make the XMTP mobile experience good on Apple devices 🙏

@richardhuaaa
Contributor

I am in favor of shouldNotify - this leaks the minimum amount of data, compared to exposing the contentType (regardless of whether it's exposed only to the push server or is public).

Regarding the HMAC for identifying messages sent by yourself, consider two scenarios:

  1. I log in briefly to an unknown app, 'DodgyApp'. I realize later that I don't like it and uninstall it, possibly even revoking my keys. But the DodgyApp company has had all of my HMAC keys for existing conversations uploaded to its push server, and can surveil my activity in these conversations in perpetuity.
  2. A government issues requests to multiple app developers to collect HMAC keys from their push servers and correlate them with the user-identifiable information that these apps collect. It is then able to unmask the participants, and their activity, in all existing conversations, in perpetuity. Alternatively, a single popular app has its backend compromised by a bad actor, who then collects the keys for a large percentage of the network.

I'd like to suggest that we derive the key like this:

hmac_key = HKDF(root_key, salt=$topic . $months_since_epoch)

It's true that after not logging into an app for a month, it will no longer be able to filter push notifications effectively. However, I would argue that this is a feature, not a bug. Note that any other solution for rotating keys (e.g. uploading a new PrivateKeyBundle) would have the same problem. Additionally, provided that the app is still installed on the user's phone, the app developer should be able to send a push notification, triggering the Notification Service Extension to update the keys on the push server.
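One possible reading of `$months_since_epoch`, sketched under the assumption that it counts whole calendar months since January 1970 in UTC and is joined to the topic with a "/" separator. HKDF-SHA256 with empty `info` is also an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def months_since_epoch(moment: datetime) -> int:
    # Whole calendar months elapsed since January 1970, evaluated in UTC.
    utc = moment.astimezone(timezone.utc)
    return (utc.year - 1970) * 12 + (utc.month - 1)


def monthly_hmac_key(root_key: bytes, topic: str, moment: datetime) -> bytes:
    # Salt is "$topic . $months_since_epoch"; the "/" separator is an assumption.
    salt = f"{topic}/{months_since_epoch(moment)}".encode("utf-8")
    # HKDF-SHA256 with empty info and a 32-byte output needs one expand block.
    prk = hmac.new(salt, root_key, hashlib.sha256).digest()
    return hmac.new(prk, b"\x01", hashlib.sha256).digest()
```

A push server that last received keys in one month holds a key that no longer matches tags produced in the next, which is the intended expiry.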

@richardhuaaa
Contributor

richardhuaaa commented Jan 12, 2024

> The fact that the key is symmetric doesn't strike me as an issue, since the key would only be shared between the account installations and the push server. If the push server wanted to not send a notification, it could just not do that -- it doesn't need to forge a token first.

@Bren2010 I wonder if this is an issue given there will be multiple push servers operated by different apps. So a malicious push server could use their knowledge of the key to influence the pushability of payloads on other push servers. Arguably the effects of this are mitigated by the fact that most users only use a single app, and having automatic key rotation as above can limit the scope. We could potentially sign and then HMAC the result, but not sure if this is overkill.
