Fraud prevention trusted server #3

Open
p-j-l opened this issue Feb 10, 2022 · 7 comments

@p-j-l

p-j-l commented Feb 10, 2022

We propose developing a high-level document to capture use-cases and requirements for trustworthy anti-fraud servers. This is a call for collaboration among interested members of the community group.

Fraud detection and enforcement is one common use case that relies on third-party cookies and sensitive user data. There are more details about the need for this functionality in advertising use cases here. As browsers proceed to remove support for third-party cookies, this is an important use case that needs to continue to be supported.

One avenue to support this use case is the use of trustworthy servers to process relevant data. There are existing technical proposals that allow sensitive user data to be safely sent from the browser to a server, provided that there are guarantees for what the server does with the data (this is what makes it trustworthy). An example of this is the Aggregate Reporting Service that uses cross-site user data. Note that non-technical guarantees, such as auditing, are out of scope for this proposal at the moment.

We would like to use this discussion to propose developing a similar scheme, in which the browser sends the minimum set of signals needed to run ad fraud detection algorithms to one or more servers that determine whether the traffic is fraudulent. One assumption we’re making here is that the algorithms are compute-heavy and therefore too costly to run in the browser itself (e.g. machine learning model evaluation). We’d like to validate that first and determine whether it introduces a DoS risk. Another assumption is that attackers control the browser, so moving processing out of that environment may be beneficial.

There are many open questions in this area that we’d like to explore:

  1. What signals are necessary to achieve different levels of detection quality?
  2. What do ad fraud detection algorithms require to run beyond their input data? For example:
    • Are the algorithms proprietary and/or open source? Can these algorithms be publicly shared without compromising them or exposing their developers to reverse engineering risks?
    • How complex are the calculations that they perform?
  3. What sorts of technical protections best match the requirements above?
    • Candidate technologies in this area include Secure Multi-Party Computation, Fully Homomorphic Encryption, and Trusted Execution Environments.
  4. Approximate ad fraud decisions can be made with a minimal set of signals, but how do we fine-tune what signals go into that set? Put another way: how do we experiment with new signals?
    • In practice, novel attacks can be entirely undetected if they are not covered by current signals and methods. Can system operators examine raw signals in some constrained setting in order to understand attacks?
  5. Who would own and operate the servers?
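
As a rough illustration of the Secure Multi-Party Computation option mentioned in question 3, the following is a minimal sketch of additive secret sharing, one building block such schemes commonly use. It is not part of this proposal: the function names, the modulus, and the example signal (a per-browser click count) are all hypothetical and chosen only for illustration. The idea is that the browser splits a numeric signal into random-looking shares, sends one share to each server, and no single server learns the signal, yet the servers can still add shares together to compute an aggregate.

```python
import secrets

# Hypothetical modulus for the arithmetic; any sufficiently large value works for this sketch.
MODULUS = 2**61 - 1


def split_signal(value, n_servers=2):
    """Split an integer signal into n_servers additive shares (assumed helper)."""
    # All but the last share are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    # The last share is chosen so that all shares sum to the value modulo MODULUS.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def add_shares(shares_a, shares_b):
    """Each server adds the shares it holds locally, without seeing the raw values."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]


def reconstruct(shares):
    """Combine one share from every server to recover the (aggregate) value."""
    return sum(shares) % MODULUS


# Two browsers report hypothetical click counts of 3 and 4.
browser_1 = split_signal(3)
browser_2 = split_signal(4)

# Server i only ever sees browser_1[i] and browser_2[i].
aggregated = add_shares(browser_1, browser_2)
total_clicks = reconstruct(aggregated)
```

In this sketch only sums are supported. Evaluating a full fraud model this way would need considerably more machinery, which is part of what question 2 about algorithm complexity is trying to establish.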

We’d like to start an effort to explore this approach, starting with requirements gathering, in the Anti-Fraud Community Group, and would welcome collaboration.

Related work:

@darobin

darobin commented Feb 15, 2022

I fully support documenting this, but I would like to ask that two things be treated separately: what the server needs to do, and how it can be trusted.

I ask because there are several proposals that rely in one way or another on a trusted server, and I think that we would benefit from trying to pool the "how it can be trusted" part, ideally alongside looking at what requirements might be managed in common. Otherwise we're going to end up with a whole menagerie of servers with different properties.

I have a proposal (in need of an update, coming) that was designed for PARAKEET-like things, and the PATCG has been chatting about it in support of other proposals like IPA (possibly as a hybrid with MPC). I think we could all benefit from alignment.

@p-j-l
Author

p-j-l commented Feb 15, 2022 via email

@darobin

darobin commented Feb 15, 2022 via email

@chris-wood

@pjl-google before putting together any sort of requirements for a trusted server, I think we should get clarity on the problem here, and in particular what types of signals might be useful in addressing that problem. What do you think?

@p-j-l
Author

p-j-l commented May 5, 2022

@chris-wood sounds good. I think the signals needed will be a key part of the requirements. I also haven't really had a chance to get into this work yet, and I'm looking to come back to it soon.

@dvorak42
Member

dvorak42 commented Oct 25, 2022

We'll be briefly (3-5 minutes) going through open proposals at the Anti-Fraud CG meeting this week. If you have a short 2-4 sentence summary/slide you'd like the chairs to use when representing the proposal, please attach it to this issue; otherwise the chairs will give a brief overview based on the initial post.

@dvorak42
Member

From the brief discussion of this proposal, there was some interest in trying to nail down specific signals/capabilities these servers could have that would be useful in the anti-fraud space. There was also interest in seeing how this tech was used in other APIs in the W3C and what anti-fraud problems arise out of that.

It would be good to nail down specific instances of these sorts of signals and to have a side meeting/discussion on them, then bring the results to another CG meeting.

There was also some interest in potential uses for this in the device score space (#16).
