-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: propose DeputyGuardianModule improvements #167
base: main
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
# Deputy Guardian Module Improvements | ||
|
||
## Context | ||
|
||
The Foundation Safe is given the ability to act as the Security Council Safe for a limited number | ||
of safety-net actions on OP Mainnet. For instance, the Foundation Safe can act as the Security | ||
Council Safe to trigger the Superchain-wide pause function. These powers are given to the | ||
Foundation Safe through the `DeputyGuardianModule` installed in the Security Council Safe. | ||
|
||
## Problem Statement | ||
|
||
The existing `DeputyGuardianModule` has a number of problems that we'd like to be able to solve in | ||
a simple manner that makes as few contract changes as possible. We describe each of these problems | ||
individually for clarity. | ||
|
||
### Pre-Signed Pause Transactions | ||
|
||
The Foundation Safe pre-signs certain transactions to reduce response time in case of an emergency. | ||
Because pre-signed Foundation Safe transactions are dependent on the nonces of that Safe, these | ||
pre-signed transactions become invalidated any time the Foundation Safe is required to sign | ||
literally any other transaction. In an attempt to reduce the overhead from this state of affairs, | ||
a *second* Foundation Safe sometimes referred to as the "Foundation Operations Safe" with the exact | ||
same configuration as the original Foundation Safe was made to be the Deputy Guardian. | ||
|
||
Even with this second Foundation Safe, pre-signed pauses are invalidated on a regular basis | ||
whenever an upgrade touches the `DeputyGuardianModule`. Gripes with the pre-signed pause system | ||
could fill a whole role of toilet paper and are not just limited to the issues noted above. | ||
|
||
### Unclear Responsibilities | ||
|
||
The `DeputyGuardianModule` has an unclear security model and underlying purpose. The original | ||
intent behind the `DeputyGuardianModule` was to provide a fast way to respond to potential | ||
incidents in a manner that can impact liveness but not system safety. A lack of strict clarity | ||
behind this design has meant that the `DeputyGuardianModule` has become a source of security | ||
implications that are not often easy to reason about. | ||
|
||
For instance, the `DeputyGuardianModule` currently allow the Deputy Guardian account to change the | ||
dispute game type that is respected within the `OptimismPortal` contract from the | ||
`FaultDisputeGame` to the `PermissionedDisputeGame`. The `PermissionedDisputeGame` has a security | ||
model that differs from the `FaultDisputeGame` which essentially means that the Deputy Guardian is | ||
given the ability to change the security model of the system. This is a departure from the original | ||
design of the `DeputyGuardianModule` which should only impact liveness and should not modify the | ||
security model of the system. | ||
|
||
## Proposed Solution | ||
|
||
We propose resolving the above problems by modifying the existing `DeputyGuardianModule` to be more | ||
restrictive in the actions it can carry out while also being more permissive in who can carry out | ||
these actions. | ||
|
||
The `DeputyGuardianModule` would be permitted to: | ||
|
||
1. Trigger the Superchain-wide pause. | ||
1. Blacklist an individual dispute game. | ||
|
||
The `DeputyGuardianModule` would **no longer** be permitted to: | ||
|
||
1. Undo the Superchain-wide pause ("unpause"). | ||
1. Change the respected game type. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This creates a serious gap in our ability to respond to issues with the fault dispute system. We would need to escalate any issue affecting a significant number of games to a superchain wide pause which we had previously explicitly designed to avoid. It's important to note that changing the respected game type can only switch to a game implementation previously approved by governance and deployed by the security council. I can see the argument that it changes the security model but I think there's an equally valid view that it doesn't because the permissioned game is always a part of the system and thus part of its security model. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does your opinion here change if we add the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not really. I guess we could call that regularly until the security council changes the game type but it doesn't seem like a good incident response process. The case I'm thinking about is where there's a bug in the fault dispute game that someone is using to continuously cause invalid output roots to resolve as valid. I guess given timelines we could call There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I could be convinced of allowing the deputy guardian to switch the respected game type, need to think about it a bit more. My primary concern is that if we're making the deputy guardian more permissive then giving it the ability to change the respected game type becomes a bit more dangerous. A malicious/leaked deputy key would be able to swap the respected game type back to the permissionless game which would probably force you to trigger the pause to avoid an endless back-and-forth. That said I think the likelihood of this happening is relatively low. My short-term proposal would then be that we give the deputy guardian access to everything except for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was going to come up with a broader proposal but I think the reality is that with Stage 1.4 + interop it's possible that the landscape around security responses looks very different in 6 months, so I'd be happy with a simplified/permissive deputy guardian for now, even if it's not ideal. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could also potentially allow the deputy guardian module to switch the respected game type to permissioned but not allow it to switch back to permissionless. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's actually a pretty good compromise. When kona is deployed we would typically want to switch to it instead of permissioned if we can so need to think through if that should be allowed by the deputy guardian or if we're ok with requiring the security council for that. e.g. we could allow the deputy guardian to change the respected game arbitrarily unless it is already the permissioned game. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Alright going to update this proposal with what we've talked about so far. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Isn't that exactly the issue that we're trying to fix? ie. allowing the guardian to change the state transition is a safety, not liveness, consideration. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although I agree with this in principle, I'm trying to find a middle ground that improves on the status quo right now without needing to make a bunch of contract changes |
||
1. Set the anchor state in the `AnchorStateRegistry`. | ||
smartcontracts marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
The `DeputyGuardianModule` would be accessible to: | ||
|
||
1. Any member of the Security Council Safe. | ||
1. An EOA operated by the Optimism Foundation. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we make this scalable to |
||
|
||
Please note that this proposal supercedes | ||
[Design Doc 162](https://github.com/ethereum-optimism/design-docs/pull/162) that introduced the | ||
concept of the `DeputyPauseModule`. | ||
|
||
### Reasoning | ||
|
||
The above changes make the role of the `DeputyGuardianModule` significantly more obvious. The | ||
`DeputyGuardianModule` is now designed to be a fast response mechanism that can ONLY impact | ||
liveness and cannot impact safety. | ||
|
||
1. We remove the ability to unpause the system because this makes it possible for the | ||
`DeputyGuardianModule` to diminish safety by unpausing the system when it was explicitly placed | ||
into a paused state for security reasons. | ||
1. We remove the ability to change the respected game tybe because this allows the | ||
`DeputyGuardianModule` to modify the security model of the system and does not align with the | ||
goal of limiting the module to liveness-impacting actions. | ||
1. We remove the ability to set the anchor state within the `DisputeGameFactory` because this can | ||
potentially impact safety if the provided anchor state game is sufficiently old. | ||
|
||
## Alternatives Considered | ||
|
||
### Deputy Pause Module | ||
|
||
This proposal supercedes a previous proposal that would have installed a `DeputyPauseModule` into | ||
the Foundation Safe. We would be able to get the benefits of this proposal while also simplifying | ||
the role of the Deputy Guardian if we accepted this proposal instead. | ||
|
||
## Risks & Uncertainties | ||
|
||
### Audits | ||
|
||
Any module will need to be carefully audited. The `DeputyGuardianModule` was previously audited but | ||
would need to be re-audited as a result of these changes. | ||
|
||
### Leaked Deputy | ||
|
||
Our worst-case scenario is a leaked deputy private key. Since the `DeputyGuardianModule` would now | ||
permit any member of the Security Council Safe or a dedicated Foundation EOA to trigger | ||
liveness-impacting actions, such a leak would likely cause a temporary liveness failure. A majority | ||
of the Security Council Safe would need to coordinate a transaction to remove or replace the | ||
offending account and resolve the liveness failure. | ||
|
||
### Compromised Deputy | ||
|
||
A motivated attacker with access to a deputy's private key could try to drain the deputy's wallet | ||
constantly to prevent it from being used to trigger transactions. Although this could be | ||
side-stepped with private RPC/bundling tools, it would not be considered good practice to include | ||
live third-party infrastructure in the hot path for critical security actions. It therefore seems | ||
prudent that the deputy be able to act via signature instead of directly via `msg.sender` check. | ||
|
||
If we allow the deputy to act via signature then we must also make sure that the signature includes | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How different is this from just using 1/1 SAFEs instead of raw EOA addresses? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If 1/1 safe is compromised then owner can be changed and you lose access to pause. Using raw EOA makes this impossible. |
||
some sort of nonce to prevent the same signature from being used more than once. | ||
|
||
### Process Updates | ||
|
||
Various processes and runbooks will need to be updated to reflect this new state of affairs. We'll | ||
be happy, but it will take some effort. All of these updates will need to be made before we | ||
actually go live with these changes and we should run drills on testnet and propose that drills be | ||
run on mainnet. |
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about changing what the DGM can do, but not also changing what that Guardian role is capable of doing. IMO it is much cleaner to maintain the property that "Anything the Guardian can do, the DGM can do", so I'd prefer to see us move any safety impacting actions (ie. unpausing, fallback activation) to the Upgrade Controller (2of2 Safe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would violate the condition that L2Beat wants that the Security Council alone can unpause the system. I fully agree with you but that's the constraint I was operating under.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, my goal with this proposal is to solve a lot of problems in the short-term and then come back to this problem in ~6 months when the landscape has changed a bit with Stage 1.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, what I'm advocating or is to also update the auth in the system.
It should not be too much of a lift once the Isthmus contract changes are in, as they introduce a new
upgrader
role that is stored in theSuperchainConfig
, so all you'd need to do is update the auth check inunpause()
andsetRespectedGameType()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree with this in principle, my goal with this proposal is to be able to ship something into prod ASAP and then work on these more in-depth changes later