Validator Re-Enabling #5724
I had a quick pass by focusing mainly on the approach. It looks good, nice work @Overkillus!
I've left some thoughts about a corner case with the re-enabling.
Is it possible to move most of the disabling logic to It seems staking does not really need to know anything about disabling. Currently, we are maintaining two copies of disabled validator indices in both There is a larger reason as to why I am flagging this. With staking moving to AH, the offence lifecycle would be something like this:
In practice, tl;dr: Given that Session and Staking communication would become async, this disabling logic doesn’t seem compatible or, at the very least, "good design" with that in mind.
Co-authored-by: Ankan <[email protected]>
(assuming we want to do this) I will discuss if this is a good idea later below but even if we assume we want to do this I don't believe it is appropriate to do this in this PR. This PR only aims to, with minimal changes and refactors, simply allow for validator re-enabling using all the previous already in-place logic. I want to make sure that this PR is minimal and adheres to what was already pre-agreed in the design document for disabling, to limit audit time and just be more sure that everything does exactly what we think it does. If we have plans to refactor the staking pallet (which as you mentioned will be done anyway during the port to AH) then it should be done in a separate primarily refactor/port PR.
(answering if we want to do this) The staking pallet as it stands is responsible for more than just slashing. It holds important params with regard to the active validator set, for instance the min, ideal, and max validator counts. What it means to be disabled is not within the purview of the staking pallet; it simply keeps track of the highest offenders up to some limit and makes this info available to others. This is highly customisable, and different users of this pallet can add new strategies with new limits.

Since we keep those validators for an era, it makes sense to keep track of their offences (potential disabling status) for a full era as well. Storing this information in session when we aim to keep it for an era seems like an unintuitive approach. I might agree that we could make it less opinionated by moving disabling strategies to session and instead keeping a history of all offences (unfiltered by disabling strategies) in staking. Nevertheless, in this approach we should still keep track of all offences in staking. An offence is more than a slash; there might be other byproducts or consequences and we need to allow for them. In the new pipeline for offences that you suggested, where do you think a history of offences within an era should be stored?

Session - Staking coupling

It is okay to keep staking bounded to session, but session should not be dependent on staking. Adding logic to session that is actually era-wide in scope through hacky reads of signals from staking seems, well, hacky. This information should be kept in a context that is explicitly era-wide and shared with session.

TLDR: This might be a needed change but is outside of the scope of this PR, and some of that information should still live in era-wide and not session-wide scopes.
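To make the "keeps track of the highest offenders up to some limit" idea concrete, here is a minimal Rust sketch of a capped disabled set with re-enabling. It is not the pallet's code: `DisabledSet`, `Decision`, and `report` are hypothetical names, severity is modelled as a plain integer, and the swap rule (re-enable the least severe disabled validator when a more severe offender arrives at the limit) is an assumption made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical outcome of reporting one offence.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// The offender is disabled; nobody is re-enabled.
    Disable(u32),
    /// The limit was reached: the least severe disabled validator is
    /// re-enabled to make room for a more severe offender.
    Swap { disable: u32, reenable: u32 },
    /// The disabled set does not change.
    NoOp,
}

/// Hypothetical era-wide record of disabled validators, capped at `limit`.
pub struct DisabledSet {
    limit: usize,
    /// Validator index -> highest offence severity seen this era (assumed unit).
    disabled: BTreeMap<u32, u32>,
}

impl DisabledSet {
    pub fn new(limit: usize) -> Self {
        Self { limit, disabled: BTreeMap::new() }
    }

    pub fn is_disabled(&self, validator: u32) -> bool {
        self.disabled.contains_key(&validator)
    }

    /// Forget everything, e.g. at an era boundary.
    pub fn clear(&mut self) {
        self.disabled.clear();
    }

    pub fn report(&mut self, offender: u32, severity: u32) -> Decision {
        // Already disabled: only remember the worst offence.
        if let Some(current) = self.disabled.get_mut(&offender) {
            if severity > *current {
                *current = severity;
            }
            return Decision::NoOp;
        }

        // Room left under the limit: disable directly.
        if self.disabled.len() < self.limit {
            self.disabled.insert(offender, severity);
            return Decision::Disable(offender);
        }

        // Limit reached: find the least severe disabled validator
        // (ties resolve to the lowest validator index).
        let weakest = self
            .disabled
            .iter()
            .min_by_key(|(_, s)| **s)
            .map(|(i, s)| (*i, *s));

        match weakest {
            Some((index, weakest_severity)) if severity > weakest_severity => {
                self.disabled.remove(&index);
                self.disabled.insert(offender, severity);
                Decision::Swap { disable: offender, reenable: index }
            }
            _ => Decision::NoOp,
        }
    }
}
```

With a limit of 2, reporting validators 1 and 2 disables both; a later, more severe offence by validator 4 would re-enable whichever of the two has the lower severity. Keeping the map era-wide and clearing it at the era boundary mirrors the argument above that this state belongs in an era-scoped context.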
I wouldn't try to block you if you want to go ahead with this PR, especially since most of the things I flagged should ideally have been flagged in the earlier PRs, and this PR does not introduce a new design decision. That said:
Since this PR touches logic that we know we will have to migrate in the next couple of months, it is reasonable enough to make those changes in this PR.
I believe you only require the current count of the active validator set. Could you point me to any param that you need specifically from
Why does an offence or disabling have to relate to an era?
I think we have already tried to hack around with the concept of IMO, this is a flawed design, in addition to the fact that it also doesn’t fit well with the post-AH migration setup.
That is what I would suggest. This impl is simply a part of an earlier design which was pre-approved by SRLabs in the design's audit. It just adds some new functionality using the same infrastructure without altering it too much.
We have the power to open multiple PRs to keep unrelated changes separate. Enabling new functionality in the current design should not come with a major refactor if it is not necessary. I am open to the refactor at a later stage, but it should nevertheless be a separate PR. This PR does not make the future refactor harder or easier. I would understand withholding the change if it made the refactor harder, but it does not; it is orthogonal. On top of all of that, this PR is a security-related fix. It has a higher priority than a general refactor and should not be bundled with it without a good reason, as that might only obfuscate the auditing process. The rest of the discussion dives into why the refactor might be a good idea later on, which we should move into a separate issue or ticket. I'd be happy to participate in those discussions and help as much as I can.
I agree with @Overkillus here. Considering that this PR is part of the disabling strategy rollout, I am strongly against doing any out-of-scope refactorings. What @Ank4n suggests definitely makes sense and we should do it, but as a separate effort. Let's focus on the disabling strategy in this PR.
Given that this PR doesn't make it harder to do the migration in the future, your suggestion is best handled as a separate task/PR. In general I think it is not a good idea to require unrelated refactorings in the scope of a change that is concerned with security. @Ank4n please take another look, and if there are no other causes for concern I would say we should merge. We want to get this change into production as soon as possible.
@tdimitrov @sandreim The refactoring needed is not unrelated. But blocking also serves no purpose, and we can handle this in a follow-up issue. @Overkillus Could you check if it's possible to get rid of the storage item @gpestana You may want to take a quick look at the changes as well.
Added all the defensive suggestions from @Ank4n Good eye for spotting them, thanks!
This is not trivial. 99% of the disabling logic lives in staking, so moving it over is not easy and this is a major part of the needed refactor.
Aims to implement Stage 3 of Validator Disabling as outlined here: #4359
Features:
Testing & Security:
Closes #4745
Closes #2418