-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle errors as no-op for execution requests #14826
Conversation
@@ -84,14 +85,20 @@ func ProcessOperations( | |||
} | |||
st, err = ProcessDepositRequests(ctx, st, requests.Deposits) | |||
if err != nil { | |||
return nil, errors.Wrap(err, "could not process deposit requests") | |||
if errors.Is(err, errNilExecutionRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no logs required here? because they are processed inside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's logged inside ProcessDepositRequests
} | ||
amount := wr.Amount | ||
isFullExitRequest := amount == params.BeaconConfig().FullExitRequestAmount | ||
// If partial withdrawal queue is full, only full exits are processed | ||
if n, err := st.NumPendingPartialWithdrawals(); err != nil { | ||
return nil, err | ||
log.WithError(err).Error("Could not get number of pending partial withdrawals") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess this and the logs below would potentially print many times for what isn't really affected by the loop above. wouldn't all of these continually fail? shouldn't this just print once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably have a way to wrap and return this error so it only logs once, unless we want to print multiple times in which case we should add fields for the unique identifier of the request that caused it.
@@ -113,7 +113,8 @@ func ProcessWithdrawalRequests(ctx context.Context, st state.BeaconState, wrs [] | |||
} | |||
validator, err := st.ValidatorAtIndexReadOnly(vIdx) | |||
if err != nil { | |||
return nil, err | |||
log.WithError(err).Error("Could not get validator at index") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it exists in validator index by pubkey can it actually fail here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is needed we can provide some of the request information along with the log
3ecd5c0
to
a354632
Compare
@@ -209,7 +209,7 @@ func ProcessConsolidationRequests(ctx context.Context, st state.BeaconState, req | |||
} | |||
|
|||
if npc, err := st.NumPendingConsolidations(); err != nil { | |||
log.WithError(err).Error("failed to fetch number of pending consolidations") | |||
log.WithError(err).Errorf("failed to fetch number of pending consolidations at index %d", i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be better to use .WithField("index",i) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Electra, when processing execution requests, any failure—except for malformed requests (i.e., nil)—should be treated as a no-op per consensus specifications. However, there's some ambiguity regarding what constitutes a failure.
I don't believe the specifications are ambiguous here. See the comment I left below. If there's an error which should never happen (eg uint64 overflow), the specifications will throw an exception. It will not just continue.
I argue that it should be treated as a no-op because triggering a state transition failure would result in the block being marked invalid and added to the invalid block cache. This could lead to a fork from other consensus layer implementations.
I think I agree with this, but I'm not entirely sure. I originally brought this up (in the doc I shared with the team) because I noticed inconsistent uses of return
and continue
in a few spots. It seemed that the dominant pattern was to return when the error should never happen. This made sense to me. But marking the block as invalid and adding it to the invalid block cache does feel dangerous. Wouldn't there be a consensus error in either situation (return vs continue)?
withdrawableEpoch, err := exitQueueEpoch.SafeAddEpoch(params.BeaconConfig().MinValidatorWithdrawabilityDelay) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "failed to add withdrawability delay to exit queue epoch") | ||
log.WithError(err).Error("Could not compute withdrawable epoch") | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, this really never should happen. The specification would throw an exception here too.
...
tests/core/pyspec/eth2spec/test/electra/block_processing/test_process_consolidation_request.py:1221: in run_consolidation_processing
spec.process_consolidation_request(state, consolidation)
tests/core/pyspec/eth2spec/electra/minimal.py:5469: in process_consolidation_request
if current_epoch < source_validator.activation_epoch + config.SHARD_COMMITTEE_PERIOD:
venv/lib/python3.13/site-packages/remerkleable/basic.py:88: in __add__
return self.__class__(super().__add__(self.__class__.coerce_view(other)))
cls = <class 'eth2spec.electra.minimal.Epoch'>, value = 18446744073709551679
def __new__(cls, value: int):
if value < 0:
raise ValueError(f"unsigned type {cls} must not be negative")
byte_len = cls.type_byte_length()
if value.bit_length() > (byte_len << 3):
> raise ValueError(f"value out of bounds for {cls}")
E ValueError: value out of bounds for <class 'eth2spec.electra.minimal.Epoch'>
Edit: wrong function but it's the same idea.
Sorry, I meant client implementation.
I think a client implementation bug is more likely to happen than something going out of bounds. The point is, if a client bug exists, it shouldn't be a footgun that causes the block to become invalid. An invalid block is the absolute worst-case scenario here, and we should avoid it |
This makes sense to me. With this in mind, the PR looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are different kinds of errors that may happen on state transition:
- A context deadline (we should check this and not declare the block invalid in this case) The best option I see is to make this change in
validateStateTransition
directly, and instead of marking the block as invalid check first ifctx.Err() != nil
- An error on Prysm when processing, for example we can't get some fields or can't access some state required etc. These errors do not immediately indicate that the block is invalid, but continuing processing will alter the resulting state root and therefore declaring the block root as invalid later on. We should stop on these and not declare the block as invalid.
- Errors because the block actually fails to process, these should be marked as invalid.
} else if npc >= pcLimit { | ||
return nil | ||
} | ||
|
||
activeBal, err := helpers.TotalActiveBalance(st) | ||
if err != nil { | ||
return err | ||
log.WithError(err).Error("failed to fetch total active balance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for these kinds of errors the request may be valid and we fail to get the active balance for whatever reason, by continuing we would have a consensus split anyway since the state root will fail to match. I think we need to return an error and not mark the block as invalid though.
b423c05
to
1e50745
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests for this scenario
for _, d := range requests.Deposits { | ||
if d == nil { | ||
return nil, errors.New("nil deposit request") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this validation logic be in ProcessDepositRequests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this PR is to separate out the error. ProcessDepositRequests
should returns error typed execReqErr
which that caller can handle it appropiately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with this. ProcessOperations should not be doing any input validations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, this is planned to be reworked in a later change.
1e50745
to
dd3091d
Compare
if electra.IsExecutionRequestError(err) { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't check this here, this leaks internal logic up the stack, instead, if you keep the error in ProcessOperations
, you can check on the caller.
} | ||
st, err = ProcessWithdrawalRequests(ctx, st, requests.Withdrawals) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "could not process withdrawal requests") | ||
return nil, execReqErr{errors.Wrap(err, "could not process withdrawal requests")} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not simply deal with this logic within the core function. If the spec expects you to simply continue on these cases, why not ditch the request right there and not return an error?
We'll close this and we figured out a better solution. A new one will be opened soon |
36838e0
to
d950af3
Compare
if wr == nil { | ||
return nil, errors.New("nil execution layer withdrawal request") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this?
for _, d := range requests.Deposits { | ||
if d == nil { | ||
return nil, errors.New("nil deposit request") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with this. ProcessOperations should not be doing any input validations.
d950af3
to
b8c6e72
Compare
b8c6e72
to
c86f8fa
Compare
for _, d := range requests.Deposits { | ||
if d == nil { | ||
return nil, errors.New("nil deposit request") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, this is planned to be reworked in a later change.
"github.com/prysmaticlabs/prysm/v5/testing/util" | ||
) | ||
|
||
func TestProcessOperationsWithNilRequests(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is a good start for testing this method.
In Electra, when processing execution requests, any failure—except for malformed requests (i.e.,
nil
)—should be treated as a no-op per consensus specifications. However, there's some ambiguity regarding what constitutes a failure:While (1) has been addressed, (2) remains unclear in terms of whether it should be treated as a no-op or a failure state transition. I argue that it should be treated as a no-op because triggering a state transition failure would result in the block being marked invalid and added to the invalid block cache. This could lead to a fork from other consensus layer implementations. Such a scenario might arise if, for example, a bug exists in an Electra helper function or if a bug is introduced in future updates. Treating all failures as no-ops, except for malformed requests, provides a much safer approach.
What this PR accomplishes:
nil
requests.nil
requests, preventing potential issues from being introduced later.