-
Notifications
You must be signed in to change notification settings - Fork 20.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core/state: move state log mechanism to a separate layer #30569
base: master
Are you sure you want to change the base?
Conversation
ec06411
to
4a25e24
Compare
After having considered it some more, I am even more convinced that the approach of #30441, adding read-hooks inside the It does not discriminate between event sources.
The "solution" to these sorts of problems would be to, in certain situations, disable the With the layered solution, there are no such complexities, as long as we can switch between the logging-statedb and the non-logging-statedb.
Switching between one and the other can probably be done in many ways, I'm open to suggestions. One way would be to have two interfaces type LoggingEnabled interface {
WithLoggingDisabled() vm.StateDB
}
type LoggingDisabled interface {
WithLoggingEnabled() vm.StateDB
} Example how that would look, going from a dual-layered logging statedb to a single-layered raw statedb: if evm.Config.Tracer != nil && evm.Config.Tracer.OnTxStart != nil {
ctx := evm.GetVMContext()
newctx := &(*ctx) // shallow copy
if sdb, ok := ctx.StateDB.(vm.LoggingEnabled); ok {
newctx.StateDB = sdb.WithLoggingDisabled()
}
evm.Config.Tracer.OnTxStart(newctx, tx, msg.From)
if evm.Config.Tracer.OnTxEnd != nil {
defer func() {
evm.Config.Tracer.OnTxEnd(receipt, err)
}()
}
} |
cae550b
to
68a0aff
Compare
Minus the ugliness regarding swapping between shimmed and non-shimmed state, this PR is mostly done. Ideas for how to make the switching nicer are appreciated |
My understanding for the main use-case of the read hooks is to collect the prestate for the transaction/call. So the ordering and how often we emit a OnReadBalance doesn't matter for the tracers. IF that were to matter you are right, then we would need to add a reason to specify what is this read about.
This was an interesting realization. I think the friction point here is statedb emitting logs for the same methods that are exposed to the tracers via a statedb instance. Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb. But I'd ask for opinion from users before committing to that. Generally I am ok with your approach if it allows us to keep the read hooks :) |
Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb.
Well, the prestate tracer is perhaps the first driving usecase, but querying state from a tracer is *very* useful and powerful. I would prefer it to remain, definitely! Trying to collect state by catching per-scope readhooks sounds like a nightmare in comparison :)
Generally I am ok with your approach if it allows us to keep the read hooks :)
With my approach as a basis, I won't object to you read-hooking the entire statedb interface.
|
I think if it's possible to remove state read access from tracers, we should do it. Full state access will become impossible later with stateless clients, so it will have to be removed at that time anyway. |
Well, I would assume users of somewhat advanced tracing to be using stateful clients. Anyway, it is a decision unrelated to this PR. It is related to the other PR, since that one doesn't have the ability to switch between logging/nonlogging statedb. If we want to remove state access, let's someone make a PR and discuss it then, IMO. @fjl , you are a type artist. Any ideas for making the hacks in this PR neater? |
@rjl493456442 implemented an alternative state overriding here: #29950. In this implementation, we switch out the backend from underneath the // Reader defines the interface for accessing accounts and storage slots
// associated with a specific state.
type Reader interface {
// Account retrieves the account associated with a particular address.
//
// - Returns a nil account if it does not exist
// - Returns an error only if an unexpected issue occurs
// - The returned account is safe to modify after the call
Account(addr common.Address) (*types.StateAccount, error)
// Storage retrieves the storage slot associated with a particular account
// address and slot key.
//
// - Returns an empty slot if it does not exist
// - Returns an error only if an unexpected issue occurs
// - The returned storage slot is safe to modify after the call
Storage(addr common.Address, slot common.Hash) (common.Hash, error)
// Copy returns a deep-copied state reader.
Copy() Reader
} But if the the core parts of So we could have e.g. I am not really sure what are the pros and cons with either approach. @rjl493456442 any thoughts? |
0a4002e
to
54d7790
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Notes:
|
Notes:
|
…te layer core/state: wip move state log mechanism in to a separate layer core/state: fix tests core/state: fix miscalc in OnBalanceChange eth/tracers: fix tests core/state: re-enable verkle witness generation internal/ethapi: fix simulation + new logging schema
…g burn core/state, core/vm: refactor statedb hooking core/state: trace consensus finalize and system calls eth/tracers/internal/tracetest: fix tests after refactor core/state: some renaming and cleanup of statedb-hooking system core/state: remove unecessary methods, implement hooked subbalance, more testing
16fa089
to
8ca0cd5
Compare
@@ -98,7 +99,10 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg | |||
receipts = append(receipts, receipt) | |||
allLogs = append(allLogs, receipt.Logs...) | |||
} | |||
|
|||
var tracingStateDB = vm.StateDB(statedb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we create a hookedStatedb in the first place, ahead of transaction execution?
Now we create a hooked one for each transaction, it looks wasteful for me, perhaps I miss something?
defer func() { | ||
evm.Config.Tracer.OnTxEnd(receipt, err) | ||
}() | ||
var tracingStateDB = vm.StateDB(statedb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we allocate a hooked statedb and reuse it for all transactions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially, I created the hooked statedb higher up in the call hierarchy, but the is that at this location, we still need the *state.StateDB
, to perform the IntermediateRoot
.
The higher we place the instantiation of hooked, we're forced to either cram more methods into vm.StateDB
interface (which I did initially), or we need to pass both the *state.StateDB
struct and the vm.StateDB
interface side by side.
I'll take another look at what to do. Reusing is nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this method needs the *state.StateDB
here, in order to create the receipt before return. We could create it in the caller, and pass both of them... but there are three separate callsites which call in here, and IMO doing this in those three locations is messier than constructing it here.
It's a pretty small object, shouldn't make that much difference IMO.
return prev | ||
} | ||
|
||
func (s *hookedStateDB) SetNonce(address common.Address, nonce uint64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename this SetNonce
to IncreaseNonce
? The semantic always expect to increment the nonce by 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea!
@rjl493456442 your commit 8c7526c undoes an intentional change:
|
350d4bb
to
21f5a77
Compare
@holiman I undo my cleanup commit and push various fixes on top |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a really nice refactor. Left a couple thoughts. Looking forward to trying another statedb wrapper for witness gathering.
|
||
// hookedStateDB represents a statedb which emits calls to tracing-hooks | ||
// on state operations. | ||
type hookedStateDB struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracingStateDB?
hookedStateDB means very little to me. I would have kinda expected it to have generic hooks support instead of just tracing hooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially called it that. But then I thought, it's using the *tracing.Hooks
. I can go with either
if !prev.IsZero() && changed { | ||
if s.hooks.OnBalanceChange != nil { | ||
s.hooks.OnBalanceChange(address, prev.ToBig(), new(big.Int), tracing.BalanceDecreaseSelfdestruct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this is now wrong. This balancechange here only triggers for the decrease
, so presumably theres a corresponding increase
trigger somewhere. It may be that selfdestruct(self)
now triggers the +
but not the -
. I'll check it once I have a functional IDE again.. Would be good to have this covered by a testcase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this happens prior
func opSelfdestruct6780(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
if interpreter.readOnly {
return nil, ErrWriteProtection
}
beneficiary := scope.Stack.pop()
balance := interpreter.evm.StateDB.GetBalance(scope.Contract.Address())
interpreter.evm.StateDB.SubBalance(scope.Contract.Address(), balance, tracing.BalanceDecreaseSelfdestruct)
interpreter.evm.StateDB.AddBalance(beneficiary.Bytes20(), balance, tracing.BalanceIncreaseSelfdestruct)
interpreter.evm.StateDB.Selfdestruct6780(scope.Contract.Address())
Regardless of who the recipient is (self or other), and whether the contract is new or old, both decrese
and increase
fires here.
The cases that exist are
new
contract, beneficiaryother
:- decrease-event in opcode
- increase-event in opcode
- No events triggered in statedb.
new
contract, beneficiaryself
:- decrease-event in opcode
- increase-event in opcode
- triggers decrease-event in statedb + sets to zero
old
contract, beneficiaryother
:- decrease-event in opcode
- increase-event in opcode
- No events triggered in statedb.
old
contract, beneficiaryself
:- decrease-event in opcode
- increase-event in opcode
- No events triggered in statedb.
I think all of these cases are handled correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, except for one thing. The 2.iii
, when a new contract is self-destructed with self-recipient, then it should be traced as a burn
, not a balancedecreasselfdestruct
(beneficiary).
There are basically two burns: one immediate when a contract does selfdestruct-to-self, and one later on, which only ever happens if someone sends some funds to an already destructed account.
So as I see it, line 204
and 214
should use BalanceDecreaseSelfdestructBurn
.
However, it seems @s1na has already considered this and decided to not handle it that way:
// BalanceDecreaseSelfdestructBurn is ether that is sent to an already self-destructed
// account within the same tx (captured at end of tx).
// Note it doesn't account for a self-destruct which appoints itself as recipient.
BalanceDecreaseSelfdestructBurn BalanceChangeReason = 14
So, in other words, all good!
In this PR, I have moved the logging-facilities out of
*state.StateDB
, in to a wrapping struct which implementsvm.StateDB
instead.In most places, it was pretty straight-forward.
Some internal code uses the direct object-accessors to mutate the state, particularly in testing and in setting up state overrides, which means that these changes are unobservable for the hooked layer. This is fine, how we configure the overrides are not necessarily part of the API we want to publish.
The trickiest part about the layering is that when the selfdestructs are finally deleted during
Finalise
, there's the possibility that someone sent some ether to it, which is burnt at that point, and thus needs to be logged. The hooked layer reaches into the inner layer to figure out these events.In package
vm
, the conversion fromstate.StateDB + hooks
into a hookedvm.StateDB
is performed where needed.