core/tracing: v1.1 #30441

s1na · 2024-09-16T08:46:58Z

Implements #30356

s1na · 2024-09-16T11:05:26Z

core/tracing/hooks.go

+	NonceReadHook = func(addr common.Address, nonce uint64)
+
+	// CodeReadHook is called when EVM reads the code of an account.
+	CodeReadHook = func(addr common.Address, code []byte)


Open question: should we add codeHash here to be consistent with OnCodeChange?

core/tracing/CHANGELOG.md

s1na · 2024-10-08T16:21:14Z

Ah seems like the journal has a crasher:

revisions: [{0 2} {1 4} {2 4} {3 4} {4 6} {5 6} {6 9} {7 11} {8 12} {9 18} {10 18} {11 20} {12 22} {13 24} {14 24} {18 27}]
panic: revision id 17 cannot be reverted

goroutine 10470 [running]:
github.com/ethereum/go-ethereum/core/tracing.(*journal).revertToSnapshot(0xc050c41c70, 0x11, 0xc0570360e0)
        github.com/ethereum/go-ethereum/core/tracing/journal.go:170 +0x185
github.com/ethereum/go-ethereum/core/tracing.(*journal).OnExit(0xc050c41c70, 0x0, {0xc13a87fe30, 0x64, 0x64}, 0x48dc9, {0x203f680, 0xc018bcc978}, 0x1)
        github.com/ethereum/go-ethereum/core/tracing/journal.go:206 +0x6f
github.com/ethereum/go-ethereum/core/vm.(*EVM).captureEnd(0xc13a9e0780?, 0x0, 0x12e208, 0xe543f, {0xc13a87fe30, 0x64, 0x64}, {0x203da40, 0x2e05070})
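
For readers following the trace: the dumped revision list has no entry with id 17 (it jumps from 14 to 18), which is the situation a snapshot lookup rejects. Below is a minimal sketch of such a lookup, assuming a sorted list of `{id, journal index}` pairs like the one printed above. The `journal`, `revision`, `snapshot` and `revertToSnapshot` definitions here are illustrative stand-ins, not the code in `journal.go`, and the sketch returns an error where the real code panics.

```go
package main

import (
	"fmt"
	"sort"
)

// revision pairs a snapshot id with the journal length at snapshot time.
// Hypothetical type, modelled on the {id index} pairs in the dump above.
type revision struct {
	id         int
	journalIdx int
}

// journal is a toy stand-in for a tracing journal; entries are just labels.
type journal struct {
	entries   []string
	revisions []revision
	nextID    int
}

// snapshot records the current journal length under a fresh id.
func (j *journal) snapshot() int {
	id := j.nextID
	j.nextID++
	j.revisions = append(j.revisions, revision{id, len(j.entries)})
	return id
}

// revertToSnapshot drops all entries recorded after the snapshot and
// invalidates that snapshot and every later one. It returns an error when
// the id is not in the list of valid revisions.
func (j *journal) revertToSnapshot(id int) error {
	idx := sort.Search(len(j.revisions), func(i int) bool {
		return j.revisions[i].id >= id
	})
	if idx == len(j.revisions) || j.revisions[idx].id != id {
		return fmt.Errorf("revision id %d cannot be reverted", id)
	}
	j.entries = j.entries[:j.revisions[idx].journalIdx]
	j.revisions = j.revisions[:idx]
	return nil
}

func main() {
	j := &journal{}
	first := j.snapshot()
	j.entries = append(j.entries, "balance change")
	second := j.snapshot()
	j.entries = append(j.entries, "nonce change")

	fmt.Println(j.revertToSnapshot(second)) // undoes the nonce change
	fmt.Println(j.revertToSnapshot(second)) // id is no longer valid: error
	fmt.Println(j.revertToSnapshot(first), len(j.entries))
}
```

Any mismatch between the ids handed out at snapshot time and the ids passed back on exit would surface as this kind of lookup failure.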

core/tracing/journal_test.go

karalabe · 2024-10-10T09:11:27Z

core/tracing/CHANGELOG.md

+
+### New methods
+
+- `OnReorg(reverted []*types.Block)`: This hook is called when a reorg is detected. The `reverted` slice contains the blocks that are no longer part of the canonical chain.


Here `types.Block` is very, very heavy. You should at most pass headers and allow chain access to pull the blocks on demand (chain access in some constructor, ha).

On second thought, what is the issue? It is a slice, so it is passed by reference, and the memory can be freed as soon as OnReorg processing is done.

Ugh, this is annoying. So reorg in the blockchain at some point in the past used to collect blocks. It turned out that sometimes it became insanely heavy, and we switched it to operate on headers. I guess later someone refactored it back to operate on blocks again. This is an issue when you do setHead or any similar operation, or even if finality fails for a while and you have blocks reorging back and forth. It's very, very bad to pull all the blocks in from disk IMO.

CC @holiman @rjl493456442 ?

I agree. I don't particularly recall switching from headers to blocks....
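
The header-based alternative suggested above could look roughly like the following sketch. It assumes the hook receives headers only and the tracer gets a chain reader in its constructor. `Header`, `Block`, `ChainReader`, `memChain` and `reorgTracer` are hypothetical, trimmed-down stand-ins, not go-ethereum types or an API proposed in this PR.

```go
package main

import "fmt"

// Header and Block are hypothetical, trimmed-down stand-ins for the
// go-ethereum types of the same name.
type Header struct {
	Number uint64
	Hash   string
}

type Block struct {
	Header Header
	Txs    []string
}

// ChainReader is an assumed interface, handed to the tracer in its
// constructor, for loading full blocks only when they are needed.
type ChainReader interface {
	GetBlock(hash string) *Block
}

// memChain is an in-memory ChainReader that counts how many blocks it served.
type memChain struct {
	blocks map[string]*Block
	loads  int
}

func (c *memChain) GetBlock(hash string) *Block {
	c.loads++
	return c.blocks[hash]
}

// reorgTracer only cares about reverted blocks at or above fromNumber.
type reorgTracer struct {
	chain      ChainReader
	fromNumber uint64
	droppedTxs []string
}

// OnReorg is a header-based variant of the hook. Block bodies are fetched one
// at a time and only for the headers the tracer is interested in.
func (t *reorgTracer) OnReorg(reverted []*Header) {
	for _, h := range reverted {
		if h.Number < t.fromNumber {
			continue // not interesting, no block is loaded
		}
		if blk := t.chain.GetBlock(h.Hash); blk != nil {
			t.droppedTxs = append(t.droppedTxs, blk.Txs...)
		}
	}
}

func main() {
	chain := &memChain{blocks: map[string]*Block{
		"0xa": {Header: Header{Number: 10, Hash: "0xa"}, Txs: []string{"tx1"}},
		"0xb": {Header: Header{Number: 11, Hash: "0xb"}, Txs: []string{"tx2", "tx3"}},
	}}
	tracer := &reorgTracer{chain: chain, fromNumber: 11}
	tracer.OnReorg([]*Header{{Number: 10, Hash: "0xa"}, {Number: 11, Hash: "0xb"}})
	fmt.Println(tracer.droppedTxs, chain.loads)
}
```

With this shape the cost of a long reorg is decided by the tracer rather than by the caller, which is the concern raised about setHead and repeated reorgs.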

core/tracing/hooks.go

s1na · 2024-10-10T09:23:23Z

core/tracing/hooks.go

@@ -194,6 +221,30 @@ type Hooks struct {
 	OnCodeChange    CodeChangeHook
 	OnStorageChange StorageChangeHook
 	OnLog           LogHook
+	// State reads
+	OnBalanceRead  BalanceReadHook
+	OnNonceRead    NonceReadHook


Question from triage: how exactly is OnNonceRead used?

holiman

I don't see any need for these: OnBalanceRead etc. It adds non-generic handlers for certain opcodes, but a more generic solution already exists, using the per-opcode step function.

Here's the old prestate tracer js:

	// step is invoked for every opcode that the VM executes.
	step: function(log, db) {
		// Add the current account if we just started tracing
		if (this.prestate === null){
			this.prestate = {};
			// Balance will potentially be wrong here, since this will include the value
			// sent along with the message. We fix that in 'result()'.
			this.lookupAccount(log.contract.getAddress(), db);
		}
		// Whenever new state is accessed, add it to the prestate
		switch (log.op.toString()) {
			case "EXTCODECOPY": case "EXTCODESIZE": case "EXTCODEHASH": case "BALANCE":
				this.lookupAccount(toAddress(log.stack.peek(0).toString(16)), db);
				break;
			case "CREATE":
				var from = log.contract.getAddress();
				this.lookupAccount(toContract(from, db.getNonce(from)), db);

The existing way to do it is arguably slower, since it's on the hot path and invoked on every opcode. We could mitigate that if, e.g., tracers declare a whitelist of ops that they are interested in (e.g. optionally expose a method which spits out a list).

The existing way is perhaps a bit clunky, in that it's up to the tracer to make sense of the stack arguments, but otoh the stack arguments are not something that is changed frequently, since it's consensus-critical, and can only be changed in hardforks.

It's also a bit clunky to see the poststate: for op X, you see the stack prior to the execution of X. In order to see the stack after, you need to check on the next op too. That might be difficult, especially if we have whitelisted X only -- but we could improve this too, e.g. by using a return value saying "hey, I want to be notified about the next op too".

All in all, I think we should iterate on the existing generic solution, and not litter the code with these hooks.
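
The two mitigations floated above (a tracer-declared opcode whitelist, and a return value asking to be notified about the next op) can be sketched as follows. This is only an illustration under stated assumptions: `StepTracer`, `dispatcher` and `balanceWatcher` are hypothetical names, stack words are simplified to `uint64` (real EVM words are 256 bits), and none of this is an existing go-ethereum API.

```go
package main

import "fmt"

// OpCode is a stand-in for the EVM opcode type, using the real byte values.
type OpCode byte

const (
	ADD     OpCode = 0x01
	BALANCE OpCode = 0x31
	POP     OpCode = 0x50
)

// StepTracer is a hypothetical interface: Ops declares the opcodes of
// interest, and OnStep returns true to also be called for the next opcode.
type StepTracer interface {
	Ops() []OpCode
	OnStep(op OpCode, stack []uint64) (wantNext bool)
}

// dispatcher filters steps before they reach the tracer.
type dispatcher struct {
	tracer   StepTracer
	allowed  [256]bool
	wantNext bool
}

func newDispatcher(t StepTracer) *dispatcher {
	d := &dispatcher{tracer: t}
	for _, op := range t.Ops() {
		d.allowed[op] = true
	}
	return d
}

// step is what an interpreter loop would call for every opcode. Opcodes the
// tracer did not ask for cost only an array lookup and a flag check.
func (d *dispatcher) step(op OpCode, stack []uint64) {
	if !d.allowed[op] && !d.wantNext {
		return
	}
	d.wantNext = d.tracer.OnStep(op, stack)
}

// balanceWatcher records the result of every BALANCE by looking at the stack
// of the following step.
type balanceWatcher struct {
	pending bool
	addr    uint64
	seen    map[uint64]uint64 // address -> balance observed
	calls   int
}

func (w *balanceWatcher) Ops() []OpCode { return []OpCode{BALANCE} }

func (w *balanceWatcher) OnStep(op OpCode, stack []uint64) bool {
	w.calls++
	if w.pending {
		// Follow-up call: the top of the stack is the BALANCE result.
		w.seen[w.addr] = stack[len(stack)-1]
		w.pending = false
	}
	if op == BALANCE {
		w.addr = stack[len(stack)-1] // operand: the address being queried
		w.pending = true
		return true
	}
	return false
}

func main() {
	w := &balanceWatcher{seen: map[uint64]uint64{}}
	d := newDispatcher(w)

	// A made-up trace: each call passes the stack as it is before the op runs.
	d.step(ADD, []uint64{1, 2})     // filtered out
	d.step(BALANCE, []uint64{0xaa}) // delivered, asks for the next step
	d.step(POP, []uint64{1000})     // delivered: stack top is the balance
	d.step(ADD, []uint64{3, 4})     // filtered out again

	fmt.Println(w.seen[0xaa], w.calls)
}
```

The tracer still has to interpret stack operands itself, which is the clunkiness acknowledged above; the dispatcher only removes the per-opcode call overhead for uninteresting ops.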

karalabe · 2024-10-10T09:48:56Z

core/tracing/hooks.go

@@ -134,6 +136,9 @@ type (
 	// GenesisBlockHook is called when the genesis block is being processed.
 	GenesisBlockHook = func(genesis *types.Block, alloc types.GenesisAlloc)

+	// ReorgHook is called when a segment of the chain is reverted.
+	ReorgHook = func(reverted []*types.Block)


The consensus was to drop the Reorg hook for now because it's not clear what the best API would be. We can ship the rest and then iterate on this one with whoever wants to use it before committing.

daleksov · 2024-10-10T12:53:37Z

Hi @holiman,

We're really excited about the live tracing feature and see immense value in it, especially for our specific use case. Currently, we fetch blocks from nodes (clients) in a polling fashion and re-execute them using a customized EVM that performs more detailed tracing. By utilizing live tracing directly on the node (client), we can significantly boost both performance and correctness, and it would allow us to completely remove the re-execution and re-processing logic from our pipeline. This is why having more explicit hooks, like OnBalanceRead and others, is crucial for us. These hooks would allow us to optimize our tracing workflow, making it more efficient and accurate, which is why we strongly favor this approach and would love to keep it in place.

Here are some of the key benefits we see in favor of keeping the more explicit hooks:

Accurate State Tracking
Explicit read hooks ensure immediate and precise state initialization (like balances and nonces) during live tracing. Without them, we'd have to manually infer state access from opcodes, adding complexity and increasing the chance of errors.

Separation of Concerns
By using explicit read hooks, we separate state management from opcode handling, keeping the code cleaner, more modular, and easier to maintain. This avoids cluttering the opcode logic and reduces the risk of introducing bugs.

State Consistency
These hooks capture essential pre-state information (before any changes happen), which is crucial for our use case. It ensures accurate comparisons between pre- and post-execution states, especially for debugging and analysis, which is essential for us and all our customers.

Performance Optimization
Explicit read hooks allow us to focus on relevant state interactions without needing to manually parse the stack for every opcode. This simplifies the logic and reduces performance overhead on our tracer side by handling only the necessary state accesses.

Future-Proofing
As Ethereum evolves, explicit read hooks for fundamental state elements like account balances and nonces provide the flexibility to handle new state access patterns, even in the event of future hard forks. This ensures that the tracer can adapt without requiring major changes to the code, allowing it to remain compatible with protocol updates and any state access modifications introduced through hard forks.
Additionally, maintaining this pattern helps ensure consistency across different Go-Ethereum forks, forcing them to support live-tracing without breaking its functionality, thereby preserving compatibility across ecosystems.

Transition from Full Archive to Full Node
The most beneficial aspect for us is the ability to move from a full archive node to a full node. Through live tracing, we can store state information and re-execute transactions that are older than 128 blocks. This is especially important as full archive support is being phased out on our side, but with read hooks, we can still access the necessary data without needing a full archive node. This makes live tracing an ideal solution for our use case.

Dalibor from Tenderly crew!

holiman · 2024-10-10T13:12:30Z

Explicit read hooks ensure immediate and precise state initialization (like balances and nonces) during live tracing. Without them, we'd have to manually infer state access from opcodes, adding complexity and increasing the chance of errors.

For nonce: a nonce is opaque to EVM execution (it is only implicitly visible when a contract is created via CREATE, where the address depends on the nonce). Why do you want a read access for that? It is only ever modified during contract creation, or during state processing when the transaction sender's nonce is increased.

For balance, there's SELFBALANCE and EXTBALANCE (a.k.a. BALANCE). BALANCE takes an address on the stack and leaves the balance on the stack; SELFBALANCE takes no argument and pushes the balance of the executing contract. It's pretty straightforward, with the caveat that you'd want to capture both the pre-exec (inputs) and post-exec (outputs). Alternatively, you can ignore the post-exec and simply fetch the balance at this point, which would fulfill the requirement: "essential pre-state information (before any changes happen), which is crucial for our use case".
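
A minimal sketch of the "fetch the balance at this point" variant, assuming the tracer sees the opcode, the executing contract and the top stack word before execution. BALANCE reads its address operand from the stack; SELFBALANCE has no stack operand, so the sketch uses the executing contract's address. `StateReader`, `prestate`, `memState` and `onOpcode` are hypothetical names, and balances are simplified to `uint64` (real balances are 256-bit).

```go
package main

import "fmt"

type Address [20]byte

// OpCode is a stand-in for the EVM opcode type, using the real byte values.
type OpCode byte

const (
	BALANCE     OpCode = 0x31
	SELFBALANCE OpCode = 0x47
)

// StateReader is an assumed, minimal view of the state database.
type StateReader interface {
	GetBalance(addr Address) uint64
}

// stackWordToAddress keeps the low 20 bytes of a 32-byte stack word, which is
// how the EVM interprets an address operand.
func stackWordToAddress(word [32]byte) Address {
	var a Address
	copy(a[:], word[12:])
	return a
}

// prestate remembers the first balance observed for each account.
type prestate struct {
	db       StateReader
	balances map[Address]uint64
}

// onOpcode records the balance of the account an opcode is about to read.
// contract is the currently executing account; stackTop is the top stack word
// before the opcode runs (ignored for SELFBALANCE).
func (p *prestate) onOpcode(op OpCode, contract Address, stackTop [32]byte) {
	var addr Address
	switch op {
	case BALANCE:
		addr = stackWordToAddress(stackTop)
	case SELFBALANCE:
		addr = contract
	default:
		return
	}
	if _, ok := p.balances[addr]; !ok {
		p.balances[addr] = p.db.GetBalance(addr) // first read wins
	}
}

// memState is an in-memory StateReader for the example.
type memState map[Address]uint64

func (m memState) GetBalance(a Address) uint64 { return m[a] }

func main() {
	contract := Address{19: 0x01}
	other := Address{19: 0x02}
	db := memState{contract: 5, other: 7}
	p := &prestate{db: db, balances: map[Address]uint64{}}

	var word [32]byte
	word[31] = 0x02 // stack word holding `other`, left-padded with zeros
	p.onOpcode(BALANCE, contract, word)
	p.onOpcode(SELFBALANCE, contract, [32]byte{})

	db[other] = 99 // a later state change does not alter the recorded value
	p.onOpcode(BALANCE, contract, word)
	fmt.Println(p.balances[other], p.balances[contract])
}
```

Because the balance is read from state before the opcode executes, the post-exec stack never has to be inspected for this use case.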

holiman · 2024-10-10T13:38:34Z

Note: if we can get something like this to work, then I'm a lot more open to having all sorts of hooks: #30569. I don't like the deep integration, but if it's possible via a separate layer then "let's go wild" imo.

s1na · 2024-10-14T04:49:47Z

@fjl regarding backwards compatibility: for now I have added an OnSystemCallStartV2 in the same hooks object. What do you think?
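
One way keeping both versions in the same hooks object can preserve compatibility is for the caller to prefer the V2 hook and fall back to the original one. The sketch below illustrates that idea only; the dispatch order, the `startSystemCall` helper and the trimmed-down `VMContext` and `Hooks` types are assumptions for illustration, not code taken from this PR.

```go
package main

import "fmt"

// VMContext is a hypothetical, trimmed-down stand-in for the VM context that
// the V2 hook receives.
type VMContext struct {
	BlockNumber uint64
}

// Hooks keeps both versions of the hook side by side, so tracers written
// against the old signature still compile and run.
type Hooks struct {
	OnSystemCallStart   func()
	OnSystemCallStartV2 func(ctx *VMContext)
}

// startSystemCall prefers the V2 hook and falls back to V1. It returns which
// variant was invoked, purely to make the example observable.
func startSystemCall(h *Hooks, ctx *VMContext) string {
	if h == nil {
		return "none"
	}
	switch {
	case h.OnSystemCallStartV2 != nil:
		h.OnSystemCallStartV2(ctx)
		return "v2"
	case h.OnSystemCallStart != nil:
		h.OnSystemCallStart()
		return "v1"
	}
	return "none"
}

func main() {
	ctx := &VMContext{BlockNumber: 1}
	oldTracer := &Hooks{OnSystemCallStart: func() {
		fmt.Println("v1 hook called")
	}}
	newTracer := &Hooks{OnSystemCallStartV2: func(c *VMContext) {
		fmt.Println("v2 hook called at block", c.BlockNumber)
	}}
	fmt.Println(startSystemCall(oldTracer, ctx))
	fmt.Println(startSystemCall(newTracer, ctx))
}
```

A tracer that sets both fields would only see the V2 call under this ordering, which avoids double notification.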

s1na · 2024-10-17T09:25:42Z

I dropped OnReorg and merged in changes from master.

s1na added 7 commits August 26, 2024 15:45

core/tracing: add vm context to system call hook

8659e68

core/tracing: add GetCodeHash to statedb interface

b4e0174

core/tracing: emit state change events for journal reverts

f670a7f

core/tracing: add hook for reverted out blocks

cf873c3

log selfdestructs balance revert

365b715

Add state read hooks

aac4024

add tracing journal

dbe5f83

s1na requested review from karalabe, holiman and rjl493456442 as code owners September 16, 2024 08:46

s1na added 8 commits September 16, 2024 13:26

update changelog

b87c4fe

fix indent

702a42f

add block hash read hook

c915bed

resolve merge conflict

838fc25

fix code and nonce param order

1cc58cf

update test

3c58155

pass-through non-journaled hooks

501f302

missed two hooks

1a64297

maoueh reviewed Oct 5, 2024

s1na added 2 commits October 8, 2024 20:09

fix journal cur rev Id

1862333

add note on balanceChangeRevert reason

6650000

s1na added the status:triage label Oct 9, 2024

refactor WrapWithJournal to use reflection

d9de74e

karalabe reviewed Oct 10, 2024

holiman reviewed Oct 10, 2024

s1na commented Oct 10, 2024

s1na added 2 commits October 10, 2024 12:44

add license to journal_test

d2ba76f

add desc for revert change reason

a2ca5f8

s1na removed the status:triage label Oct 10, 2024

holiman mentioned this pull request Oct 11, 2024

core/state: move state log mechanism to a separate layer #30569

Open

add OnSystemCallStartV2

85a85d0

s1na added 3 commits October 17, 2024 11:07

drop OnReorg

2754b41

rm newline

92337d8

Merge branch 'master' into tracing/v1.1

36b4194
