
Side-effect-free Yul interpreter #15464

Open · wants to merge 89 commits into base: develop

Conversation

quangloc99 (Contributor)

This interpreter implementation follows the specification from #15435 and is intended for use in #15358.

For testing, the existing YulInterpreterTest was not reused. Since this pure interpreter has no memory/storage access, it cannot produce the same trace as the current interpreter from 'test/libyul/tools'. To keep the changes minimal, the YulInterpreterTest suite was copied and minimally adapted. The new test suite uses the function call trace and the outermost variable values for snapshot comparison. Most of the existing YulInterpreterTest cases were ported to the new suite. New tests were also added to cover all return statuses and all EVM instructions.


Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist yet, please do so now; this makes reviewing and accepting your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.


@cameel cameel left a comment


Thanks for the PR! I won't manage to do a proper review until next week, but for now here's some initial feedback from a quick pass over it.

Comment on lines 297 to 305
static std::set<std::string> const NON_INSTRUCTION_BUILTIN_NAME = {
"datasize",
"dataoffset",
"datacopy",
"memoryguard",
"loadimmutable",
"setimmutable",
"linkersymbol"
};

@cameel cameel Oct 1, 2024


Instead of hard-coding those, you should just check that the instruction field in BuiltinFunctionForEVM is empty.

Contributor Author


I will change it as suggested. I was just afraid that there might be builtins other than instructions that can be evaluated in the future. This part was added to get the same runtime warning effect as the switch exhaustiveness check.

Member


If there is, we'll quickly find out in tests and fix it, so don't worry about it.

The only other ones are, IIRC, the dynamically generated verbatim functions, but you reject those as well.

Comment on lines +58 to +63
m_config.maxTraceSize = m_reader.sizetSetting("maxTraceSize", 128);
m_config.maxExprNesting = m_reader.sizetSetting("maxExprNesting", 64);
m_config.maxSteps = m_reader.sizetSetting("maxSteps", 512);
m_config.maxRecursionDepth = m_reader.sizetSetting("maxRecursionDepth", 64);

m_printHex = m_reader.boolSetting("printHex", false);
Member


For testing, the existing YulInterpreterTest was not reused. Since this pure interpreter has no memory/storage access, it cannot produce the same trace as the current interpreter from 'test/libyul/tools'.

That's actually easy to solve. Just add a setting you can use to mark a test as one that requires access to memory or storage, then add it to those tests that do not pass without it. As you can see in the snippet above, adding settings is pretty easy.

We could call the setting sideEffectFree and make it true by default.

Contributor Author


I understand how to mark a test that requires memory access. However, I'd still prefer to keep the two test suites separate. While they share the parsing part, the settings and trace printing are quite different. I'm concerned that merging the two might make the tests harder to manage.

I also believe the two interpreter implementations might grow in different directions, so separating the test suites makes more sense to me.

Member


While we're ok with leaving the interpreters separate for now to avoid bringing over unnecessary baggage from the test interpreter, the test suite should definitely be shared. Even if the implementations diverge significantly, they ultimately both use Yul as their interface so it should not matter for testing. The same set of tests should be usable for both and I strongly think this is the point we should be starting from. Seeing the new interpreter pass the existing tests (at least the ones it can run) will be an additional confirmation that it works correctly.

I also don't see why settings and trace would have to be substantially different. If the new ones are better, we should just change them in the old test suite as well. I'd suggest actually extracting that into a PR of its own, which we'd merge first.


@quangloc99 quangloc99 Oct 8, 2024


I agree that the testing interface is the same. I will adjust the test suite to use the existing one.

The settings are also not that different. The only notable setting the pure interpreter adds is the max recursion depth. I added it because this interpreter will be used while walking the AST; it allows code to be interpreted with a large maxSteps while still protecting the stack from overflowing.

The trace is definitely different. The trace of the interpreter in test/tools consists of the following:

  • Storage layout
  • Memory layout
  • Transient layout
  • History traces of memory accesses and logs

The pure interpreter does not evaluate instructions that alter memory/storage or produce logs, so such a trace cannot be created. I have changed the trace to the call trace and the values of the outermost variables.

The new trace could be included in the test/tools interpreter, but not the other way around.

Contributor Author


I have another suggestion: I will merge the two test suites together and, for each test, print the traces from both interpreters.
How does this sound?

Contributor Author


I'm assuming that this will be a hard-coded limit in the compiler, so why use a different one in tests?

I forgot that we indeed hardcode the limit in the compiler. In the tests, I tried to make it as flexible as possible, but since we'll hardcode it anyway, I'll remove the settings and test with a known limit.

We can make the pure interpreter always print those the same way the current interpreter does when it's running on pure code.

I still don't see the point of this. The trace of the current interpreter is all about state changes. The pure interpreter does not touch the state, so there is no such trace to print.

Or we could make the existing interpreter not show this when the traces are empty.

In the case of an empty trace, the current interpreter indeed has no trace of state changes. But by default, the pure interpreter has no trace of state changes either. That's why I added another kind of trace, so we can see whether it works correctly.

For each test, I will print the traces from both interpreter.

But one will always be empty so why even print it?

Not exactly.

  • In the case where the code has a state change, the pure interpreter will still have the call trace up to the point where the state change happens, and ImpureInstructionEncountered will be reported.
  • In the case where there is no state change, the current interpreter will not have any trace. But we know that for such a test, the current interpreter still did its job.

My idea behind the suggestion to print both traces is to test both interpreters with the same test. Both interpreters can run the same Yul code, just with different termination statuses and traces. I think this works well for the PR: it uses the same test suite, but does not need to touch the code of the current interpreter.


@cameel cameel Oct 10, 2024


Ah, so it looks like your trace and the trace in the current interpreter are different things. I assumed they were the same and that your trace was just a subset of the old one. Looks like instead you're recording function calls, while the current interpreter records builtins?

Another thing is that I assumed that the trace was a feature of the test case, not the interpreter itself. It's a debug feature, and this made sense in an interpreter built for testing, but for executing correct code in the optimizer that's pure overhead. I think that this should be done by having the base interpreter provide virtual functions as hooks, and having a derived test-only variant that stores the trace.

Overall, there's no technical reason why both interpreters couldn't provide the same output and IMO the best solution would be to just extend the current interpreter to report function calls too. But ok, that's extra work, which I didn't expect this would require so I guess just printing one kind in one interpreter and the other kind in the other is acceptable for this PR. We should give them different names though, because they're really different things.

Contributor Author


Thank you for understanding!

... so I guess just printing one kind in one interpreter and the other kind in the other is acceptable for this PR

With this, I'll merge the 2 test-suits into one, and print both traces for the same test case.

We should give them different names though, because they're really different things.

I am not sure which name you are referring to. Do you mean the two interpreters should have different names, or their traces?

... IMO the best solution would be to just extend the current interpreter to report function calls too

That's what I originally did in #15358 🙏. I moved the interpreter from test/tools into libyul and had the pure interpreter extend it. I did not change the trace, though that can also be done.

... but for executing correct code in the optimizer that's pure overhead

I am positive that my way of printing traces has no overhead. Here is a godbolt demo of my trace. If we hardcode maxTraceSize to 0, the compiler should eliminate all addTrace calls.

Member


I am not sure which name you are referring to. Do you mean the two interpreters should have different names, or their traces?

Traces. Just make it clear that one shows the function call stack while the other one is the record of evaluated builtins.

That's what I originally did in #15358 🙏.

Yeah, and that still would be an option if you want to do it. But I don't want to add too much extra work here. Just printing both things is fine for now too.

If we hardcode maxTraceSize to 0, compiler should eliminate all addTrace calls.

That's good, but I'd still rather have the trace functionality only in the test case. The less complexity in the actual implementation, the better.

Member


Anyway, my suggestion would be to submit your test cases and the changes in the YulInterpreterTest as a separate PR so that we can review and merge that first. They should already all pass on the existing interpreter. Then we can continue with this PR.

// Increment step for each loop iteration for loops with
// an empty body and post blocks to prevent a deadlock.
if (_forLoop.body.statements.size() == 0 && _forLoop.post.statements.size() == 0)
if (auto terminated = incrementStatementStep()) return *terminated;
Member


Some general notes on style to get that out of the way:

  • We always put return on a new line.
  • Please also avoid abbreviations in names. Things like fun, cnt, x, g, vec, res are hard to read and often ambiguous. TBH, we have a lot of bad names already since the code is old and it's also hard to enforce, so some of that might have been just copied, but when writing new code, please try to avoid them.
  • Constructor spacing: ExecutionOk{ControlFlowState::Default};
  • 4-space indents, including in .yul files.
  • You don't have to include the empty string as message in asserts. We made it optional at some point, but did not strip it from everywhere. When writing new code, you can omit the message.
  • I'd avoid making files hidden (with a leading dot in the name).
  • In Python code we use the same style for splitting long calls as in C++. I.e.
        gen_test(
            'sgt',
            param_cnt=2,
            calc=lambda p: u2s(p[0]) > u2s(p[1])
        )

Contributor Author


Thank you for commenting on the styling. I have a question about the formatter.

For C++ code style, I see that there is a .clang-format configuration included. However, using clang-format changes the code drastically, so I deliberately turned the formatter off. Should I use clang-format for the new code, or should I adjust the code manually? The same question applies to the Python code.

That is the only question; I will address the other comments.

Contributor Author


Besides the code style, is it OK to use a macro for the terminated case? The check is repetitive, and because it also returns immediately, a macro is the only way I can think of to clean up the code.

Member


Should I use clang format for the new code, or should I try to adjust the code manually?

No, that's an incomplete config that does not fully reflect our style unfortunately. There is an effort underway to get to a workable clang config, but for the time being we format the code manually.

And also the same question applied to Python code.

For Python we also generally format manually, but our style is generally very close to the one used by black, so if you use that, it should be fine.

Contributor Author


I have resolved most of the styling comments. Hopefully I did not miss any renaming.

I still have the pending question about using a macro.

I'm also not sure where to put the test generator for the pure instructions. My idea was to test the calculation of those instructions against a different implementation (in this case, my Python test generator). I am not sure if there is a better way to do this. For now, putting the test generator near the tests still seems like the best choice. Please give some advice on this.

Member


I still have the pending question about using a macro.

Sorry, I saw it, but forgot to answer before moving on to the next comment.

I'm not sure about that one yet. I'm not a fan of hiding control flow like this, and it seems like there aren't that many places where you check this, so I don't see it as a big problem. It seems quite manageable, especially since the interpreter is the only place where this style is used. Maybe a helper to simplify the check a bit would be enough?

EvaluationResult result = evaluate(_expressionStatement.expression, 0);
if (terminated(result))
	return result;

On the other hand, the old interpreter used exceptions and that also hides the flow, so a macro may not be worse than that. I do agree that these checks seem tedious and easy to forget so I wouldn't completely reject the idea. An ideal solution would be something like ? in Rust, but we don't really have an equivalent in C++. For now, the helper approach seems to me like it might be a good compromise.

And also I'm not sure where to put the test generator for the pure instructions.

Not sure about this one either. We don't have a standard practice for this. We generally put scripts into the scripts/ directory, but then that would break the connection between the generator and those tests. On the other hand, even if it's alongside the tests, I doubt we'll generally pay attention to it in the long term anyway. That's just what happens with these things - we'll likely end up with manual modifications in these tests over time and you won't be able to safely regenerate them without losing those modifications. It may be best to treat the script more as documentation on how they were initially generated rather than something that we'll commit to maintaining in the repo. As such it might be better to just post the script in a comment in the PR.

Contributor Author


For the macro part, I'll use a macro.

An ideal solution would be something like ? in Rust, but we don't really have an equivalent in C++.

Well, this is one reason why I proposed using Boost's Outcome instead of std::variant. It has a well-documented TRY macro that looks almost as natural as Rust's ?. If you like this idea then I guess I can use it too.

For the Python test generator, I'll put it in a comment on the corresponding test it generated.


@cameel cameel Oct 16, 2024


Well, this is one reason why I proposed using Boost's Outcome instead of std::variant. It has a well-documented TRY macro that looks almost as natural as Rust's ?. If you like this idea then I guess I can use it too.

Let's try that then. I actually do like the idea of trying something from a well established library for this rather than hand-implementing it. It has a better chance of being recognized as something conventional and not being as surprising despite the unusual control flow.

For the python test generator, I'll put it into the comment for the corresponding test it generated.

Regarding that, one alternative solution that came to my mind is that we could have one of the CI jobs (e.g. the one that tests Python scripts, since that's guaranteed to have Python) run the script and fail if changes are detected. With this it would actually be acceptable to have it in the repo, because we could rely on people not editing these tests randomly.

I'd also make the script insert a comment at the beginning of the test, saying that it was automatically generated and how to rerun it.

Comment on lines 284 to 285
std::vector<Expression> const& /* _arguments */, // This was required to execute some builtin.
// But all of them are impure.
Member


Are they though? I'd expect dataoffset(), datasize() and linkersymbol() at least to be pure. We won't be able to evaluate them without information that's only available at bytecode generation time, but technically they end up being just constants :)

Contributor Author


I understand the point. I think my comment here is misleading too, so I will change it.

For the implementation, we could try returning special values for these cases. But that sounds more complicated, and it goes beyond the point of code interpretation.

Member


It's probably fine not to do anything about it. I don't think we'll be able to actually implement those. Just wanted to correct the comment :)

Contributor Author


I removed the comment about impure.

Member


You can actually remove the whole argument. It was used for literal arguments of datasize() and dataoffset(), which the new interpreter does not support.

Literal arguments are basically compile-time-only values. They're not physically stored on the stack and may not even fit in a slot. This means we can't pass them around as u256. When the value is needed, it is obtained from the relevant AST node instead. Here it will never happen though.
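For illustration, a minimal Yul object (not taken from the PR) where dataoffset() and datasize() receive literal string arguments naming a sub-object:

```yul
object "Contract" {
    code {
        // "runtime" is a literal argument: it names the sub-object below and
        // is resolved at bytecode generation time, never stored on the stack.
        datacopy(0, dataoffset("runtime"), datasize("runtime"))
        return(0, datasize("runtime"))
    }
    object "runtime" {
        code {}
    }
}
```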

for (size_t i = 0; i < functionDefinition.returnVariables.size(); ++i)
variables[functionDefinition.returnVariables.at(i).name] = 0;

std::unique_ptr<PureInterpreter> interpreter = makeInterpreterCopy(std::move(variables));
Member


This looks like it lives only until the end of the function. Why even allocate it on the heap?

Contributor Author


The heap allocation was copied from the original interpreter. Originally I did not think much about it, but then I thought heap allocation should help save stack space, so we can allow more recursive calls.

But I also added a config option to limit the number of recursive calls, so if you think this part does not have much effect on the stack, I can change it to stack allocation.


@cameel cameel Oct 16, 2024


Yeah, that was my impression too, but if stack space were a real concern, this would only be a half-measure; a proper solution would be a non-recursive implementation instead. I think we should just put it on the stack and revisit if it turns out to be an actual problem. We should have a test that uses nested calls up to the limit to verify that.

In practice we won't have much nesting anyway, so we could live even with a very low limit.

You'll probably find more questionable decisions like this in the old implementation. If it does not seem justified, we want it simplified. The general idea here is to treat it as a new implementation that's carefully reviewed from scratch.


@cameel cameel left a comment


Not a full review yet, but here are a few things I found looking through the implementation today.

Mostly suggestions about things that could be simplified or removed.

Comment on lines +52 to +53
using namespace solidity::evmasm;
using evmasm::Instruction;
Member


These don't seem to be necessary. Could be removed both from there and from the original interpreter.

Member


The forward declaration of YulString in the header seems unused as well. Overall I have an impression that there might be some unused includes and forward declarations left over from the original implementation.

/**
* Interprets EVM instructions based on the current state without side-effect.
*/
class PureEVMInstructionInterpreter
Member


The pure interpreter does not seem to carry much state. How about simplifying it by removing the class and turning eval() and evalBuiltin() into standalone functions?

Contributor Author


Yes, I can do that. My impression when porting this to the pure interpreter was that the class exists because of the dialect: a new instruction interpreter needs to be provided if there is a new dialect.

Though I think simple functions will serve the job just fine. I can also scope them inside a namespace.

{
std::vector<std::optional<LiteralKind>> const* literalArguments = nullptr;
if (BuiltinFunction const* builtin = m_dialect.builtin(_functionCall.functionName.name))
if (!builtin->literalArguments.empty())
Member


Since we don't support any builtins with literal arguments anyway, you can simplify this by returning UnlimitedLiteralEncountered already from here when !builtin->literalArguments.empty() and removing _literalArguments argument from evaluateArgs().

std::map<YulName, FunctionDefinition const&> m_definedFunctions;
std::vector<YulName> m_declaredVariables;
std::map<Block const*, std::unique_ptr<Scope>> m_subScopes;
Scope* const m_parent;
Member


Our convention now is to always initialize variables of POD types in declaration (this might not have been the case when the original interpreter was written).

We do that even if they get a value assigned later in constructor - in that case I'd just use the default initialization with {} to emphasize that the value used here does not matter, only that it does get some deterministic value.

Suggested change
Scope* const m_parent;
Scope* const m_parent{};

You could use the same style with PureInterpreter::m_recursionDepth.

Comment on lines +40 to +43
m_definedFunctions(),
m_declaredVariables(),
m_subScopes(),
m_parent(_parent)
Member


Suggested change
m_definedFunctions(),
m_declaredVariables(),
m_subScopes(),
m_parent(_parent)
m_parent(_parent)

Comment on lines +35 to +42
it->second = std::make_unique<Scope>(this);
Scope* subscope = it->second.get();

for (auto const& statement: _block.statements)
if (auto const* functionDefinition = std::get_if<FunctionDefinition>(&statement))
subscope->m_definedFunctions.emplace(functionDefinition->name, *functionDefinition);

return subscope;
Member


This initialization looks like something that belongs in the constructor. Here I'd do just this:

Suggested change
it->second = std::make_unique<Scope>(this);
Scope* subscope = it->second.get();
for (auto const& statement: _block.statements)
if (auto const* functionDefinition = std::get_if<FunctionDefinition>(&statement))
subscope->m_definedFunctions.emplace(functionDefinition->name, *functionDefinition);
return subscope;
auto& [block, subScope] = it->second;
subScope = std::make_unique<Scope>(this, block);
return subScope.get();

and let the scope find its own functions.

Also, I did not expect this function to create a scope since it's called getSubscope(). createSubscope() would be a more self-explanatory name.


cameel commented Oct 16, 2024

The PR needs a rebase.

Also note that some EOF-related PRs were merged recently. There were some minor changes in the interpreter; please take a look at the history. Especially with #15467 you'll probably need to pass EOFVersion to the Dialect.
