feat(forge): isolated execution #7186
This solution seems overly complicated. Why clone and merge states? Wouldn't it be simpler to maintain two separate VMs: the test runner and the actual environment? Then you map calls from the test environment to transactions in the actual one and, vice versa, map results/receipts to return values.
@Philogy we'd have to clone VMs if we want to commit between calls. I agree that the approach with a test EVM + N other EVMs might feel more intuitive, but I don't think it's less complicated. For example, if we didn't merge states, we'd have to figure out workarounds for correct processing of BALANCE, DELEGATECALL, EXTCODESIZE, etc. opcodes at the level of the test env, as changes to the actual EVM won't appear in its journal.
atm this PR is a draft I'm experimenting with to find out all the implicit assumptions we and users have about foundry EVM behavior. Once that's figured out, this impl will probably change a lot and might become closer to the approach you've mentioned.
Thanks for kicking off this work klkvr. Commenting so I am subscribed.
@@ -11,13 +11,13 @@ contract Issue3653Test is DSTest {
    Token token;

    constructor() {
-        fork = vm.createSelectFork("rpcAlias", 10);
+        fork = vm.createSelectFork("rpcAlias", 1000000);
block 10 had gas limit of 5000 which is not enough to deploy a contract in isolated mode
So right now CI passes except for 1 test from sablier v2, which I believe makes incorrect assumptions about the foundry EVM. On the latest nightly that call reverts with MemoryOOG, which I believe is unrelated to the gas limit, and with isolated execution the call executes without halts and reaches a revert in the contract with a custom error type, causing revert reason mismatch in
The current impl has several workarounds to make its behavior closer to how the non-isolated EVM works now. I believe that it was important to create a PoC showing that we can switch to isolated mode without breaking changes; however, I'm not sure if all semantics of non-isolated mode should be kept:
In general, the final impl of inspector call chain looks like this:
I will do more experiments with the optimism codebase as there are some tests that are failing, and I am still not sure why. After that, I plan to make this opt-in and look into performance and gas metering.
upd: seems like the failing optimism tests are caused by hardcoded gas limits
I've added a couple of tests for tstore/tload and selfdestructs which only pass in isolation mode. It's possible now to enable isolation by either using Isolation is enabled automatically when a gas report is requested for tests via
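A toy sketch of why a tstore/tload test can only pass in isolation mode. It assumes that, per EIP-1153, transient storage is discarded at the end of each transaction. `ToyEvm`, `top_level_call` and `value_seen_by_second_call` are hypothetical names for illustration and are not foundry or revm types.

```rust
use std::collections::HashMap;

/// Toy model of EIP-1153 transient storage (hypothetical, not revm's types).
#[derive(Default)]
pub struct ToyEvm {
    /// (contract id, slot) -> value
    transient: HashMap<(u64, u64), u64>,
}

impl ToyEvm {
    pub fn tstore(&mut self, contract: u64, slot: u64, value: u64) {
        self.transient.insert((contract, slot), value);
    }

    /// Unset transient slots read as zero.
    pub fn tload(&self, contract: u64, slot: u64) -> u64 {
        self.transient.get(&(contract, slot)).copied().unwrap_or(0)
    }

    /// Runs one test-level call. With `isolate` set, the call is treated as
    /// its own transaction, so transient storage is cleared once it finishes.
    pub fn top_level_call(&mut self, isolate: bool, body: impl FnOnce(&mut Self)) {
        body(self);
        if isolate {
            self.transient.clear();
        }
    }
}

/// Writes a transient slot in one top-level call and reads it in the next.
pub fn value_seen_by_second_call(isolate: bool) -> u64 {
    let mut evm = ToyEvm::default();
    evm.top_level_call(isolate, |e| e.tstore(1, 0, 42));
    let mut seen = 0;
    evm.top_level_call(isolate, |e| seen = e.tload(1, 0));
    seen
}
```

Without isolation the second call still sees the value written by the first one; with isolation it reads zero, which is what would happen across two on-chain transactions.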
lgtm
crates/cheatcodes/src/inspector.rs
Outdated
        data.journaled_state.state().get_mut(&broadcast.new_origin).unwrap();

    account.info.nonce += 1;
}
what is this change for?
currently we are increasing the nonce before broadcasting a CALL to simulate the nonce increase during an on-chain transaction
this is incorrect for --isolate because we need the nonce to be up-to-date at the point when we are creating a transaction
so the change is to increase the nonce after the CALL
however, thinking of it now, with the new workaround where we explicitly decrease nonces in isolation, this is not really needed as long as we touch the account when we pre-increase its nonce, updated in 33814e5
yeah can we also doc that
good call
// Only include top-level calls which account for calldata and base (21.000) cost.
// Only include Calls and Creates as only these calls are isolated in inspector.
if trace.depth != 1 &&
    (trace.kind == CallKind::Call ||
        trace.kind == CallKind::Create ||
        trace.kind == CallKind::Create2)
{
    return;
}
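To see what the condition above accepts and rejects, here is a minimal standalone sketch. `Trace`, this `CallKind` and `is_skipped` are simplified stand-ins written for illustration and are not foundry's actual types.

```rust
/// Simplified stand-in for the call kinds a trace can have (hypothetical).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum CallKind {
    Call,
    StaticCall,
    DelegateCall,
    Create,
    Create2,
}

/// Simplified stand-in for a trace node (hypothetical).
pub struct Trace {
    pub depth: usize,
    pub kind: CallKind,
}

/// Mirrors the condition in the snippet above: returns true when the trace
/// hits the early `return`.
pub fn is_skipped(trace: &Trace) -> bool {
    trace.depth != 1
        && matches!(
            trace.kind,
            CallKind::Call | CallKind::Create | CallKind::Create2
        )
}
```

Under this reading, a depth-1 `Call` is kept and a nested `Create` is skipped. As written, the condition only names `Call`, `Create` and `Create2`, so a nested `StaticCall` or `DelegateCall` passes this particular check.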
great
fn transact_inner<DB: DatabaseExt + DatabaseCommit>(
    &mut self,
    data: &mut EVMData<'_, DB>,
    transact_to: TransactTo,
    caller: Address,
    input: Bytes,
    gas_limit: u64,
    value: U256,
) -> (InstructionResult, Option<Address>, Gas, Bytes) {
would love if we unit tested this with a simple regression test, but can do in follow up
we don't really run any tests for isolation rn besides two I've added for selfdestruct/tstore, not sure how we can address this without running all tests for both modes
Should we add a flag to run these on CI perhaps every 24hrs on --isolate? Something like that?
yeah, that could work, we'd have to filter some tests out for such runs because we have some fixtures with hardcoded gas usage which doesn't match for --isolate, and also that one failing sablier-v2 test
@klkvr When running tests with
There are a bunch of contracts being forked and set up Adding Any ideas?
So this can be reproduced with the following test:
import "forge-std/Test.sol";

contract C is Test {}

contract GasWaster {
    function waste() public {
        for (uint256 i = 0; i < 100; i++) {
            new C();
        }
    }
}

contract GasLimitTest is Test {
    function test() public {
        vm.createSelectFork("mainnet");
        GasWaster waster = new GasWaster();
        waster.waste();
    }
}
What's happening here is that we are setting the block gas limit to the real gas limit of the forked chain:
foundry/crates/evm/core/src/fork/init.rs Line 68 in 4a91072
However, without isolation we are never really validating gas usage because such checks are performed in With isolation, we are capping transaction gas limit at block gas limit value to not hit
IMO, the revert here is correct behavior; that one failing sablier v2 test tries to test exactly this (tx exceeding block gas limit). However, users might need to disable those checks because it's pretty easy to go beyond 30M with various utility testing contracts. I can see 3 approaches for this:
All approaches are pretty similar; it will be easier to decide once we figure out whether we want this to become a breaking change or keep backwards compatibility. @mattsse wdyt?
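The capping behavior discussed above can be pictured with a toy sketch. It assumes that in isolation the transaction gas limit is clamped to the block gas limit unless the check is switched off. `TxOutcome`, `capped_gas_limit`, `run_isolated` and the `disable_block_gas_limit` parameter are hypothetical names used for illustration, not foundry's API.

```rust
/// Hypothetical outcome of a toy isolated call.
#[derive(Debug, PartialEq, Eq)]
pub enum TxOutcome {
    Success { gas_used: u64 },
    OutOfGas,
}

/// Clamps the requested gas limit to the block gas limit, unless the
/// (hypothetical) `disable_block_gas_limit` switch is set.
pub fn capped_gas_limit(
    requested: u64,
    block_gas_limit: u64,
    disable_block_gas_limit: bool,
) -> u64 {
    if disable_block_gas_limit {
        requested
    } else {
        requested.min(block_gas_limit)
    }
}

/// Toy execution: the call fails if it needs more gas than the effective limit.
pub fn run_isolated(
    gas_needed: u64,
    requested: u64,
    block_gas_limit: u64,
    disable_block_gas_limit: bool,
) -> TxOutcome {
    let limit = capped_gas_limit(requested, block_gas_limit, disable_block_gas_limit);
    if gas_needed > limit {
        TxOutcome::OutOfGas
    } else {
        TxOutcome::Success { gas_used: gas_needed }
    }
}
```

In this model a call that needs 40M gas against a 30M block gas limit fails while the cap is active and succeeds once it is disabled, which matches the kind of failure the reproduction runs into.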
Thanks for the simple reproduction. imo this is already a breaking change in
fwiw @frontier159 foundry is not stable and breaking changes can happen, albeit we try hard not to rip the bandaid if necessary. But we do understand the frustration 😄 I think adding a flag to disable the block gas limit here might be the move @klkvr. I'm also for the opposite (disable it by default, enable with flag) if we want to "remove" the breaking change.
Allg - I enjoy living my namesake and on the Frontier. Appreciate the work you kings are doing here |
* [wip] feat(forge): isolated execution
* small fixes
* don't panic on transaction error + fixture fix
* stricter call scheme check
* refactor and more fixes
* wip
* fix
* wip
* wip
* rm cheatcodes check
* clippy
* update commit logic
* opt-in
* enable in gas reports
* --isolate
* isolation tests
* smaller diff
* fmt
* simplify logic
* docs
* fmt
* enable isolation properly for --gas-report
* change nonce incrementing
* document why we touch
Motivation
aka authentic execution described in #6910
The idea is to execute all test/script-level calls as separate transactions initialized with empty journaled state, triggering all pre/post transaction actions and clean-ups, such as including calldata and the base 21k gas cost, clearing transient storage, correctly processing selfdestructs, etc.
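For a sense of the calldata and base cost mentioned above, here is a minimal sketch of the intrinsic gas of a plain call transaction. It assumes the post-Istanbul (EIP-2028) calldata prices of 4 gas per zero byte and 16 gas per non-zero byte, and it ignores access lists and contract-creation costs. The function and constant names are made up for this example.

```rust
/// Base cost charged for every transaction.
pub const TX_BASE_GAS: u64 = 21_000;
/// Calldata cost per zero byte (EIP-2028).
pub const ZERO_BYTE_GAS: u64 = 4;
/// Calldata cost per non-zero byte (EIP-2028).
pub const NONZERO_BYTE_GAS: u64 = 16;

/// Intrinsic gas of a plain call transaction: base cost plus calldata cost.
/// Access lists and contract creation are deliberately left out.
pub fn intrinsic_call_gas(calldata: &[u8]) -> u64 {
    let data_gas: u64 = calldata
        .iter()
        .map(|&b| if b == 0 { ZERO_BYTE_GAS } else { NONZERO_BYTE_GAS })
        .sum();
    TX_BASE_GAS + data_gas
}
```

A call that is not executed as its own transaction never pays this amount, which is one reason gas numbers differ between the two modes.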
Solution
All calls of depth 1 are caught in InspectorStack and executed as follows:
EVMImpl with TxEnv for the current call/create context.
When we are transacting the inner EVM, the call depth decreases from 1 to 0, and because of that, I've added logic to adjust journaled_state.depth which inspectors receive if we are in the inner EVM context.
revm will also call InspectorStack hooks a second time for the call that's being delegated (first time in the main context with depth 1, second time in the inner context with depth 0), but we want it to be processed only in the main context.
Open questions
I've been testing it on several codebases with a lot of various tests and it seems to work; however, there are some issues with this approach (and CI will probably show more):
With prank cheatcodes it's possible to craft a test-level contract interaction where tx.origin != msg.sender. It's not possible when the call is a top-level transaction. Some foundry and forge-std tests break because of that.
InspectorStack does not commit?
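The depth adjustment and the duplicate-hook handling described in the Solution section can be sketched roughly as follows. This is a simplified model under the assumption that the inner EVM runs exactly one level below the main one. `Context`, `reported_depth` and `should_run_hooks` are hypothetical names, not the PR's actual code.

```rust
/// Which EVM a hook is being invoked from (hypothetical model).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Context {
    /// The test-level EVM.
    Main,
    /// The EVM created to execute a depth-1 call as its own transaction.
    Inner,
}

/// Depth that inspectors should observe. Inside the inner EVM the delegated
/// call runs at raw depth 0 although it sat at depth 1 in the main context,
/// so the raw depth is shifted by one.
pub fn reported_depth(ctx: Context, raw_depth: usize) -> usize {
    match ctx {
        Context::Main => raw_depth,
        Context::Inner => raw_depth + 1,
    }
}

/// The delegated call is observed twice: at depth 1 in the main context and
/// at raw depth 0 in the inner context. Only the first one is processed.
pub fn should_run_hooks(ctx: Context, raw_depth: usize) -> bool {
    !(ctx == Context::Inner && raw_depth == 0)
}
```

With this model, nested calls made inside the inner EVM still reach the inspectors and report the same depth they would have had without isolation.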