Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VM] update VM test runner #849

Merged
merged 6 commits into from
Sep 17, 2020
Merged

[VM] update VM test runner #849

merged 6 commits into from
Sep 17, 2020

Conversation

jochem-brouwer
Copy link
Member

@jochem-brouwer jochem-brouwer commented Aug 26, 2020

Closes #842, #877, #744

The goal of this PR is to harden the test runner with a focus on the integrity of Blockchain tests. It add support for "transition tests" which verifies that the VM switches correctly from one fork to another. Overview of fixes/additions:

  • Tests: error was expected, but no error was raised: no "failing test" was raised
  • Not all blocks of tests were sometimes ran
  • Fix the filters in the test loader so there's not a lot of spam "no test available, skip test..."
  • Add flag --expected-test-amount=n to verify at least (>=) n assertions run in test (otherwise throw 1 failing test)
  • Add flag --verify-test-amount-alltests: assume all tests are being ran, test afterwards if enough (>=) assertions are run, otherwise throw a failing test
  • Document CLI arguments
  • Fix cases where blockchain gets locked in case a function which invokes semaphore throws before unlocking the lock, Fix blockchain hanging forever in case code throws between a semaphore lock/unlock #877
  • Throw test in case there is a reorg in the test (these tests are now temporarily disabled until we fix this, Expand support of Blockchain for reorgs #879
  • Found a single test which runs into a rather complex Block issue, Refactor block lib #878 (refactor block)
  • Rewrote the PR CI. MuirGlacier simply re-runs the Istanbul tests, so this (almost) is a waste of CI time. (Still runs in Nightly)
  • Nightly CI now runs all tests including the transition tests.
  • Add VM _updateOpcodes to reflect the opcodes of the current Common hardfork
  • Track which tests+files fail. If there were any failing tests, then this will be reported after the tests (so if your console cuts of the tests you can still see which failed!)

Summary of this PR in a picture:

image

@codecov
Copy link

codecov bot commented Aug 26, 2020

Codecov Report

Merging #849 into master will decrease coverage by 0.10%.
The diff coverage is 54.54%.

Impacted file tree graph

Flag Coverage Δ
#account 92.85% <ø> (ø)
#block 77.25% <55.55%> (-0.12%) ⬇️
#blockchain 80.45% <55.55%> (-0.66%) ⬇️
#common 92.85% <ø> (-0.19%) ⬇️
#ethash 83.33% <ø> (ø)
#tx 93.98% <ø> (-0.14%) ⬇️
#vm 82.19% <50.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Aug 26, 2020

Some personal notes here:

/LegacyTests/ has the following network params:

  Byzantium,
  Constantinople,
  ConstantinopleFix,
  EIP150,
  EIP158,
  Frontier,
  Homestead

Normal tests have;

  Istanbul,
  Byzantium,
  Constantinople,
  ConstantinopleFix,
  EIP150,
  EIP158,
  Frontier,
  Homestead,
  ByzantiumToConstantinopleFixAt5,
  EIP158ToByzantiumAt5,
  FrontierToHomesteadAt5,
  HomesteadToDaoAt5,
  HomesteadToEIP150At5

@jochem-brouwer
Copy link
Member Author

I noticed that ethereum/tests has blockchain transition tests, but our VM currently does not support this. I added functionality in runBlock which checks if we are at a HF block and if so, then update the Common and update the opcodes of VM. I also need to add this functionality in Tx/Block. This will be a rather complex change: if we import Blocks, for instance, then we can supply a Common which is currently set to (for instance) Petersburg while we are importing a Byzantium block. We should then thus (temporarily) set Common to Byzantium (to verify difficulty) and then revert it back. I'm not sure how to do this in a clean way. The same goes for importing Frontier txs if we are at a Homestead fork (Homestead disallows certain txs which were allowed in Frontier).

Another problem is that we currently default the chain to mainnet, but this implies that if an user sets the VM to a non-Frontier fork it suddenly force the VM to go back to Homestead at the Homestead block.

@jochem-brouwer jochem-brouwer mentioned this pull request Aug 27, 2020
@@ -28,7 +28,7 @@ module.exports = async function runBlockchainTest(options, testData, t) {

const blockchain = new Blockchain({
db: blockchainDB,
hardfork,
common: options.common,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small style nit: instead of setting { common: options.common } multiple times here and below you can do const { common } = options once then simply pass { common } every time like i did with hardfork. no big deal though :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks 😄

@jochem-brouwer
Copy link
Member Author

Updated the tests! Don't merge yet, because some tests are now failing! (These are tests which are expected to error, but did not - we had no tests in place!)

@jochem-brouwer
Copy link
Member Author

Since some consensus tests were updated in #853, will wait until that's merged and then see if tests are still failing.

// create a new VM (need access to new opcodes)
vm._updateOpcodes()
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified with Common.setHardforkByBlockNumber()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah there is some weird stuff going on over here, because if I re-instantiate a VM after I've updated Common (assuming that is what you mean here) then suddenly a lot more tests start failing. This used to be the way how most blockchain/state tests passed. 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth investigating why this happens though as it should definitely work if we re-instantiate the VM with a correctly-set Common 👍

@holgerd77
Copy link
Member

Just going through some issues, if it fits you can also integrate #744 here.

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Sep 16, 2020

Will post (and edit) the findings here of the failing tests. The changes for these are too big to fit in this PR.

file: bcBlockRLPAsList test: BLOCK_difficulty_GivenAsList_Frontier (fork: Frontier)

Problem: the RLP of the block header contains a difficulty which is a list of Buffers, and not a Buffer itself. header.ts should throw, but does not throw: since we are now checking for tests which should throw but do not, this test fails.

Related issues: ethereumjs/organization#56, #683

file: bcInvalidRLPTest test: BLOCK_difficulty_GivenAsList (fork: Frontier)

Same as above

file: blockChainFrontierWithLargerTDvsHomesteadBlockchain test: blockChainFrontierWithLargerTDvsHomesteadBlockchain_FrontierToHomesteadAt5

blockChainFrontierWithLargerTDvsHomesteadBlockchain2 test: blockChainFrontierWithLargerTDvsHomesteadBlockchain2_FrontierToHomesteadAt5 (both fork FrontierToHomesteadAt5)

Both these tests essentially run two chains: but due to the order of these blocks, the "first chain" should be the head of the blockchain. Since this is not yet implemented (I think this should be implemented in the client, not in blockchain itself) and it requires a rather big overhaul of blockchain I will add these to skipped tests as well.

@holgerd77 holgerd77 mentioned this pull request Sep 16, 2020
@jochem-brouwer jochem-brouwer force-pushed the rewrite-tests branch 3 times, most recently from fac2e04 to dd729d0 Compare September 16, 2020 08:56
@jochem-brouwer
Copy link
Member Author

Temporarily running the nightly tests here.

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Sep 16, 2020

Good news! Blockchain tests pass on all forks! (Except 1)

(data here previously was outdated)

To do:

  • Check if MuirGlacier actually runs on a VM which has been set to the MuirGlacier Common
  • Remove the "skip state..." messages from StateTests
  • Check why some blockchain tests are configured wrong, DAO and FrontierToHomesteadAt5
  • Add assertion checks for state tests aswell

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Sep 16, 2020

Right, I assumed that the tests would actually run the blocks in order, but this is not the case. It seems like we should support reorgs, which we currently do not support (EDIT: actually seems we support this 😓 ). I am not sure if I would think this is a feature of the client or of the blockchain, I first thought it should be a client feature but I'm starting to lean more to blockchain.

Situation is, for instance in DaoTransactions_HomesteadToDaoAt5:

Run block 1-4. Throw on block 5. The head of the chain is here block 4 with hash A.

Run block 1-4. All hashes of these blocks are now different than above (so it is a new chain). Throw on block 5. Head of chain is now block 4 of this chain with hash B.

Insert a new block 5 which now has parentHash set to hash A. (This is a valid block)

This behavior is not supported by blockchain currently: the reason is that the blockchains' iterator just gets the next block by increasing the block number. It can, if it is has multiple blocks to choose from, thus sometimes (like in the tests) choose the wrong block.

The problem is that if we add a block which does not have parentHash which is currently the head block, this block is not being ran, so the test is only ran partially.

Will disable these tests here, have added a re-org check in the tests which will throw if it encounters a re-org. We should either add support for re-orgs or permanently disable these tests.

@jochem-brouwer jochem-brouwer force-pushed the rewrite-tests branch 2 times, most recently from d7556ff to f0d8d8d Compare September 16, 2020 16:48
@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Sep 16, 2020

Data per new skip tests, blockchain, assertion amount:

--excludeDir=stTimeConsuming
Frontier 4385
Homestead 6997
TangerineWhistle 4255
SpuriousDragon 4305
Byzantium 15379
Constantinople 17189
Petersburg 17174
Istanbul 19817
Berlin 33
--dir=GeneralStateTests/stTimeConsuming
Frontier 0
Homestead 0
TangerineWhistle 0
SpuriousDragon 0
Byzantium 0
Constantinople 15561
Petersburg 15561
Istanbul 15561
Berlin 0

Note: MuirGlacier is the same as Istanbul so not included.

This data is now in tests/config.

@jochem-brouwer jochem-brouwer force-pushed the rewrite-tests branch 2 times, most recently from bba85a1 to 4013502 Compare September 16, 2020 18:43
[VM] do not return after an error

[VM] tests: fail test if we expect an error but VM did not throw
…ailing tests in config

[VM] add support for transition tests common

[VM] block now transitions to new HF on a transition block
[VM] remove auto-switch fork

[VM] add method to get directories for fork

[VM] tests: identify broken test, dump list of broken tests after run
temporarily test forks in parallel

fix get number

[VM] skip reorg tests

[VM] tests: add --expected-test-amount

fix yaml

[VM] add flag to verify against known test count

[VM] StateTests: apply filter in testLoader

[VM] add state tests assert count checks
Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thing to clarify, otherwise looks good, thanks Jochem, great PR! 😄

FrontierToHomesteadAt5: 12,
HomesteadToDaoAt5: 18,
HomesteadToEIP150At5: 3,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fantastic to have this even aggregated here in the config, makes things so much more transparent and sets reliable expectations on the test runs.

.catch((reason) => {
this._lock.release()
throw reason
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. 👍

@@ -194,6 +194,10 @@ export default class VM extends AsyncEventEmitter {
this._emit = promisify(this.emit.bind(this))
}

_updateOpcodes() {
this._opcodes = getOpcodesForHF(this._common)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this round, but I wonder if we can come to a more coherent concept on how to re-initialize the VM properly on a HF switch in Common? Optimally this should happen automatically [tm], one first-shot idea would otherwise to repurpose the newly introduced init() function a bit (without loosing the existing functionality) and pass along the info to the library user that this function should be called once a HF change occurs?

Another idea would be for Common to emit an event on an occurred HF change and the VM to listen to and adopt accordingly?


if (expectException) {
t.fail("expected exception but test did not throw an exception")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nice case and good catch.

skipFn = (name) => {
return ((forkFilter.test(name) === false) || skipTest(name, args.skipTests))
skipFn = (name, test) => {
return ((forkFilter.test(test.network) === false) || skipTest(name, args.skipTests))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that was one of the big fixes, right?

Phew. 😃

name: hf.name,
forkHash: hf.forkHash,
block: null
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I can not follow what this code actually does respectively why it is necessary to rebuild the whole hardfork stack in such a laborious manner when at the end the test is executed on the hardfork set with const mainnetCommon = new Common('mainnet', hfName) anyhow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I semi agree that this is a lot of code which does not do a lot. But if we use mainnetCommon = new Common('mainnet', hfName), this means that we are (for instance) running Istanbul on block 0 while Common actually says that we should run Frontier. I do not think that is right, this is to semi future-proof the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be hanging a bit on this block-number thing too much 🙂 , Common just has this information on the underlying block numbers but is not enforcing it in any ways. And Common is used in non-block-number-dependent contexts all the time, e.g. also the VM needs to keep this as one version of using it by e.g. people expecting a Byzantium VM and then feeding it with test blocks where the number plays no role at all (and it should still behave as Byzantium).

Anyhow, I have the impression we generally need to give this some more reflection, where and how a HF is reset and who is taking responsibility here, especially when moving over to client development.

For now we might want to keep this code, eventually we can remove later depending on the design decision we take along these questions on the VM.


if (currentBlock < lastBlock) {
// "re-org": rollback the blockchain to currentBlock (i.e. delete that block number in the blockchain plus the children)
t.fail("re-orgs are not supported by the test suite")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not completely getting this re-org scenario: is this actually triggered somewhere in the tests? Then some of the tests should fail according to the code here I would assume? Or is this a "just in case" implementation? 😛

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that if you disable this, most tests pass (see the skipped tests which have a few tests which are added), but not all of these blocks are evaluated (these are the re-org blocks). So the test is only executed partially. There is only one test which actually fails due to this. I just don't like that tests are only partially executed. One of the features of these tests seems to be that it does this reorg stuff, but we are not testing this, so I do not think we should just allow the tests and make them pass while we know that they are only partially executed.

It is rather cumbersome that these tests setup the canonical chain, then do some reorg, and then basically "reorg back" to the initial canonical chain, which means that the tip block of the chain in our implementation is also the tip block which is expected by the test suite, which is also the reason why these tests pass. The blocks in the reorg are not evaluated. If not clear let me know 😄 Related issue which tracks this problem: #879

Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@holgerd77 holgerd77 merged commit 601026b into master Sep 17, 2020
@holgerd77 holgerd77 deleted the rewrite-tests branch September 17, 2020 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rewrite the test runner
3 participants