dapptools-style coverage #99
Comments
sadly theirs is pretty bad tbh, it doesn't have any branching info for == statements, etc |
can you recommend an algorithm, and we can start scoping it out along with how we'd utilize sourcemaps? |
only good one i know is solidity-coverage's but it works by instrumenting the solidity whereas we'd probably want to do it at the VM level? |
Yes, I didn't love the injected events approach...what is the hevm approach missing? is it hard? |
More notes:
|
Notes on how we can achieve coverage:
Points where I'm stuck:
|
One thing to note is when improving fuzzing as part of #387, we may want to hook into the coverage results to enable coverage-guided fuzzing, i.e. the fuzzer would know what parts of code have or have not been covered and use that to guide input mutations. I'm not familiar enough with the internals to know if the above plan enables that but just wanted to mention it in case it affects coverage implementation |
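To make the coverage-guided fuzzing idea above concrete, here is a minimal sketch, not Forge's fuzzer. Everything in it (`target`, `mutate`, `fuzz`, the branch names) is hypothetical; the only point is the feedback loop: an input is kept in the corpus only if it reaches coverage that nothing before it reached.

```python
import random


def target(x: int) -> set:
    """Toy program under test: returns the IDs of the branches it executed."""
    hit = {"entry"}
    if x % 2 == 0:
        hit.add("even")
        if x > 1000:
            hit.add("even_large")
    else:
        hit.add("odd")
    return hit


def mutate(x: int, rng: random.Random) -> int:
    """Apply one small random mutation to an input."""
    choice = rng.randrange(3)
    if choice == 0:
        return x + rng.randint(-3, 3)
    if choice == 1:
        return x * 2
    return x ^ (1 << rng.randrange(12))  # flip one of the low 12 bits


def fuzz(seed_input: int, rounds: int, seed: int = 0):
    """Mutate corpus entries and keep only candidates that add coverage."""
    rng = random.Random(seed)
    corpus = [seed_input]
    covered = set(target(seed_input))
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        new_hits = target(candidate) - covered
        if new_hits:
            corpus.append(candidate)
            covered |= new_hits
    return corpus, covered


corpus, covered = fuzz(seed_input=1, rounds=500)
print(sorted(covered), corpus)
```

For this to work in practice, the coverage collector would need to expose per-run hit data to the fuzzer rather than only an aggregated report at the end.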
Just want to add a strong vote for this feature. For context, our monorepo relies heavily on solidity coverage, and CI checks will fail when a change would reduce coverage below a certain threshold. |
@maurelian I'm planning to post an initial UX proposal for coverage later today (though deferring to @gakonst on actual implementation timelines), which will be based on my experience with dapptools + Istanbul-style coverage reports, since those are the main coverage tools I've used. If you have any UX suggestions, or tools whose reports/workflows you like and can link to, that would be great! |
ACK - coverage is next up on our prio list. |
UX Proposal
This is largely based on how Istanbul's JS coverage reports work, since that's what I've used the most and I haven't had any big issues with its approach or formatting (whereas the dapptools format is clunky and not great IMO). Istanbul is what Hardhat's coverage tool uses to generate output data and reports. As a result, this initial proposal pretty much just describes how Istanbul works, but I'm happy to iterate and change things based on feedback.
Standard Syntax
Coverage should not be the default since the added instrumentation will likely slow down tests significantly, so instead we should run coverage with:
It would be great if this also supported watch mode and the various match flags, so you can do
CI Syntax
I'd be curious to hear more from @maurelian about how this should work, but a CI mode would be great so you can configure your CI to fail if aggregate coverage drops below some threshold. One possible syntax is instead of running the above command you'd run:
This will cause the process to exit with an error if coverage is < 95%.
Terminal Output
At the bottom of the terminal output, append a table summary of coverage to the standard test output. Something like the output below, where >= 80% coverage is colored green, 65% <= x < 80% coverage is yellow, and < 65% coverage is red (I believe these are the thresholds used by Istanbul).
$ forge test --coverage
[⠆] Compiling...
No files changed, compilation skipped
Running 1 test for src/test/MyContract.t.sol:MyContractConstructor
[PASS] testConstructor() (gas: 31072)
Test result: ok. 1 passed; 0 failed; finished in 10.32ms
-------------------|-----------|-----------|-----------|-----------|
File               |   % Stmts |% Branches |   % Funcs |   % Lines |
-------------------|-----------|-----------|-----------|-----------|
src/               |       100 |       100 |       100 |       100 |
 MyContract.sol    |       100 |       100 |       100 |       100 |
-------------------|-----------|-----------|-----------|-----------|
All files          |       100 |       100 |       100 |       100 |
-------------------|-----------|-----------|-----------|-----------|
=============================== Coverage summary ===============================
Statements : 100% ( 350/350 )
Branches : 100% ( 176/176 )
Functions : 100% ( 81/81 )
Lines : 100% ( 317/317 )
================================================================================
File Output
In addition to the terminal output, output files containing more details should be saved to a
The syntax used by Istanbul in these reports can be found here. To summarize:
Screenshots from random reports I found online are below: |
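The summary and CI threshold described in the proposal above can be sketched roughly as follows. This is an illustration under assumptions, not Forge's behavior: `Counts`, `band`, and `exit_code` are hypothetical names, the color bands are the ones quoted in the proposal, "nothing to cover" is treated as 100%, and the threshold is applied to every metric rather than to a single aggregate.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    hit: int
    total: int

    def percent(self) -> float:
        # Assumption: an empty category counts as fully covered.
        return 100.0 if self.total == 0 else 100.0 * self.hit / self.total


def band(percent: float) -> str:
    """Color band using the thresholds quoted in the proposal."""
    if percent >= 80:
        return "green"
    if percent >= 65:
        return "yellow"
    return "red"


def exit_code(summary: dict, threshold: float) -> int:
    """Return 1 if any metric falls below the threshold, otherwise 0."""
    return 1 if any(c.percent() < threshold for c in summary.values()) else 0


summary = {
    "Statements": Counts(350, 350),
    "Branches": Counts(176, 176),
    "Functions": Counts(81, 81),
    "Lines": Counts(317, 317),
}
for name, counts in summary.items():
    pct = counts.percent()
    print(f"{name:<11}: {pct:.0f}% ( {counts.hit}/{counts.total} ) [{band(pct)}]")
```

A CI job would then pass the result of `exit_code(summary, 95)` to `sys.exit`, so the process fails when coverage is below 95%.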
Coverage should generate LCOV files at the very least so it is uploadable to tools like CodeCov. The LCOV file can also be used to generate the HTML reports and so on, so really, we "only" need the LCOV file and the rest can be built on top. For coverage thresholds, I think people mostly defer to other tools that also take LCOV? |
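For reference, an LCOV tracefile is a plain-text list of per-file records. The sketch below writes one record with line data (`DA`, `LF`, `LH`) and optional branch data (`BRDA`, `BRF`, `BRH`). The function name `to_lcov` and the shapes of its inputs are assumptions made for illustration; function records (`FN`, `FNDA`) are left out to keep it short.

```python
def to_lcov(source_file: str, line_hits: dict, branch_hits: dict = None) -> str:
    """Render one LCOV record.

    line_hits:   {line_number: execution_count}
    branch_hits: {(line_number, block_id, branch_id): taken_count}
    """
    branch_hits = branch_hits or {}
    out = ["TN:", f"SF:{source_file}"]

    # Branch data: BRDA:<line>,<block>,<branch>,<taken>
    for (line, block, branch), taken in sorted(branch_hits.items()):
        out.append(f"BRDA:{line},{block},{branch},{taken}")
    if branch_hits:
        out.append(f"BRF:{len(branch_hits)}")  # branches found
        out.append(f"BRH:{sum(1 for t in branch_hits.values() if t > 0)}")  # branches hit

    # Line data: DA:<line>,<execution count>
    for line, count in sorted(line_hits.items()):
        out.append(f"DA:{line},{count}")
    out.append(f"LF:{len(line_hits)}")  # lines found
    out.append(f"LH:{sum(1 for c in line_hits.values() if c > 0)}")  # lines hit

    out.append("end_of_record")
    return "\n".join(out) + "\n"


print(to_lcov("src/MyContract.sol", {3: 1, 4: 0}, {(3, 0, 0): 1, (3, 0, 1): 0}))
```

Since HTML generators and hosted services consume this format, a report writer along these lines is the only output that has to exist in a first version.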
Ah perfect that makes sense, thanks.
Seems this is what Optimism does. Though is there a reason we wouldn't want to integrate at least a simple coverage threshold natively so it can be used to trigger a CI failure? Seems nice to not need a third-party tool, and it'd facilitate having a better
One thing I forgot to mention is how fuzz/invariant tests should affect coverage reports. I think there's a few ways we can handle this:
|
Yeah, I can't find it now that we've migrated from GHA to Circle CI, but we previously depended on tooling from CodeCov to cause the CI failure.
When I think of coverage I think of unit tests, and I'd want to be able to use the report to direct me to where further tests are needed. So, my default is to not mix the two kinds of coverage up, at most have a separate report for each. |
Hmm, interesting! I find sometimes I only write fuzz tests for certain portions of code, since having a separate concrete test for the same code is redundant. But I see your use case, so perhaps an (Side note: Above I suggested |
Hello, I'm very interested in helping out with this feature! I posted some discussion here #1348 before I realised this issue existed. Initial thoughts after catching up:
And finally, the gold standard here would be to integrate the coverage report with the Solidity plugins for editors, so that you can see the coverage results directly in your editor every time you save. This massively plays to
The Go plugin does this and it makes the feedback loop for tests incredibly fast.
I am completely new to the codebase but keen to help out. I think the best way to parallelise would be to try and understand any dependencies or refactors needed and possibly try to create some distinct issues 🙏 |
I think the current thinking is that we would not use the AST. Instead, we would check jump instructions at the VM level and map those to the source code using source maps, but the approach hasn't been validated, so that might change.
Do you have any additional info on how this works? I'd assume it uses LCOV - if that's the case, then integration should be simple. We probably just need a way to output the raw LCOV to stdout |
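As background for the source-map idea above: solc emits a compressed source map with one `s:l:f:j` entry per instruction (start offset, length, file index, jump type), separated by `;`, where an empty or missing field repeats the value from the previous entry. The sketch below expands such a map and builds a program-counter-to-instruction-index table, which is needed because `PUSH` opcodes carry immediate bytes. The function names are hypothetical, the optional fifth field (modifier depth) is ignored, and metadata appended to the bytecode is not handled.

```python
def decode_source_map(source_map: str) -> list:
    """Expand a compressed solc source map into (start, length, file, jump) tuples."""
    entries = []
    prev = ["0", "0", "0", "-"]
    for raw in source_map.split(";"):
        cur = list(prev)
        # Empty or missing fields inherit the previous entry's value.
        for i, value in enumerate(raw.split(":")[:4]):
            if value != "":
                cur[i] = value
        entries.append((int(cur[0]), int(cur[1]), int(cur[2]), cur[3]))
        prev = cur
    return entries


def pc_to_instruction_index(bytecode: bytes) -> dict:
    """Map each instruction's program counter to its index in the source map."""
    mapping = {}
    pc = 0
    index = 0
    while pc < len(bytecode):
        mapping[pc] = index
        op = bytecode[pc]
        # PUSH1..PUSH32 (0x60..0x7f) are followed by 1..32 bytes of immediate data.
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
        index += 1
    return mapping


# PUSH1 0x01, PUSH1 0x02, JUMPI
code = bytes.fromhex("6001600257")
print(pc_to_instruction_index(code))
print(decode_source_map("0:10:0:-;5:3;;12"))
```

With these two tables, a hit on a given program counter (for example, the `JUMPI` at the end of `code`) can be translated to a byte range in a source file.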
Ah right, thanks. I had a question on this; if the compiler makes optimisations like removing branches, can we still map the bytecode back to source lines? Users clearly operate on the source-line level, so seeing missing coverage on a line because of something the compiler does would be confusing. I've seen that some languages seem to implement coverage with LLVM so I assume this is a solved problem, but not sure how it works.
Go has its own native coverage format (which is an extremely simple text file). The editor runs the tests and the file is output to a tmp directory, which is then read, parsed and rendered in the editor. I'd be happy to try and take this part on because I might be more use there, as I'm very unfamiliar with the vm code and rust. |
Probably not. The source maps in Solidity are really flaky when you turn the optimizer on, so we would probably need the optimizer to be off. I'm not sure, though, I'll figure that out as I implement the feature
Ah, I misunderstood as well - you don't want Forge to necessarily support Go's coverage format, just a coverage format that can be used in editors? If so, I think most editors should be able to support LCOV (used by C and others), or at least something proximal to it |
Running it with the optimizer off makes sense 👍 For the coverage; yes LCOV format would be good. The Go description was just an example of how the output might be used in editors. As long as |
|
I also think There are also some clashes between |
Issue Status: 1. Open 2. Started 3. Submitted 4. Done This issue now has a funding of 20000.0 DAI (20000.0 USD @ $1.0/DAI) attached to it as part of the synapsecns fund.
|
Trying to summarize the TODOs
Has anyone already started working on this? |
Yes, I am actively working on it. I will post a status update tomorrow (I am not home). The todo list is no longer valid since that was based on Sputnik, and we've since moved to REVM :) |
Issue Status: 1. Open 2. Started 3. Submitted 4. Done Work has been started. These users each claimed they can complete the work by 264 years, 6 months from now. 1) bastek8 has applied to start work (Funders only: approve worker | reject worker). The program and everything related to it. I was already working on this feature before the bounty was added. I work plan and whoever don’t want to come back to work they can find a new job. Learn more on the Gitcoin Issue Details page. |
So, as promised, here is a status update. I was already working on this issue prior to the bounty (which is why I am assigned). Initially, the thought was to analyse
Background
The primary mode of operation for the coverage tool will be to collect the following pieces of information:
This information will be collected from the AST of the Solidity files. Then the tests are run with a special coverage collector that marks opcodes as hit as they are executed, which we later use (along with source maps) to map back to the coverage items.
Report types
The above information will be output either to stdout as a simple report, or as an LCOV file that you can upload to services like CodeCov. A stretch goal would be to add an HTML report as well, but since tools already exist for converting LCOV to HTML, it isn't a deal breaker if it is not in the first version.
Compared to other coverage tools
Current status
I went with the jump analysis first, but as I found it insufficient, I had to scrap most of my work and start over with the method described above. The coverage collector is done, but I am currently having some issues constructing the higher-level coverage data that the reports will use (see below).
Current issues
|
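The approach in the status update above (coverage items taken from the AST, opcodes marked as hit during the run, source maps used to join the two) can be illustrated with a rough sketch. This is not the implementation in the PR: `CoverageItem`, `mark_hits`, and the item kinds are hypothetical, and the containment rule used here (an instruction counts toward an item when its source range lies inside the item's range) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CoverageItem:
    kind: str        # hypothetical kinds: "line", "statement", "branch", "function"
    start: int       # byte offset of the item in the source file
    length: int      # length of the item in bytes
    hits: int = 0


def mark_hits(items: list, instruction_ranges: list, hit_counts: dict) -> list:
    """Attribute opcode hits to coverage items.

    instruction_ranges: (start, length) per instruction index, from the source map
    hit_counts:         {instruction_index: times_executed}, from the collector
    """
    for index, count in hit_counts.items():
        start, length = instruction_ranges[index]
        for item in items:
            inside = item.start <= start and start + length <= item.start + item.length
            if inside:
                # Several opcodes map to one item, so take the max instead of summing.
                item.hits = max(item.hits, count)
    return items


items = [CoverageItem("statement", 0, 10), CoverageItem("statement", 20, 5)]
ranges = [(0, 4), (4, 6), (20, 5)]
mark_hits(items, ranges, {0: 2, 1: 2})
print([(item.kind, item.start, item.hits) for item in items])
```

Items whose `hits` stay at zero are the uncovered ones that a report would flag.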
Sorry guys, this is taking longer than expected. I've opened up a draft PR with a status update, and I intend to update that PR as I go with new status updates if relevant. #1576 |
Reopening this - still some edge cases and missing stuff. See this label:
Cmd-forge-coverage
|
May I ask what the status is with coverage generation of fuzz tests? It looks to me as though they are not yet included and I don't yet see an |
Close-able? @mds1 |
Let's leave this open for now, eventually all coverage stuff will be aggregated into #4442 and I want to make sure things in here don't get missed |
Marking as complete as the feature has been implemented (minus some small bugs and follow-up features). Closing in favor of #4442 |
Implemented at: https://github.com/dapphub/dapptools/blob/728a9245fa5f78589b0cedec0ade2da5433ad792/src/hevm/src/EVM/UnitTest.hs#L251-L382
Example: https://twitter.com/dapptools/status/1435973810545729536