
ci: Out of the tarpit #2923

Merged 15 commits from marco/out-of-the-tarpit into master on Aug 20, 2024

Conversation

@MarcoPolo (Collaborator) commented Aug 15, 2024

This PR gets us out of the tarpit of flaky tests. It reruns failed tests to see if they are flaky. If they consistently fail, the job fails. If they are flaky, they are recorded as such, but the job succeeds. The job outputs a sqlite database of all test runs; for each test, the table records whether it passed or failed and how many times.
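As a rough illustration of how those per-test counts could be pulled back out of the uploaded database, here is a small Go sketch; the file name, table name, and columns are assumptions for this example, not the PR's actual schema.

```go
// Sketch: read per-test pass/fail counts out of the uploaded sqlite artifact.
// The schema (test_results with package/test/outcome/attempt columns) is an
// assumption for illustration only.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // assumed sqlite driver
)

func main() {
	db, err := sql.Open("sqlite3", "test_results.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Count how many times each test passed and failed across all attempts.
	rows, err := db.Query(`
		SELECT test,
		       SUM(outcome = 'pass') AS passes,
		       SUM(outcome = 'fail') AS failures
		FROM test_results
		GROUP BY test
		ORDER BY failures DESC`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var test string
		var passes, failures int
		if err := rows.Scan(&test, &passes, &failures); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s: %d passed, %d failed\n", test, passes, failures)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```

Something along these lines is presumably what the summary and dashboard aggregate from each run.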

The summary now includes which specific tests failed, so you don't have to hunt through the logs. It will also link you to an Observable dashboard that lets you dive deeper (example).

I think it's worth experimenting with this, and we can always revert it later.

Some valid concerns:

Q: Why do this?
A: With regard to CI, making changes has felt like walking through a tarpit. And a bad CI desensitizes you to actual failures; CI failures should be very high signal and very low noise.

Q: Won't this hide flaky tests?
A: No more than we already do by rerunning the tests. I'd even argue it hides them less, because all the flakiness is presented on a single page rather than spread out across runs.

Q: What happens when a test is so flaky it fails every time?
A: Sounds like an easy test to fix. Most, if not all, of our flakes happen less than 20% of the time.
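To make that concrete (the rerun count here is an assumption for illustration, not a number taken from this PR): a test that flakes 20% of the time and gets two reruns only blocks the job if it fails all three attempts, which happens roughly 0.2 × 0.2 × 0.2 = 0.8% of the time.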

Q: Does this replace the Flaky Tests dashboard?
A: Not yet, although I have a FlakyTestsV2 dashboard that makes better use of the sqlite format. This approach is much easier to iterate on and much faster than the original Flaky Tests Dashboard.

Q: Why are you "forking" go-test-template.yml?
A: There's no other way to make the changes I want. I think go-libp2p can graduate to owning its own go-test workflow, and I prefer the strategy of a simple workflow calling composite actions over a reusable workflow shared across many projects. For now, the differences between the in-tree go-test-template.yml and the upstream one are minimal. In the future I may remove even more things from it.

This PR isn't intended to be the final state. I'd like to clean up the summaries and dashboard a lot more. But I do think this current branch is a lot better than what we have.


How it works:

It introduces a go test helper script that:

  1. Runs all the tests, i.e. go test ./...
  2. Saves the test results in a sqlite file.
  3. If there are any failures, it reruns the failed tests n more times (a sketch of this flow follows the list).
  4. Uploads the sqlite db as an artifact of the job.
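Here is a minimal Go sketch of that flow, under a couple of assumptions: a hypothetical test_results table (matching the read-side example above), the mattn/go-sqlite3 driver, and a retry count of 2. It is not the actual scripts/test_analysis implementation, just an outline of running go test -json, recording every outcome, and rerunning failures.

```go
// Sketch of the rerun-and-record flow; table name, columns, and retry count
// are assumptions for illustration.
package main

import (
	"bufio"
	"bytes"
	"database/sql"
	"encoding/json"
	"fmt"
	"log"
	"os/exec"

	_ "github.com/mattn/go-sqlite3" // assumed sqlite driver
)

// testEvent mirrors the JSON objects emitted by `go test -json`.
type testEvent struct {
	Action  string `json:"Action"`
	Package string `json:"Package"`
	Test    string `json:"Test"`
}

// runTests runs `go test -json` with the given args, records every terminal
// pass/fail outcome in the db, and returns the tests that failed.
func runTests(db *sql.DB, attempt int, args ...string) ([]testEvent, error) {
	// A non-zero exit here usually just means some test failed, so the error is ignored.
	out, _ := exec.Command("go", append([]string{"test", "-json"}, args...)...).Output()
	var failures []testEvent
	sc := bufio.NewScanner(bytes.NewReader(out))
	for sc.Scan() {
		var ev testEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil || ev.Test == "" {
			continue // skip non-JSON lines and package-level events
		}
		if ev.Action != "pass" && ev.Action != "fail" {
			continue
		}
		if _, err := db.Exec(`INSERT INTO test_results(package, test, outcome, attempt) VALUES (?, ?, ?, ?)`,
			ev.Package, ev.Test, ev.Action, attempt); err != nil {
			return nil, err
		}
		if ev.Action == "fail" {
			failures = append(failures, ev)
		}
	}
	return failures, sc.Err()
}

func main() {
	db, err := sql.Open("sqlite3", "test_results.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS test_results(package TEXT, test TEXT, outcome TEXT, attempt INT)`); err != nil {
		log.Fatal(err)
	}

	// 1. Run everything once and record the results.
	failed, err := runTests(db, 0, "./...")
	if err != nil {
		log.Fatal(err)
	}

	// 2. Rerun only the failed tests a few more times (retry count is an assumption).
	const retries = 2
	for attempt := 1; attempt <= retries && len(failed) > 0; attempt++ {
		var stillFailing []testEvent
		for _, f := range failed {
			again, err := runTests(db, attempt, "-run", fmt.Sprintf("^%s$", f.Test), f.Package)
			if err != nil {
				log.Fatal(err)
			}
			stillFailing = append(stillFailing, again...)
		}
		failed = stillFailing
	}

	// 3. Tests that failed on every attempt fail the job; flakes that eventually passed do not.
	if len(failed) > 0 {
		log.Fatalf("%d test(s) failed consistently", len(failed))
	}
}
```

In CI, the resulting test_results.db would then be uploaded as a job artifact (step 4 above).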

There's also an Observable dashboard that:

  1. Parses all the sqlite db artifacts.
  2. Shows which tests failed along with the relevant logs.
  3. Derives a "flakiness score", which answers: for the given state of flakiness, how likely is a job to fail because of these flakes? (n.b. it's a lot like rolling dice with disadvantage; one way to compute it is sketched below.)
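The dashboard's exact formula isn't shown in this thread, so the following Go sketch is only one way to read the "dice with disadvantage" analogy: a flaky test blocks the job only if it fails the initial run and every rerun, and the job fails if at least one flaky test does so.

```go
package main

import (
	"fmt"
	"math"
)

// jobFailureProbability estimates how likely a job is to fail purely because
// of flakes: each test must fail the initial run and every rerun to block the
// job, and the job fails if at least one test does so. failureRates are the
// observed per-test failure rates; reruns is the number of extra attempts a
// failed test gets. This is an assumed interpretation, not the dashboard's
// actual formula.
func jobFailureProbability(failureRates []float64, reruns int) float64 {
	jobPasses := 1.0
	for _, p := range failureRates {
		failsEveryAttempt := math.Pow(p, float64(reruns+1))
		jobPasses *= 1 - failsEveryAttempt
	}
	return 1 - jobPasses
}

func main() {
	// Three hypothetical tests flaking 20%, 10%, and 5% of the time, with two
	// reruns each: the job still fails roughly 0.9% of the time.
	fmt.Printf("%.4f\n", jobFailureProbability([]float64{0.2, 0.1, 0.05}, 2))
}
```

Under that reading, rerunning is exactly what keeps moderately flaky tests (the sub-20% kind mentioned above) from failing jobs, while a test that fails every time still fails the job immediately.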

@MarcoPolo requested a review from sukunrt on August 15, 2024 21:22
@sukunrt (Member) left a comment

Can't wait to start using this ❤️

scripts/test_analysis/main.go (review thread, resolved)
scripts/test_analysis/cmd/gotest2sql/main.go (review thread, resolved)
scripts/test_analysis/main.go (review thread, resolved)
@MarcoPolo merged commit 6c12e22 into master on Aug 20, 2024
13 checks passed
@MarcoPolo deleted the marco/out-of-the-tarpit branch on August 20, 2024 18:13
sukunrt pushed a commit that referenced this pull request Aug 30, 2024
* Lint fixes

* Use latest go version for go-check

Fixes nil pointer issue in staticcheck

* Add test_analysis helper script

* Use custom go-test-template

* Add some tests to the test_analysis script

* Always upload test_results db

* Attempt to fix test on windows

* Better if statement

* Try to fix flaky test

* Disable caching setup-go on Windows

* Better if statement

* Tweak

* Always upload summary and artifact

* Close db

* No extra newline
MarcoPolo added a commit that referenced this pull request Sep 4, 2024