Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC to disable Chromium stability checks #131

Merged
merged 3 commits into from
Jun 6, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 66 additions & 0 deletions rfcs/disable_chrome_stability.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# RFC 131: Disable wpt-chrome-dev-stability runs

## Summary

The wpt-chrome-dev-stability job on WPT fails quite frequently and
it's impacting developer productivity. The job should be changed to
non-blocking, while keeping a record of potential flakiness
introductions that could be useful in further investigations.

## Details

Stability jobs are intended to help us ensure that new and modified
tests aren't flaky. In particular it's designed to ensure that the
test author is aware when they have written a test that is flaky in a
different browser. This is because authors writing a test in a
specific browser engine might be unaware that they are depending on
details of that implementation that make the test unreliable in other
engines. Leaving this to other developers to fix is expensive, because
they have to spend time becoming familiar with details of the test
which are already understood by the original author. In addition, when
there are many unstable web-platfrom-tests, it undermines confidence
in the overall suite and, where such instability is automatically
handled in CI systems (e.g. by marking a test as [PASS, FAIL] in
metadata), reduces the ability of the tests to catch regressions.

Currently, stability checks for the corresponding browser are skipped
when [merging export PRs from their own
repositories](https://github.com/web-platform-tests/wpt/issues/29737)
(i.e. Chromium exports don't run Chrome stability checks, and
likewise for Firefox). This is because we assume that any real
stability issue is likely to have been handled in the source
repository. It's common for such changesets to include code changes
to the browser itself that fix intermittents, and in the source
repository the tests will run together with the corresponding browser
changes. However on GitHub we are using the latest development
release of the browser, which is unlikely to contain code fixes at
the time of export. This makes the stability checks on these PRs
highly prone to misleading failures.

The main problem with the stability checks as currently implemented is
that although the test author is in the best place to understand the
behaviour of the test, they may not understand how to reproduce the
failure in another browser, or may be unable to fix the problem
(e.g. if it's a browser bug rather than a test bug).

Faced with these tradeoffs, the Chromium developers believe the
project is better served by allowing tests which are unstable in
Chrome to land in web-platform-tests and using the tooling they have
available in the Chromium CI system to investigate the
flakiness. Therefore the job will be marked as non-blocking, so we are
able to see where it fails and assess whether the correct tradeoff has
been made.

Marking a job as non-blocking will require changes in the taskcluster
configuration so that the job does not block the sink task.

Disabling the job entirely would also reduce the CI load. This can be
considered once the impact of the change is well understood.

## Risks

New intermittent failures specific to Chromium could be introduced
upstream and only noticed when a WPT import to Chromium CI
happens. Chromium developers believe the tradeoff is worthwhile, as
the Chromium project has robust tooling to investigate and handle such
cases.