Send notifications when marking tests as intermittent during a landing #1636

jgraham · 2022-12-05T19:56:45Z

There is a concern that web-platform-tests might unexpectedly be marked as intermittent, hiding a regression in the test/code.

Therefore we should allow filing notifications during landings, particularly for the case where a test goes from a stable to an intermittent status.

This will be a little different from the notifications that we do for PRs, because we won't necessarily have try push data to work with. It should be possible to get mozilla-central results from before the push, although this is complicated by the fact that by the time we do the stability analysis we're already based on autoland and can't assume a full set of try results exists. In that case we might need to actually analyze the metadata changes to see what happened, rather than just using the wptreport.json files.
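As a rough sketch of the metadata-comparison idea, the check below flags tests that had a single expected status before the landing and several afterwards. It assumes the expected statuses have already been extracted from the metadata into plain dictionaries keyed by test id; the function name and the data shape are hypothetical, not part of the sync bot.

```python
from typing import Dict, List

# Hypothetical shape: test id -> list of expected statuses.
# A single entry means stable; several entries mean intermittent.
Statuses = Dict[str, List[str]]


def newly_intermittent(before: Statuses, after: Statuses) -> Statuses:
    """Return tests that were stable before the landing and intermittent after it."""
    changed: Statuses = {}
    for test_id, new_statuses in after.items():
        old_statuses = before.get(test_id)
        if old_statuses is None:
            # Assumption: tests with no earlier metadata are new tests,
            # not stable-to-intermittent transitions, so skip them.
            continue
        if len(set(old_statuses)) == 1 and len(set(new_statuses)) > 1:
            changed[test_id] = new_statuses
    return changed
```

The returned mapping would be the candidate set for a notification; how that notification is filed is left open here.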

jesup · 2022-12-05T20:07:16Z

Probably a separate issue is that we need some sort of workflow that helps identify formerly-intermittent tests that are now perma-pass or perma-fail. This doesn't need to be noticed immediately, especially if we can backfill to figure out where a change probably occurred. (For example, look back through m-c wpt results for the last fail, and then iterate forward from there to find a likely failure point.) This also doesn't have to be manual, but the idea is to avoid tests falling into a black hole where all the results are ignored forevermore.
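The backfill idea could look something like the following, assuming a chronological list of per-build statuses for one test is available from some datastore. The function name and the input shape are hypothetical; deciding how long a run must be before it counts as "perma" is left to the caller.

```python
from typing import List, Optional


def start_of_stable_run(history: List[str]) -> Optional[int]:
    """Return the index of the first build in the trailing run of identical statuses.

    `history` is ordered oldest to newest. The build just before the returned
    index is the last one with a different status, so the change probably
    happened between those two builds.
    """
    if not history:
        return None
    final_status = history[-1]
    index = len(history) - 1
    # Walk backwards while the status matches the most recent one.
    while index > 0 and history[index - 1] == final_status:
        index -= 1
    return index
```

For a history of `["PASS", "FAIL", "PASS", "PASS", "PASS"]` this returns index 2, meaning the test has passed consistently since the third build in the window.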

jgraham · 2022-12-05T23:02:58Z

I agree that's important, but it's probably out of scope for the bot; it doesn't seem very reasonable to do this just using logfiles that we download, without any external datastore that holds the result of each test in each build. (Or maybe one could hack something really specific together where each central build has an artifact that just lists known-intermittents and their actual status, and then you just have a bot that downloads like a month's worth of those or something. Maybe that's few enough tests that it can actually work, but it's still not a small problem.)

Ideally this is something that could be worked on as a general piece of test infrastructure, instead of something wpt-specific.

(I also think further discussion along these lines should happen elsewhere to avoid confusion in this issue.)

whimboo · 2023-11-23T13:39:52Z

I have just one more thing to add here.

When a test fails in 1 out of 4 jobs on a stability try push, it is automatically marked as intermittent. At that point it is unknown how frequently the failure happens, and in a couple of cases I was able to remove those multiple statuses after the downstream sync because they did not really apply.

It would be nice if, in case of a failure, we did not immediately mark the test as intermittent but instead automatically triggered some more retries of the job on the same stability push. Given that the tests do not run that long, this would probably not add much overhead time-wise, and it would mark fewer tests as intermittently failing. This would be helpful given the number of statuses in mozilla-central. Note that the current behavior also makes it easier to miss regressions, because we accept a fail or timeout state inappropriately.
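A minimal sketch of that retry policy, assuming a `rerun` callable that triggers one more run of the job and returns the test's status. The function name, the number of extra runs, and the threshold are all made-up values for illustration, not existing bot behavior.

```python
from typing import Callable, List


def classify_with_retries(
    initial: List[str],
    rerun: Callable[[], str],
    expected: str = "PASS",
    extra_runs: int = 4,      # assumption: how many retries to trigger
    min_unexpected: int = 2,  # assumption: unexpected results needed to mark intermittent
) -> str:
    """Classify a test as 'stable', 'intermittent' or 'consistent-change'."""
    results = list(initial)
    if all(status == expected for status in results):
        # Nothing unexpected was seen, so no retries are needed.
        return "stable"
    # At least one unexpected result: gather more data before deciding.
    results.extend(rerun() for _ in range(extra_runs))
    unexpected = sum(1 for status in results if status != expected)
    if unexpected == len(results):
        # Every run disagreed with the expectation: likely a real change,
        # not an intermittent.
        return "consistent-change"
    if unexpected >= min_unexpected:
        return "intermittent"
    # A single unexpected result that never reproduced is treated as noise.
    return "stable"
```

With these defaults, a single failure in 4 jobs that does not reproduce in the retries leaves the test stable, while a failure that recurs is still marked intermittent.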

As of now it requires a lot of attention from individual triage owners to detect those changes and then to investigate them.
