Send notifications when marking tests as intermittent during a landing #1636

Open
jgraham opened this issue Dec 5, 2022 · 3 comments

Comments

@jgraham
Member

jgraham commented Dec 5, 2022

There is a concern that web-platform-tests might unexpectedly be marked as intermittent, hiding a regression in the test/code.

Therefore we should allow filing notifications during landings, particularly for the case where a test goes from a stable to an intermittent status.

This will be a little different from the notifications we do for PRs, because we won't necessarily have try push data to work with. It should be possible to get mozilla-central results from before the push, although this is complicated by the fact that by the time we do the stability analysis we're already based on autoland and can't assume a full set of try results exists. In that case we might need to analyze the metadata changes themselves to see what happened, rather than just using the wptreport.json files.
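
To make the "stable to intermittent" case concrete, here is a rough sketch of the metadata comparison. It assumes the expected statuses before and after the landing have already been extracted into plain dicts mapping a test id to its list of expected statuses; the function name and that data shape are made up for illustration and are not taken from the bot's code.

```python
from typing import Dict, List

# Hypothetical shape: test id -> list of expected statuses, e.g. ["PASS", "FAIL"].
Statuses = Dict[str, List[str]]


def newly_intermittent(before: Statuses, after: Statuses) -> Statuses:
    """Return tests that had one expected status before and several after."""
    changed: Statuses = {}
    for test_id, new_statuses in after.items():
        old_statuses = before.get(test_id)
        if old_statuses is None:
            # New test with no baseline, so this is not a stable -> intermittent change.
            continue
        if len(set(old_statuses)) == 1 and len(set(new_statuses)) > 1:
            changed[test_id] = new_statuses
    return changed
```

Anything this returns would be a candidate for a notification; tests that were already intermittent, or that are new in the landing, are skipped on purpose.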

@jesup
Member

jesup commented Dec 5, 2022

Probably a separate issue: we need some sort of workflow that helps identify formerly-intermittent tests that are now perma-pass or perma-fail. This doesn't need to be noticed immediately, especially if we can backfill to figure out where a change probably occurred. (For example, look back through the m-c wpt results for the last fail, and then iterate forward from there to find a likely failure point.) This also doesn't have to be manual, but the idea is to avoid tests falling into a black hole where all the results are ignored forevermore.
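
A minimal sketch of that backfill idea, assuming we already have a chronological list of `(build_id, status)` pairs for one test. Where that history comes from is left open; the function name and the `min_run` threshold are hypothetical.

```python
from typing import List, Optional, Tuple


def likely_change_point(
    history: List[Tuple[str, str]], min_run: int = 20
) -> Optional[str]:
    """Return the first build of the trailing run of identical results.

    history is ordered oldest to newest. Returns None if the trailing run is
    shorter than min_run, or if no differing result exists in the window.
    """
    if not history:
        return None
    final_status = history[-1][1]
    # Look back for the most recent result that differs from the final status.
    last_different = None
    for index in range(len(history) - 1, -1, -1):
        if history[index][1] != final_status:
            last_different = index
            break
    if last_different is None:
        # The change, if any, happened before the window we looked at.
        return None
    run_length = len(history) - 1 - last_different
    if run_length < min_run:
        # Too few consistent results to call this perma-pass or perma-fail.
        return None
    return history[last_different + 1][0]
```

The `min_run` guard matters because a test that fails rarely can easily pass many times in a row; the right value depends on how often the test used to fail.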

@jgraham
Member Author

jgraham commented Dec 5, 2022

I agree that's important, but it's probably out of scope for the bot. It doesn't seem very reasonable to do this using only the logfiles that we download, without an external datastore that holds the result of each test in each build. (Maybe one could hack something really specific together where each central build has an artifact that just lists known intermittents and their actual status, and a bot downloads something like a month's worth of those; maybe that's few enough tests that it can actually work, but it's still not a small problem.)
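
For what it's worth, the aggregation half of that hack could look roughly like the sketch below. It assumes each per-build artifact has already been downloaded and parsed into a dict of known-intermittent test id to actual status; no such artifact exists today, so the format, the function name, and the `min_builds` threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def consistently_single_status(
    artifacts: Iterable[Dict[str, str]], min_builds: int = 30
) -> Dict[str, str]:
    """Find known intermittents that only ever showed one status.

    Each artifact maps a test id to the status observed in one build.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    counts: Dict[str, int] = defaultdict(int)
    for artifact in artifacts:
        for test_id, status in artifact.items():
            seen[test_id].add(status)
            counts[test_id] += 1
    # Only report tests with enough data points and exactly one observed status.
    return {
        test_id: next(iter(statuses))
        for test_id, statuses in seen.items()
        if len(statuses) == 1 and counts[test_id] >= min_builds
    }
```

Downloading and storing a month of artifacts is the hard part and is not covered here.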

Ideally this is something that could be worked on as a general piece of test infrastructure, instead of something wpt-specific.

(I also think further discussions on these lines should happen elsewhere to avoid confusion in this issue)

@whimboo
Collaborator

whimboo commented Nov 23, 2023

I have just one more thing to add here.

When a test fails in 1 out of 4 jobs on a stability try push, it is automatically marked as intermittent. At that point it is unknown how frequently the failure happens, and in a couple of cases I was able to remove those multiple statuses after the downstream sync because they did not really apply.

It would be nice if, in case of a failure, we did not immediately mark the test as intermittent but instead automatically triggered some more retries of the job on the same stability push. Given that the tests do not run very long, this would probably not add much overhead time-wise, and it would mark fewer tests as intermittently failing. This would be helpful given the number of such statuses in mozilla-central. Note that the current behavior also makes it easier to miss regressions, because we accept a fail or timeout state inappropriately.
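
As a rough illustration of the retry idea, the decision could be a small function over the statuses seen so far for one test on the stability push. The action names, the function name, and both thresholds are assumptions for the sake of the sketch, not existing bot behavior.

```python
from collections import Counter
from typing import List


def next_action(statuses: List[str], max_runs: int = 10, min_repeats: int = 2) -> str:
    """Decide what to do with one test given its results so far."""
    if not statuses:
        return "retry"
    counts = Counter(statuses)
    if len(counts) == 1:
        return "stable"
    # Treat the most common status as expected; everything else is the minority.
    minority = len(statuses) - counts.most_common(1)[0][1]
    if minority >= min_repeats:
        # The odd result repeated, so it is probably a real intermittent.
        return "mark-intermittent"
    if len(statuses) < max_runs:
        # A single odd result: gather more data before changing metadata.
        return "retry"
    return "needs-triage"
```

With the defaults, one failure in four jobs yields `"retry"` rather than an immediate intermittent annotation, and a failure that never recurs within `max_runs` is handed to a human instead of being recorded as an expected status.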

As of now it requires a lot of attention from individual triage owners to detect those changes and then to investigate them.
