7660 Avoid CI stability checks timing out #32202
Conversation
If the repeat runs stop early to avoid a timeout, a message will be written to the results explaining so.
This generally LGTM as an implementation of my old design. If ok, I would rather leave it to current WPT maintainers to do a 'real' review of the code, as they will be the ones maintaining it long term.
Side-note: there's a bunch of changes here that seem to be the result of some autoformatting happening in your editor. It would be great to revert those changes, as (a) they're inconsistent with WPT styling elsewhere I think (though we don't have a real style guide afaik...) but more importantly (b) they make it much harder to spot which deltas are new and which are just reformats.
# Use the number of repeated test suites that were run
# to process the results if the runs were stopped to
# avoid hitting the maximum run time.
if kwargs["avoided_timeout"]["did_avoid"]:
Using kwargs as an output mechanism from wptrunner.run_tests is clever, but potentially confusing. I leave it to other reviewers (who are still involved with WPT and will have to maintain this ;)) to decide if they want to suggest a different method for communicating this information back to stability.py from the runner.
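For context, the in/out-parameter pattern under discussion can be sketched like this (a minimal, hypothetical reconstruction — only the `avoided_timeout`/`did_avoid` names come from the snippet above; everything else is simplified and not the real wptrunner code):

```python
def run_tests(**kwargs):
    # Pretend the repeat loop stopped after 3 of 10 iterations
    # to stay under the maximum run time.
    iterations_run = 3

    # Output is smuggled back by mutating the nested dict in place.
    # This works because **kwargs copies the *outer* dict but still
    # shares the inner dict object with the caller.
    kwargs["avoided_timeout"]["did_avoid"] = True
    kwargs["avoided_timeout"]["iterations"] = iterations_run
    return True


def stability_check():
    kwargs = {"avoided_timeout": {"did_avoid": False}}
    passed = run_tests(**kwargs)
    # The caller reads the results back out of its own dict:
    if kwargs["avoided_timeout"]["did_avoid"]:
        return passed, kwargs["avoided_timeout"]["iterations"]
    return passed, None
```

The subtlety (and the source of the reviewers' discomfort) is that run_tests communicates through a side effect on its arguments rather than through its return value.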
Yes, I was hesitant about taking this approach. I know we definitely want to have access to the number of actual iterations that were run, but returning that information from wptrunner's run_tests implementation in a circumstance where it is only used when called by stability tests didn't feel right either. I am totally open to changing this implementation if anyone has suggestions or thinks a rework of run_tests is worth the change.
Some general comments:
- If you're refactoring code it's really helpful to have that in a separate commit first and then the actual changes in subsequent commits, so that the commits are reviewable one at a time. Of course that doesn't always work out, but in this case I think it could have done.
- It looks like some kind of automated code formatting was applied. In general we've relied on pyflakes and not adopted automatic formatting (we couldn't enforce it anyway, since the project is used across multiple repos, and asking them to all use a specific version of a specific formatter is implausible). Also we're not aiming for 80 column lines; breaking before 100 is preferred.
In terms of the actual change; I think it's mostly good, but @stephenmcgruer is right (as usual) that using the kwargs as an in/out parameter feels like a bad idea. Even though we do a reasonable amount of in-place manipulation of the kwargs before passing them to run_tests, they aren't currently used for output. I think it will work out OK to update the function signature to return the extra data we need, and just drop it in start so we maintain compatibility with tools using that entry point.
Thanks for working on this!
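The suggested shape can be sketched as follows (hypothetical names and data; the real wptrunner signatures differ): run_tests grows a second return value, while start, the public entry point, discards it to keep its existing contract:

```python
def run_tests(**kwargs):
    """Return (passed, counts) instead of writing into kwargs."""
    passed = True
    counts = {"repeat": 3}  # illustrative extra data
    return passed, counts


def start(**kwargs):
    """Public entry point: still returns just the boolean, so
    external tools calling start() are unaffected by the new
    second return value of run_tests()."""
    passed, _counts = run_tests(**kwargs)
    return passed
```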
return unexpected_total == 0
return evaluate_runs(counts, iteration_timeout, kwargs)
I think we should make run_tests return counts directly. Then start can discard the extra information and we can avoid putting the output in a kwarg.
Yes, looking at this possible solution, it's a relatively simple fix that can be easily implemented and will not affect other tests using wptrunner. I made a change to return only counts["repeat"], giving back just the number of iterations of the test suite that were run rather than all of counts. Do you think this is a good idea? Or would it be functionally better to return all of counts, in case other uses arise that need access to the other counts?
Now that wptrunner's run_tests returns more than one value, callers that previously expected a single value will receive a tuple. We need to ensure that we pull the expected first value (a boolean) out of that tuple.
wptrunner's run_tests would return a tuple only if no issues arose while running, and would return only a boolean in the case of an expected issue. Now a tuple is returned in all instances, as expected.
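The fix described in this commit amounts to making every return path produce the same (passed, extra) shape — a minimal sketch with assumed names, not the actual wptrunner code:

```python
def run_tests(fail_early=False, repeat=1):
    # Early-exit path: previously this returned a bare boolean,
    # which broke callers that unpack two values.
    if fail_early:
        return False, None
    # Normal path: same two-element tuple shape as before.
    return True, {"repeat": repeat}


# Callers can now always unpack, whichever path was taken:
passed, status = run_tests(fail_early=True)
```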
Thank you @stephenmcgruer and @jgraham for the reviews - I tried to remove some of the reformatting, and I'll be sure to avoid unnecessary auto-formatting changes and keep code refactoring in separate commits from now on.
This is looking pretty reasonable now, but I think at least returning the whole of counts (or even a new Status class, like a static version of counts) as the second argument is preferable to just returning the number of iterations, since it gives us a way to return richer data in the future without making breaking changes to the API.
run_tests will now return a new TestStatus object rather than only the number of iterations run. This will allow more robust statistics to be shown in the future.
Reduce some comments and logs to take up less vertical space.
TestStatus docstring
@jgraham, I tried to implement as you suggested. Normally I would create a dataclass for this, but since we need to support Python 3.6 and dataclasses were added in 3.7, I just created a class with attributes to return. Maybe there is a better way to handle this? (See wpt/tools/wptrunner/wptrunner/wptrunner.py, lines 273 to 281, at 9d27581.)
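A plain-attribute class of the kind described looks roughly like this (field names are illustrative, not the exact TestStatus API from the PR; on Python ≥ 3.7 a @dataclass would remove the hand-written __init__ and __repr__):

```python
class TestStatus:
    """Mutable container for statistics from a test run.

    A hand-written stand-in for a dataclass, compatible with
    Python 3.6, which lacks the dataclasses module.
    """

    def __init__(self):
        self.total_tests = 0
        self.repeated_runs = 0
        self.expected = 0
        self.unexpected = 0

    def __repr__(self):
        return ("TestStatus(total_tests=%d, repeated_runs=%d, "
                "expected=%d, unexpected=%d)" %
                (self.total_tests, self.repeated_runs,
                 self.expected, self.unexpected))


# Counters are updated via attribute access rather than dict keys:
status = TestStatus()
status.repeated_runs += 1
status.unexpected += 2
```

Compared with returning a bare integer (or a raw dict), a class like this lets new fields be added later without changing the shape of run_tests' return value.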
What would @jgraham do in this scenario? (I imagine I will be asking this often, if only to myself sometimes 😄 )
I apologise in advance for any kind of distress caused by thinking along those lines ;) In this case I wonder if we could just have the
Forgo the use of the defaultdict counts for keeping track of test info/results, and instead use the custom class TestStatus.
Thanks! I think this looks good now. If the CI passes feel free to squash and merge with an appropriate commit message. If the decision task fails again we'll need to investigate what's going on; it seems unlikely to be this specific changeset.
addresses issue #7660
This is implemented based on the proposed solution described by @stephenmcgruer.
Changes:
- Added a --repeat-max-time flag that sets how many minutes wptrunner should attempt to run the iterations of the test suite for the stability checks. Defaults to 100 minutes.
- wptrunner.run_tests has had some refactoring to trim down the size of the function and make it more readable.