-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reenable multithreading tests #850
Conversation
I'm happy to rerun tests on this PR a few times to see if we can reproduce the failures again before we merge. |
Codecov Report
@@ Coverage Diff @@
## master #850 +/- ##
==========================================
+ Coverage 86.27% 86.63% +0.35%
==========================================
Files 49 49
Lines 4577 4585 +8
Branches 905 908 +3
==========================================
+ Hits 3949 3972 +23
+ Misses 451 431 -20
- Partials 177 182 +5
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
""" | ||
|
||
_backend = __name__ # type: ignore | ||
_supports_threading = True | ||
_supports_threading = _IsWindows() # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_supports_threading = _IsWindows() # type: ignore | |
_supports_threading = not _IsWindows() # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've been spending too much time in compiled languages and so I created _IsWindows
just to have something that would be very obviously something evaluated at "runtime" (as opposed to a module-level constant evaluated at "compile-time" even though that obviously doesn't exist in Python). In practice I don't know that this adds any extra clarity and just increases boilerplate. What if I just inlined
_supports_threading = not ( sys.platform.startswith("win32") or sys.platform.startswith("cygin"))
Then your previous two comments also become moot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done this for now, let me know if you decide you don't like it and we can go back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a dedicated private constant for this makes sense and makes the code more readable and overall more consistent. We could reuse that variable also in the tests, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could. I wouldn't want to leak a constant like that into our public API, but that's the sort of gray area where I would be OK with importing internals into tests I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -34,6 +34,7 @@ omit = | |||
*/signac/common/configobj/*.py | |||
|
|||
[tool:pytest] | |||
xfail_strict = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider this a rather drastic change in the behavior of the overall test suite. Should we consider to apply this a bit more granular, i.e., use strict=True
where this is relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, wanting to enable this was my original goal and what made me remember that these multithreaded tests had been xfailed. IMO setting this can only be a good thing because it forces us to immediately deal with changes that incidentally and unexpectedly fix a bug or change behaviors in other unexpected ways that we only catch because a test suddenly starts xpassing. The value of strict
passed to the decorator supersedes the value in the file, so I would rather have xfail_strict=True
configured here and then manually mark with strict=False
cases where for whatever reason we actually need to allow xfails, for instance because a test is flaky. I would prefer to force developers to jump through hoops in order to enable flaky tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +1 for making the change at a global level, but I would be okay with doing it in a separate PR, or at least adding a separate changelog line to indicate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for separate PR IMO (albeit cleaner), but +1 for dedicated changelog entry.
""" | ||
|
||
_backend = __name__ # type: ignore | ||
_supports_threading = True | ||
_supports_threading = _IsWindows() # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a dedicated private constant for this makes sense and makes the code more readable and overall more consistent. We could reuse that variable also in the tests, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for tracking this down and citing the Python bug. This looks like an appropriate fix to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to go! :)
…ab/signac into testing/reenable_multithreading
@vyasr Excluding both |
Description
Tests of multithreaded support for synced collections were added as part of the development effort. The tests were generally reliable, but we saw sporadic failures on Windows for unknown reasons, so they were marked as conditionally expected to fail on Windows (but done so incorrectly, see below). The tests were changed to be unconditionally
xfail
ed due to development difficulties in #777. This PR adds them back in with a more target scope for skipping. This change also allows us to enablexfail_strict
in our testing.Motivation and Context
I finally have access to a Windows machine myself, so I looked into these again this morning. I cannot reproduce the failures locally on WSL no matter what I try (with/without pytest-xdist, many repetitions with pytest-repeat, etc), but I can reproduce the failures in normal Windows. The error is consistently a
PermissionError
encountered in theos.replace
call. There is a Python bug clearly indicating that Python intentionally avoids documenting such incompatibilities, but that this is a known problem withos.replace
in multithreaded settings on Windows due to the underlying filesystem instruction. As such, we should not expect this to work on Windows, and the original exclusion Windows-basedskip
prior to #777 is correct, not the current blanketxfail
.Our previous thread justifying the change in testing behavior identified that there was a bug in our old filtering. We were using
"win" in platform.python_implementation()
as a way to identify whether we were on Windows, and using that to determine whether or not to run the tests.platform.python_implementation()
indicates what Python implementation we are using (CPython, Jython, PyPy, etc), not the operating system, so this filter was clearly incorrect. However, the other claim in that thread is also obviously false:At the time we ran tests on Pypy and CPython, and both
"win" in "CPython"
and"win" in PyPy
are alwaysFalse
, not alwaysTrue
. The effect of this error therefore could not have been to hide failures on non-Windows platforms, all it could have done is run tests on Windows that we expected to fail. Therefore, my suspicion is that what was observed in that PR were only Windows-specific errors. There were additional reports of failures in #776, but they appear to have been scattered among various other failures that occurred during the CI transition, so it is unclear whether we ever had reliable examples of failures on other operating systems.This PR reenables these tests on the platforms where we should expect them to succeed. Given the upcoming release of signac 2.0, if we start seeing failures in these tests again we should reinvestigate our claims of multithreading support. The tests don't affect any behavior that signac relies on, only otherwise unused features of synced collections, so this is mostly relevant for documenting synced collections when we split those into a separate package post 2.0.
Checklist: