Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reenable multithreading tests #850

Merged
merged 19 commits into from
Nov 8, 2022
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ omit =
*/signac/common/configobj/*.py

[tool:pytest]
xfail_strict = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider this a rather drastic change in the behavior of the overall test suite. Should we consider to apply this a bit more granular, i.e., use strict=True where this is relevant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, wanting to enable this was my original goal and what made me remember that these multithreaded tests had been xfailed. IMO setting this can only be a good thing because it forces us to immediately deal with changes that incidentally and unexpectedly fix a bug or change behaviors in other unexpected ways that we only catch because a test suddenly starts xpassing. The value of strict passed to the decorator supersedes the value in the file, so I would rather have xfail_strict=True configured here and then manually mark with strict=False cases where for whatever reason we actually need to allow xfails, for instance because a test is flaky. I would prefer to force developers to jump through hoops in order to enable flaky tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 for making the change at a global level, but I would be okay with doing it in a separate PR, or at least adding a separate changelog line to indicate this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for separate PR IMO (albeit cleaner), but +1 for dedicated changelog entry.

filterwarnings =
error::FutureWarning:signac.*
error::DeprecationWarning:signac.*
Expand Down
30 changes: 25 additions & 5 deletions signac/_synced_collections/backends/collection_json.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
import errno
import json
import os
import sys
import uuid
from collections.abc import Mapping, Sequence
from typing import Callable, FrozenSet
Expand Down Expand Up @@ -116,6 +117,16 @@ def json_attr_dict_validator(data):
"""


class _IsWindows:
"""A truthy type that is False on Windows and True otherwise."""
csadorf marked this conversation as resolved.
Show resolved Hide resolved

def __bool__(self):
if sys.platform.startswith("win32") or sys.platform.startswith("cygin"):
return False
else:
return True
csadorf marked this conversation as resolved.
Show resolved Hide resolved


class JSONCollection(SyncedCollection):
r"""A :class:`~.SyncedCollection` that synchronizes with a JSON file.

Expand All @@ -127,11 +138,13 @@ class JSONCollection(SyncedCollection):

**Thread safety**

The :class:`JSONCollection` is thread-safe. To make these collections safe, the
``write_concern`` flag is ignored in multithreaded execution, and the
write is **always** performed via a write to temporary file followed by a
The :class:`JSONCollection` is thread-safe on Unix-like systems (not
Windows, see the Warnings section). To make these collections safe, the
``write_concern`` flag is ignored in multithreaded execution, and the write
is **always** performed via a write to temporary file followed by a
replacement of the original file. The file replacement operation uses
:func:`os.replace`, which is guaranteed to be atomic by the Python standard.
:func:`os.replace`, which is guaranteed to be atomic by the Python
standard.

Parameters
----------
Expand All @@ -145,10 +158,17 @@ class JSONCollection(SyncedCollection):
\*\*kwargs :
Keyword arguments forwarded to parent constructors.

Warnings
--------
This class is _not_ thread safe on Windows. It relies on ``os.replace`` for
atomic file writes, and that method can fail in multithreaded situations if
open handles exist to the destination file withiin the same process on a
different thread. See https://bugs.python.org/issue46003 for more
information.
"""

_backend = __name__ # type: ignore
_supports_threading = True
_supports_threading = _IsWindows() # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_supports_threading = _IsWindows() # type: ignore
_supports_threading = not _IsWindows() # type: ignore

Copy link
Contributor Author

@vyasr vyasr Nov 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've been spending too much time in compiled languages and so I created _IsWindows just to have something that would be very obviously something evaluated at "runtime" (as opposed to a module-level constant evaluated at "compile-time" even though that obviously doesn't exist in Python). In practice I don't know that this adds any extra clarity and just increases boilerplate. What if I just inlined

_supports_threading = not ( sys.platform.startswith("win32") or sys.platform.startswith("cygin"))

Then your previous two comments also become moot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done this for now, let me know if you decide you don't like it and we can go back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a dedicated private constant for this makes sense and makes the code more readable and overall more consistent. We could reuse that variable also in the tests, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. I wouldn't want to leak a constant like that into our public API, but that's the sort of gray area where I would be OK with importing internals into tests I suppose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

_validators: Sequence_t[Callable] = (require_string_key, json_format_validator)

def __init__(self, filename=None, write_concern=False, *args, **kwargs):
Expand Down
12 changes: 9 additions & 3 deletions signac/contrib/job.py
Original file line number Diff line number Diff line change
Expand Up @@ -131,10 +131,16 @@ def _save(self):
raise

# Since all the jobs are equivalent, just grab the filename from the
# last one and init it. Also migrate the lock for multithreaded support.
old_lock_id = self._lock_id
# last one and init it. Also migrate the lock for multithreaded
# support. The sequence of operations here is imoprtant since changing
# the filename updates the lock id.
if type(self)._threading_support_is_active:
old_lock_id = self._lock_id

self._filename = job._statepoint_filename
type(self)._locks[self._lock_id] = type(self)._locks.pop(old_lock_id)

if type(self)._threading_support_is_active:
type(self)._locks[self._lock_id] = type(self)._locks.pop(old_lock_id)
csadorf marked this conversation as resolved.
Show resolved Hide resolved

if should_init:
# Only initializing one job assumes that all changes in init are
Expand Down
14 changes: 0 additions & 14 deletions tests/test_synced_collections/synced_collection_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -451,13 +451,6 @@ class A:
with pytest.raises(TypeError):
synced_collection[key] = testdata

@pytest.mark.xfail(
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
),
)
def test_multithreaded(self, synced_collection):
"""Test multithreaded runs of synced dicts."""
if not type(synced_collection)._supports_threading:
Expand Down Expand Up @@ -770,13 +763,6 @@ def test_nested_list_with_dict(self, synced_collection):
assert isinstance(child2, SyncedCollection)
assert isinstance(child1, SyncedCollection)

@pytest.mark.xfail(
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
),
)
def test_multithreaded(self, synced_collection):
"""Test multithreaded runs of synced lists."""
if not type(synced_collection)._supports_threading:
Expand Down
75 changes: 39 additions & 36 deletions tests/test_synced_collections/test_json_buffered_collection.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import itertools
import json
import os
import sys
import time
from concurrent.futures import ThreadPoolExecutor

Expand All @@ -23,6 +24,8 @@
)
from signac._synced_collections.errors import BufferedError, MetadataError

WINDOWS = sys.platform.startswith("win32") or sys.platform.startswith("cygin")
vyasr marked this conversation as resolved.
Show resolved Hide resolved


class BufferedJSONCollectionTest(JSONCollectionTest):
def load(self, collection):
Expand Down Expand Up @@ -376,11 +379,11 @@ def multithreaded_buffering_test(self, op, tmpdir):
# truly flushed correctly.
assert self._collection_type.get_current_buffer_size() == 0

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_setitem(self, tmpdir):
Expand All @@ -392,11 +395,11 @@ def setitem_dict(sd, data):

self.multithreaded_buffering_test(setitem_dict, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_update(self, tmpdir):
Expand All @@ -407,11 +410,11 @@ def update_dict(sd, data):

self.multithreaded_buffering_test(update_dict, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_reset(self, tmpdir):
Expand All @@ -422,11 +425,11 @@ def reset_dict(sd, data):

self.multithreaded_buffering_test(reset_dict, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_clear(self, tmpdir):
Expand Down Expand Up @@ -474,11 +477,11 @@ def test_multithreaded_buffering_clear(self, tmpdir):
# truly flushed correctly.
assert self._collection_type.get_current_buffer_size() == 0

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_load(self, tmpdir):
Expand Down Expand Up @@ -670,11 +673,11 @@ def multithreaded_buffering_test(self, op, requires_init, tmpdir):
# truly flushed correctly.
assert self._collection_type.get_current_buffer_size() == 0

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_setitem(self, tmpdir):
Expand All @@ -686,11 +689,11 @@ def setitem_list(sd, data):

self.multithreaded_buffering_test(setitem_list, True, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_extend(self, tmpdir):
Expand All @@ -701,11 +704,11 @@ def extend_list(sd, data):

self.multithreaded_buffering_test(extend_list, False, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_append(self, tmpdir):
Expand All @@ -717,11 +720,11 @@ def append_list(sd, data):

self.multithreaded_buffering_test(append_list, False, tmpdir)

@pytest.mark.xfail(
@pytest.mark.skipif(
WINDOWS,
reason=(
"This test sometimes fails. This may indicate a race condition. "
"The test fails more consistently on Windows but also appears on "
"Linux in CI."
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded_buffering_load(self, tmpdir):
Expand Down
13 changes: 13 additions & 0 deletions tests/test_synced_collections/test_json_collection.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
# This software is licensed under the BSD 3-Clause License.
import json
import os
import sys

import pytest
from attr_dict_test import AttrDictTest, AttrListTest
Expand All @@ -15,6 +16,8 @@
JSONList,
)

WINDOWS = sys.platform.startswith("win32") or sys.platform.startswith("cygin")
vyasr marked this conversation as resolved.
Show resolved Hide resolved


class JSONCollectionTest:

Expand Down Expand Up @@ -42,6 +45,16 @@ def synced_collection_positional(self, tmpdir):
def test_filename(self, synced_collection):
assert os.path.basename(synced_collection.filename) == self._fn

@pytest.mark.skipif(
WINDOWS,
reason=(
"The JSONCollection cannot be safely used in multithreaded settings "
"on Windows due to https://bugs.python.org/issue46003."
),
)
def test_multithreaded(self, synced_collection):
return super().test_multithreaded(synced_collection)


class TestJSONDict(JSONCollectionTest, SyncedDictTest):

Expand Down