unittest: execute tests in parallel #82054

Closed
donpellegrino mannequin opened this issue Aug 16, 2019 · 8 comments
Labels: stdlib, type-feature

Comments

@donpellegrino
Mannequin

donpellegrino mannequin commented Aug 16, 2019

BPO 37873
Nosy @terryjreedy, @giampaolo, @ezio-melotti, @voidspace, @bharel, @tirkarthi, @donpellegrino

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-08-16.14:15:27.971>
labels = ['type-feature', 'library', '3.9']
title = 'unittest: execute tests in parallel'
updated_at = <Date 2020-09-02.09:57:36.866>
user = 'https://github.com/donpellegrino'

bugs.python.org fields:

activity = <Date 2020-09-02.09:57:36.866>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-08-16.14:15:27.971>
creator = 'user93448'
dependencies = []
files = []
hgrepos = []
issue_num = 37873
keywords = []
message_count = 5.0
messages = ['349864', '349938', '350080', '375126', '375128']
nosy_count = 7.0
nosy_names = ['terry.reedy', 'giampaolo.rodola', 'ezio.melotti', 'michael.foord', 'bar.harel', 'xtreak', 'user93448']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue37873'
versions = ['Python 3.9']

@donpellegrino
Mannequin Author

donpellegrino mannequin commented Aug 16, 2019

The unittest documentation makes reference to a potential parallelization feature:

"Note that shared fixtures do not play well with [potential] features like test parallelization and they break test isolation. They should be used with care." (https://docs.python.org/3/library/unittest.html)

However, it seems that executing tests in parallel is not yet a feature of unittest. This enhancement request is to add parallel execution of tests to unittest.

A command line option may be a good interface. Ideally, it would be compatible with test discovery. Outside of the Python ecosystem, a common practice is to define test cases in a Makefile and then execute GNU Make with the '-j' flag (https://www.gnu.org/software/make/manual/html_node/Parallel.html#Parallel). Adding such an option to unittest would be a convenience and may save the effort of bringing in additional libraries or tools for parallel unit test execution.

@donpellegrino donpellegrino mannequin added the 3.9, stdlib, and type-feature labels Aug 16, 2019
@terryjreedy
Member

test.regrtest has a -j option. Perhaps some of the Python code for that could be used for unittest as well.

@tirkarthi
Member

See also https://mail.python.org/pipermail/python-ideas/2017-September/047100.html . One of the ideas in the thread was to move test.regrtest parallel execution functionality into unittest. I think it would be good to have this in unittest, like the support in pytest for -j.

@donpellegrino
Mannequin Author

donpellegrino mannequin commented Aug 10, 2020

Leveraging GNU Parallel (https://www.gnu.org/software/parallel/) might help simplify implementation. Perhaps that could be used as a subprocess call?

@vstinner
Member

> Leveraging GNU Parallel (https://www.gnu.org/software/parallel/) might help simplify implementation. Perhaps that could be used as a subprocess call?

In general, we try to avoid depending on the availability of external tools. For example, I don't expect this tool to be available on Windows, whereas it would be better to support parallel execution on Windows as well.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@gpshead
Member

gpshead commented Nov 19, 2022

An example of implementing sharding within a single unittest so they can run in parallel across whatever number of processes and machines you want is what we use at work: https://github.com/abseil/abseil-py/blob/v1.3.0/absl/testing/absltest.py#L2359 using the Bazel sharding protocol.

If test.regrtest supported using such a sharding protocol and unittest adopted that TestCase.getTestCaseNames implementation logic to compute the shards for each process, we'd really pull in our long tail test time in CPython's own unittest suite.
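To make the sharding idea concrete, here is a minimal sketch of a loader that keeps only its own share of test methods. It is not the absltest implementation; `ShardingLoader` is a hypothetical class, and the round-robin assignment is an assumption chosen for simplicity. The environment variable names `TEST_TOTAL_SHARDS` and `TEST_SHARD_INDEX` are the ones used by the Bazel test sharding protocol.

```python
import itertools
import os
import unittest


class ShardingLoader(unittest.TestLoader):
    """Hypothetical loader that loads only this shard's test methods."""

    def __init__(self, shard_index=None, total_shards=None):
        super().__init__()
        # Fall back to the Bazel-style environment variables when unset.
        if total_shards is None:
            total_shards = int(os.environ.get("TEST_TOTAL_SHARDS", "1"))
        if shard_index is None:
            shard_index = int(os.environ.get("TEST_SHARD_INDEX", "0"))
        self.total_shards = total_shards
        self.shard_index = shard_index
        # One counter across all classes keeps shards evenly filled.
        self._counter = itertools.count()

    def getTestCaseNames(self, testCaseClass):
        names = super().getTestCaseNames(testCaseClass)
        # Round-robin: test number i goes to shard i % total_shards.
        return [
            name
            for name in names
            if next(self._counter) % self.total_shards == self.shard_index
        ]


class Demo(unittest.TestCase):
    def test_a(self): pass
    def test_b(self): pass
    def test_c(self): pass
    def test_d(self): pass
    def test_e(self): pass


if __name__ == "__main__":
    for index in range(2):
        loader = ShardingLoader(shard_index=index, total_shards=2)
        print(index, loader.getTestCaseNames(Demo))
```

Each process launched with a different shard index runs a disjoint subset of the tests, and together the shards cover the whole suite, provided every process loads the tests in the same order.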

> One of the ideas in the thread was to move test.regrtest parallel execution functionality into unittest. I think it would be good to have this in unittest, like the support in pytest for -j.

Yes. Though in the wider Python community, you'll find many people saying "just use pytest-xdist" in your project for -j support. I can't disagree with that.

I still want parallelization for CPython's own unittest suite (launched via python -m test) regardless of that. It'd help save core dev and contributor time, as well as increase buildbot and CI throughput.

@gpshead gpshead added the 3.12 label and removed the 3.9 label Nov 19, 2022
zitterbewegung pushed a commit to zitterbewegung/cpython that referenced this issue Apr 25, 2023
zitterbewegung added a commit to zitterbewegung/cpython that referenced this issue Apr 25, 2023
zitterbewegung added a commit to zitterbewegung/cpython that referenced this issue Apr 26, 2023
zitterbewegung added a commit to zitterbewegung/cpython that referenced this issue Apr 27, 2023
zitterbewegung added a commit to zitterbewegung/cpython that referenced this issue Apr 27, 2023
gpshead pushed a commit that referenced this issue Apr 30, 2023
…lel by sharding. (#103927)

This runs test_asyncio sub-tests in parallel using sharding from Cinder. This suite is typically the longest-pole in runs because it is a test package with a lot of further sub-tests otherwise run serially. By breaking out the sub-tests as independent modules we can run a lot more in parallel.

After porting we can see the direct impact on a multicore system.

Without this change:
  Running make test takes 5 min 26 seconds
With this change:
  Running make test takes 3 min 39 seconds

That'll vary based on system and parallelism. On a `-j 4` run similar to what CI and buildbot systems often do, it reduced the overall test suite completion latency by 10%.

The drawbacks are that this implementation is hacky, that the sorting of the tests obscures when the asyncio tests occur, and that it involves changing CPython test infrastructure. But the wall time saved is worth it, especially in low-core-count CI runs, as it pulls in a long tail. The win for productivity and reserved CI resource usage is significant.

Future candidates for refactoring into split-up suites are test_concurrent_futures and the way the _test_multiprocessing suite gets run for all start methods, as exposed by passing the -o flag to python -m test to get a list of the 10 longest-running tests.

---------

Co-authored-by: Carl Meyer <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]> [Google, LLC]
@gpshead gpshead removed the 3.12 label Apr 30, 2023
@gpshead
Member

gpshead commented Apr 30, 2023

I'm leaving this open as there's plenty more opportunity to do things here (up to and including potentially making my full parallelism PR work reliably). The just merged PR at least splits our longest test, test_asyncio, up into parallel sub-tests given it was already defined as a package of tests in separate files.

Refactoring test_concurrent_futures.py and the entire _test_multiprocessing.py suite, with its trio of start-method runners, into a similar package of sub-test .py files would allow them to use the already committed pieces. That would further pull in the long tail on default run-everything CI, buildbot, and developer runs, that is, make test or python -m test when no list of tests is specified.
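The reason a package of sub-test files helps is that each file can become its own unit of work for the parallel runner. A rough sketch of that expansion step follows; `split_test_package` is a hypothetical helper, and regrtest's actual implementation may differ.

```python
import importlib
import pkgutil


def split_test_package(package_name):
    """Expand a test package into the dotted names of its test_* submodules.

    A plain module (no __path__) is returned unchanged as a single unit.
    """
    package = importlib.import_module(package_name)
    search_paths = getattr(package, "__path__", None)
    if search_paths is None:
        return [package_name]
    return sorted(
        f"{package_name}.{info.name}"
        for info in pkgutil.iter_modules(search_paths)
        if info.name.startswith("test_")
    )
```

A runner could then schedule each returned name independently, so one long package no longer occupies a single worker while the others sit idle.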

carljm added a commit to carljm/cpython that referenced this issue May 1, 2023
* main: (26 commits)
  pythongh-104028: Reduce object creation while calling callback function from gc (pythongh-104030)
  pythongh-104036: Fix direct invocation of test_typing (python#104037)
  pythongh-102213: Optimize the performance of `__getattr__` (pythonGH-103761)
  pythongh-103895: Improve how invalid `Exception.__notes__` are displayed (python#103897)
  Adjust expression from `==` to `!=` in alignment with the meaning of the paragraph. (pythonGH-104021)
  pythongh-88496: Fix IDLE test hang on macOS (python#104025)
  Improve int test coverage (python#104024)
  pythongh-88773: Added teleport method to Turtle library (python#103974)
  pythongh-104015: Fix direct invocation of `test_dataclasses` (python#104017)
  pythongh-104012: Ensure test_calendar.CalendarTestCase.test_deprecation_warning consistently passes (python#104014)
  pythongh-103977: compile re expressions in platform.py only if required (python#103981)
  pythongh-98003: Inline call frames for CALL_FUNCTION_EX (pythonGH-98004)
  Replace Netlify with Read the Docs build previews (python#103843)
  Update name in acknowledgements and add mailmap (python#103696)
  pythongh-82054: allow test runner to split test_asyncio to execute in parallel by sharding. (python#103927)
  Remove non-existing tools from Sundry skiplist (python#103991)
  pythongh-103793: Defer formatting task name (python#103767)
  pythongh-87092: change assembler to use instruction sequence instead of CFG (python#103933)
  pythongh-103636: issue warning for deprecated calendar constants (python#103833)
  Various small fixes to dis docs (python#103923)
  ...
vstinner pushed a commit to vstinner/cpython that referenced this issue Sep 2, 2023
… parallel by sharding. (python#103927)

(cherry picked from commit 9e011e7)
vstinner added a commit that referenced this issue Sep 3, 2023
…108820)

* Revert "[3.11] gh-101634: regrtest reports decoding error as failed test (#106169) (#106175)"

This reverts commit d5418e9.

* Revert "[3.11] bpo-46523: fix tests rerun when `setUp[Class|Module]` fails (GH-30895) (GH-103342)"

This reverts commit ecb09a8.

* Revert "gh-95027: Fix regrtest stdout encoding on Windows (GH-98492)"

This reverts commit b2aa28e.

* Revert "[3.11] gh-94026: Buffer regrtest worker stdout in temporary file (GH-94253) (GH-94408)"

This reverts commit 0122ab2.

* Revert "Run Tools/scripts/reindent.py (GH-94225)"

This reverts commit f0f3a42.

* Revert "gh-94052: Don't re-run failed tests with --python option (GH-94054)"

This reverts commit 1347607.

* Revert "[3.11] gh-84461: Fix Emscripten umask and permission issues (GH-94002) (GH-94006)"

This reverts commit 1073184.

* gh-93353: regrtest checks for leaked temporary files (#93776)

When running tests with -jN, create a temporary directory per process
and mark a test as "environment changed" if a test leaks a temporary
file or directory.

(cherry picked from commit e566ce5)

* gh-93353: Fix regrtest for -jN with N >= 2 (GH-93813)

(cherry picked from commit 36934a1)

* gh-93353: regrtest supports checking tmp files with -j2 (#93909)

regrtest now also implements checking for leaked temporary files and
directories when using -jN for N >= 2. Use tempfile.mkdtemp() to
create the temporary directory. Skip this check on WASI.

(cherry picked from commit 4f85cec)

* gh-84461: Fix Emscripten umask and permission issues (GH-94002)

- Emscripten's default umask is too strict, see
  emscripten-core/emscripten#17269
- getuid/getgid and geteuid/getegid are stubs that always return 0
  (root). Disable effective uid/gid syscalls and fix tests that use
  chmod() current user.
- Cannot drop X bit from directory.

(cherry picked from commit 2702e40)

* gh-94052: Don't re-run failed tests with --python option (#94054)

(cherry picked from commit 0ff7b99)

* Run Tools/scripts/reindent.py (#94225)

Reindent files which were not properly formatted (PEP 8: 4 spaces).

Remove also some trailing spaces.

(cherry picked from commit e87ada4)

* gh-94026: Buffer regrtest worker stdout in temporary file (GH-94253)

Co-authored-by: Victor Stinner <[email protected]>
(cherry picked from commit 199ba23)

* gh-96465: Clear fractions hash lru_cache under refleak testing (GH-96689)

Automerge-Triggered-By: GH:zware
(cherry picked from commit 9c8f379)

* gh-95027: Fix regrtest stdout encoding on Windows (#98492)

On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.

(cherry picked from commit ec1f6f5)

* gh-98903: Test suite fails with exit code 4 if no tests ran (#98904)

The Python test suite now fails with exit code 4 if no tests ran. It
should help detect typos in test names and test methods.

* Add "EXITCODE_" constants to Lib/test/libregrtest/main.py.
* Fix a typo: "NO TEST RUN" becomes "NO TESTS RAN"

(cherry picked from commit c76db37)

* gh-100086: Add build info to test.libregrtest (#100093)

The Python test runner (libregrtest) now logs Python build information like
"debug" vs "release" build, or LTO and PGO optimizations.

(cherry picked from commit 3c89202)

* bpo-46523: fix tests rerun when `setUp[Class|Module]` fails (#30895)

Co-authored-by: Jelle Zijlstra <[email protected]>
Co-authored-by: Łukasz Langa <[email protected]>
(cherry picked from commit 9953860)

* gh-82054: allow test runner to split test_asyncio to execute in parallel by sharding. (#103927)

(cherry picked from commit 9e011e7)

* Display the sanitizer config in the regrtest header. (#105301)

Display the sanitizers present in libregrtest.

Having this in the CI output for tests with the relevant environment
variable displayed will help make it easier to do what we need to
create an equivalent local test run.

(cherry picked from commit 852348a)

* gh-101634: regrtest reports decoding error as failed test (#106169)

When running the Python test suite with the -jN option, if a worker's stdout
cannot be decoded from the locale encoding, report a failed test so the
exit code is non-zero.

(cherry picked from commit 2ac3eec)

* gh-108223: test.pythoninfo and libregrtest log Py_NOGIL (#108238)

Enable with --disable-gil --without-pydebug:

    $ make pythoninfo|grep NOGIL
    sysconfig[Py_NOGIL]: 1

    $ ./python -m test
    ...
    == Python build: nogil debug
    ...

(cherry picked from commit 5afe0c1)

* gh-90791: test.pythoninfo logs ASAN_OPTIONS env var (#108289)

* Cleanup libregrtest code logging ASAN_OPTIONS.
* Fix a typo on "ASAN_OPTIONS" vs "MSAN_OPTIONS".

(cherry picked from commit 3a1ac87)

* gh-108388: regrtest splits test_asyncio package (#108393)

Currently, the test_asyncio package is only split into sub-tests when
using the command "./python -m test". With this change, it is also
split when passing it on the command line:
"./python -m test test_asyncio".

Remove the concept of "STDTESTS". Python is now mature enough to not
have to bother with that anymore. Removing STDTESTS simplifies the
code.

(cherry picked from commit 174e9da)

* regrtest computes statistics (#108793)

test_netrc, test_pep646_syntax and test_xml_etree now return results
in the test_main() function.

Changes:

* Rewrite TestResult as a dataclass with a new State class.
* Add test.support.TestStats class and Regrtest.stats_dict attribute.
* libregrtest.runtest functions now modify a TestResult instance
  in-place.
* libregrtest summary lists the number of run tests and skipped
  tests, and denied resources.
* Add TestResult.has_meaningful_duration() method.
* Compute TestResult duration in the upper function.
* Use time.perf_counter() instead of time.monotonic().
* Regrtest: rename 'resource_denieds' attribute to 'resource_denied'.
* Rename CHILD_ERROR to MULTIPROCESSING_ERROR.
* Use match/case syntax to have different code depending on the
  test state.

Co-authored-by: Alex Waygood <[email protected]>
(cherry picked from commit d4e534c)

* gh-108822: Add Changelog entry for regrtest statistics (#108821)

---------

Co-authored-by: Christian Heimes <[email protected]>
Co-authored-by: Zachary Ware <[email protected]>
Co-authored-by: Nikita Sobolev <[email protected]>
Co-authored-by: Joshua Herman <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
@gpshead
Member

gpshead commented Sep 23, 2023

With the refactorings @vstinner has done recently, we're close enough that I'm going to just close this issue. See my now closed draft PR #99637 for the full parallelism implementation I started with and reasoning for why it likely isn't worth us chasing at this point.

Python users who want that parallelism in their own test suites can use absltest from https://pypi.org/project/absl-py/ or other frameworks (pytest-xdist?) that offer it.

@gpshead gpshead closed this as completed Sep 23, 2023