bpo-39207: Spawn workers on demand in ProcessPoolExecutor #19453

Merged
merged 7 commits into python:master from bpo39207-ppe-fix-idle-workers on Apr 19, 2020

Conversation

aeros
Contributor

@aeros aeros commented Apr 10, 2020

Roughly based on 904e34d, but with a few substantial differences.

/cc @pitrou @brianquinlan

https://bugs.python.org/issue39207

Automerge-Triggered-By: @pitrou

@aeros aeros added the performance (Performance or resource usage) label on Apr 10, 2020
@aeros
Contributor Author

aeros commented Apr 10, 2020

I think the macOS failure can be resolved by setting _idle_worker_semaphore = None in executor.shutdown(); it appears to be an issue with an excessive number of FDs in use. I'll see if that addresses the problem.
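
A minimal sketch of that idea (a hypothetical subclass for illustration; the attribute name follows this thread, and the eventual fix ended up taking a different route):

from concurrent.futures import ProcessPoolExecutor

class _SketchExecutor(ProcessPoolExecutor):
    def shutdown(self, wait=True):
        super().shutdown(wait=wait)
        # Drop the reference so the semaphore (and any file descriptor
        # backing it) can be collected once the executor shuts down.
        self._idle_worker_semaphore = None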

@aeros aeros force-pushed the bpo39207-ppe-fix-idle-workers branch from dea5df4 to 5e5e4f1 on April 10, 2020 01:23
@aeros
Contributor Author

aeros commented Apr 10, 2020

Hmm... I could potentially try using a weakref to the executor to access the semaphore (and then deleting the reference) instead of using the semaphore directly in _process_worker; in theory, this should allow it to be GC'd sooner, potentially freeing up the associated file descriptor earlier, and ensure we don't try to access it after the executor's resources have been finalized (which could be the main issue). For example:

# [executor_reference would be passed as an argument from _adjust_process_count()]
executor = executor_reference()
if executor is not None:
    executor._idle_worker_semaphore.release()
del executor

But, I'd also be willing to try any other possible solutions to address the MacOS failure. See the log for details.

Edit: Never mind, I just recalled that this won't work for processes, since weakrefs aren't picklable and so can't be passed to the worker. So, in order to access the semaphore through the executor weakref, the semaphore release would have to be moved to the executor management thread instead of happening within the worker process; I'll try that next.

@aeros
Contributor Author

aeros commented Apr 10, 2020

Releasing the semaphore through a weakref to the executor in the management thread addressed the problem. I'm certain it was an issue with file descriptors, but not 100% sure whether it was an excessive number of them being used at once or an attempt to access a file descriptor that had already been removed.

Ultimately, it results in a slight delay between when the worker finishes and when the idle semaphore is released (compared to releasing it immediately at the end of _process_worker()), but it will still be a significant overall improvement.
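
Roughly, in the management thread's result-handling loop (an approximation of the idea described above, not the exact merged code):

# [executor_reference is the weakref the management thread already holds]
# After a result item from a finished work item has been processed:
executor = executor_reference()
if executor is not None:
    # The worker that produced this result is idle again.
    executor._idle_worker_semaphore.release()
# Drop the strong reference so the executor can still be garbage-collected.
del executor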

@aeros aeros added the 🔨 test-with-buildbots (Test PR w/ buildbots; report in status section) label on Apr 10, 2020
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @aeros for commit 28bb669 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots (Test PR w/ buildbots; report in status section) label on Apr 10, 2020
Comment on lines 997 to 1002
executor = self.executor_type()
executor.submit(mul, 12, 7).result()
executor.submit(mul, 33, 25)
executor.submit(mul, 25, 26).result()
executor.submit(mul, 18, 29)
self.assertEqual(len(executor._processes), 2)
Contributor Author

@aeros aeros Apr 10, 2020


This test might be subject to race conditions, as indicated in https://buildbot.python.org/all/#/builders/296/builds/46:

======================================================================
FAIL: test_idle_process_reuse_multiple (test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/buildarea/pull_request.angelico-debian-amd64/build/Lib/test/test_concurrent_futures.py", line 1002, in test_idle_process_reuse_multiple
    self.assertEqual(len(executor._processes), 2)
AssertionError: 1 != 2

Instead, I think it would probably make more sense to check self.assertTrue(len(executor._processes) <= 2), after explicitly setting max workers to a specific amount, e.g. executor = self.executor_type(4), and then submitting _max_workers jobs. It's perfectly okay if there are fewer than 2 workers as a result of the jobs being completed quickly, but there should never be more than two if idle workers are being properly reused (since we directly waited for the result() on two of them).
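
Roughly something like this (a sketch of the suggested change, using assertLessEqual as later adopted, not the test as finally merged):

def test_idle_process_reuse_multiple(self):
    executor = self.executor_type(4)  # explicit max_workers
    executor.submit(mul, 12, 7).result()
    executor.submit(mul, 33, 25)
    executor.submit(mul, 25, 26).result()
    executor.submit(mul, 18, 29)
    # Fewer than 2 processes is fine if jobs finish quickly; more than 2
    # would mean idle workers are not being reused.
    self.assertLessEqual(len(executor._processes), 2)
    executor.shutdown()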

I'll take a look at this again tomorrow, after the buildbot tests have completed.

@pitrou
Member

pitrou commented Apr 10, 2020

@mrocklin, would you have any high-level opinion on this change?

@mrocklin

From a performance perspective I guess that the point here is that we're making it faster to create a ProcessPoolExecutor, but making the first few tasks (or the first few times when we have enough concurrent tasks) slower. There might be some cost in that spawning many processes concurrently at startup could have been faster than spawning them one by one during execution (maybe).

There are use cases where I would not be surprised if users wanted to create all of the processes at startup, probably because they want more predictable performance behavior. It might be nice if there were some convenient method to ensure that there are as many worker processes as requested workers. The people who are likely to care about this, though, are probably sophisticated enough to map time.sleep or something similar themselves.
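
For example, something along these lines could warm the pool up front (a hypothetical usage sketch, not an API this PR adds):

import time
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor(max_workers=8)
# Submit one short task per requested worker; as long as none of them
# finishes before the last is submitted, all 8 processes get spawned.
list(executor.map(time.sleep, [0.1] * 8))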

Anyway, in short I don't have strong opinions either way. Personally, I had already assumed that the ProcessPoolExecutor created processes on demand (probably just because of familiarity with the ThreadPoolExecutor).

@aeros
Contributor Author

aeros commented Apr 10, 2020

From a performance perspective I guess that the point here is that we're making it faster to create a ProcessPoolExecutor, but making the first few tasks (or the first few times when we have enough concurrent tasks) slower.

In the issue discussion, there is also the case where someone uses the default number of max workers (os.cpu_count()) on a device with a large number of cores. They may not need all of them for their jobs to be completed concurrently, but they end up incurring the cost of spawning all of the processes at startup, only for many of them to sit idle for the full duration of the executor.

The above would also apply to users with a smaller number of cores, but I think the most significant impact of this change would apply to users with a large number of cores.

It might be nice if there were some convenient method to ensure that there are as many worker processes as requested workers. The people who are likely to care about this, though, are probably sophisticated enough to map time.sleep or something similar themselves.

It could be possible to consider an implementation of _adjust_process_count() that spawns a worker process for each pending work item, i.e.:

# Spawn one worker per pending work item, capped at max_workers.
while (len(self._processes) < len(self._pending_work_items)
        and len(self._processes) < self._max_workers):
    p = self._mp_context.Process(
        target=_process_worker,
        args=(self._call_queue,
              self._result_queue,
              self._initializer,
              self._initargs))
    p.start()
    self._processes[p.pid] = p

The tricky part, though, is that the length of the self._pending_work_items dict is only an estimate when measured across threads (since it's being modified constantly in the executor management thread), and self._processes of course includes workers that are already in use. So, I'm not certain how helpful it would be in reality.
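
For comparison, the submit()-side change in this PR is roughly the opposite: only spawn a new worker when no idle one is available (a sketch approximating the merged change, not the exact diff):

def _adjust_process_count(self):
    # Reuse an idle worker if one is available instead of spawning.
    if self._idle_worker_semaphore.acquire(blocking=False):
        return

    if len(self._processes) < self._max_workers:
        p = self._mp_context.Process(
            target=_process_worker,
            args=(self._call_queue,
                  self._result_queue,
                  self._initializer,
                  self._initargs))
        p.start()
        self._processes[p.pid] = p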

Personally, I had already assumed that the ProcessPoolExecutor created processes on demand (probably just because of familiarity with the ThreadPoolExecutor).

I'd wager that's the current mentality of most users. Unless they've explored the internals of ProcessPoolExecutor extensively, they'd have no reason to assume it handled worker spawning any differently from ThreadPoolExecutor. This is reinforced by the docs, which suggest that it spawns up to max_workers processes rather than always spawning max_workers processes:

Executor subclass that executes calls asynchronously using a pool of at most max_workers processes.

Member

@pitrou pitrou left a comment


This looks basically good to me, just a couple comments.

Lib/concurrent/futures/process.py (outdated review thread, resolved)
Lib/test/test_concurrent_futures.py (outdated review thread, resolved)
Lib/test/test_concurrent_futures.py (outdated review thread, resolved)
Lib/test/test_concurrent_futures.py (review thread, resolved)
aeros and others added 2 commits April 18, 2020 19:50
* Use try-finally for executor.shutdown() in test_saturation
* Use assertLessEqual
* Simplify semaphore acquire

Co-authored-by: Antoine Pitrou <[email protected]>
@aeros
Contributor Author

aeros commented Apr 19, 2020

Thanks for the review @pitrou, I believe I've addressed all of the comments. Would you care to give it a final look-over? This would be my first self-authored merge.

Member

@pitrou pitrou left a comment


Thank you @aeros. I will merge now.

@miss-islington miss-islington merged commit 1ac6e37 into python:master Apr 19, 2020
@vstinner
Member

@pitrou: You could have let @aeros merge the PR, he is now a core dev 😉

@pitrou
Member

pitrou commented Apr 19, 2020

Ah, sorry, I hadn't thought about that.

@aeros
Contributor Author

aeros commented Apr 19, 2020

No worries :)

@aeros aeros deleted the bpo39207-ppe-fix-idle-workers branch April 19, 2020 22:48
gpshead added a commit to gpshead/cpython that referenced this pull request Jan 24, 2022
bpo-39207: Spawn workers on demand in ProcessPoolExecutor (pythonGH-19453) was
implemented in a way that introduced child process deadlocks in some
environments due to mixing threading and fork() in the parent process.