bpo-39207: Spawn workers on demand in ProcessPoolExecutor #19453
Conversation
I think the macOS failure can be resolved by setting …
Hmm... I could potentially try using a weakref of the executor to access the semaphore (and then deleting it) instead of directly using it in …

# [executor_reference would be passed as an argument from _adjust_process_count()]
executor = executor_reference()
if executor is not None:
    executor._idle_worker_semaphore.release()
del executor

But I'd also be willing to try any other possible solutions to address the macOS failure. See the log for details.

Edit: Never mind, I just recalled that weakref doesn't work for processes since they're not pickle-able objects. So, in order to access the semaphore through the executor weakref, the semaphore release would have to be moved to the executor management thread instead of being within the process; I'll try that next.
Moving the semaphore to be accessed and released through a weakref to the executor in the management thread addressed the problem. I'm certain it was an issue with file descriptors, but not 100% sure if it was related to an excessive number of them being used at once or if it was trying to access a file descriptor that had already been removed. Ultimately, it results in a slight delay between when the worker is finished and when the idle semaphore is released (compared to releasing immediately at the end of …).
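For what it's worth, a minimal, self-contained sketch of that idea (hypothetical names and a stub class, not the actual patch): the semaphore is released through a weak reference to the executor on the management-thread side, so the worker process never touches it directly.

import threading
import weakref

def _mark_worker_idle(executor_reference):
    # Hypothetical sketch: the management thread holds only a weak reference
    # to the executor, so releasing the idle-worker semaphore cannot keep a
    # dropped executor alive.
    executor = executor_reference()                # None if the executor was garbage-collected
    if executor is not None:
        executor._idle_worker_semaphore.release()  # one worker is idle and reusable again
    del executor                                   # drop the strong reference immediately

class _ExecutorStub:
    def __init__(self):
        self._idle_worker_semaphore = threading.Semaphore(0)

stub = _ExecutorStub()
_mark_worker_idle(weakref.ref(stub))
print(stub._idle_worker_semaphore.acquire(blocking=False))  # True: an idle worker is available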
Lib/test/test_concurrent_futures.py (outdated)

executor = self.executor_type()
executor.submit(mul, 12, 7).result()
executor.submit(mul, 33, 25)
executor.submit(mul, 25, 26).result()
executor.submit(mul, 18, 29)
self.assertEqual(len(executor._processes), 2)
This test might be subject to race conditions, as indicated in https://buildbot.python.org/all/#/builders/296/builds/46:
======================================================================
FAIL: test_idle_process_reuse_multiple (test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/root/buildarea/pull_request.angelico-debian-amd64/build/Lib/test/test_concurrent_futures.py", line 1002, in test_idle_process_reuse_multiple
self.assertEqual(len(executor._processes), 2)
AssertionError: 1 != 2
Instead, I think it would probably make more sense to check self.assertTrue(len(executor._processes) <= 2) after explicitly setting max workers to a specific amount, e.g. executor = self.executor_type(4), and then submitting _max_workers jobs. It's perfectly okay if there are fewer than 2 workers as a result of the jobs being completed quickly, but there should never be more than two if the idle workers are being properly reused (since we directly waited for the result() on two of them).
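A rough sketch of that suggested shape (not the exact test that was eventually committed; it assumes the test class's executor_type fixture and the mul helper, and uses assertLessEqual as suggested later in the review):

def test_idle_process_reuse_multiple(self):
    # Sketch only: cap the pool explicitly, wait on a couple of results so
    # those workers are provably idle, then assert an upper bound on the
    # process count instead of an exact value (avoiding the buildbot race).
    executor = self.executor_type(4)              # explicit max_workers
    try:
        executor.submit(mul, 12, 7).result()      # first worker finishes and goes idle
        executor.submit(mul, 33, 25)
        executor.submit(mul, 25, 26).result()     # at most one more worker is needed
        executor.submit(mul, 18, 29)
        self.assertLessEqual(len(executor._processes), 2)
    finally:
        executor.shutdown()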
I'll take a look at this again tomorrow, after the buildbot tests have completed.
@mrocklin, would you have any high-level opinion on this change?
From a performance perspective, I guess that the point here is that we're making it faster to create a ProcessPoolExecutor, but making the first few tasks (or the first few times when we have enough concurrent tasks) slower. There might be some cost in that we could have spawned many processes concurrently with greater speed than spawning them little by little during execution (maybe).

There are use cases where I would not be surprised if users wanted to create all of the processes at startup time, probably because they want more predictable performance behavior. It might be nice if there was some convenient method to ensure that there were as many worker processes as requested workers. The people who are likely to care about this, though, are probably sophisticated enough to map …

Anyway, in short, I don't have strong opinions either way. Personally, I had already assumed that the …
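As a rough illustration of the kind of warm-up workaround being alluded to, here is a hypothetical helper (not part of the PR) that assumes the on-demand behaviour described here and peeks at the private _max_workers and _processes attributes:

import time
from concurrent.futures import ProcessPoolExecutor

def _pause(seconds):
    time.sleep(seconds)

def warm_up(executor, hold=0.5):
    # Hypothetical helper: submit one short blocking task per requested worker.
    # While the tasks are all still running, no worker is marked idle, so the
    # on-demand logic has to spawn a new process for each submission.
    futures = [executor.submit(_pause, hold) for _ in range(executor._max_workers)]
    for f in futures:
        f.result()

if __name__ == "__main__":
    executor = ProcessPoolExecutor(max_workers=4)
    warm_up(executor)
    print(len(executor._processes))   # typically 4 after the warm-up
    executor.shutdown()

Submitting short blocking tasks, rather than instantly-returning ones, matters here: if each task finished before the next submit, the idle-worker reuse path could kick in and fewer processes would end up being spawned.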
In the issue discussion, there is also the case where one uses the default number of max workers (os.cpu_count()) …

The above would also apply to users with a smaller number of cores, but I think the most significant impact of this change would apply to users with a large number of cores.
It could be possible to consider an implementation of:

while (len(self._processes) < len(self._pending_work_items)
       and len(self._processes) < self._max_workers):
    p = self._mp_context.Process(
        target=_process_worker,
        args=(self._call_queue,
              self._result_queue,
              self._initializer,
              self._initargs))
    p.start()
    self._processes[p.pid] = p

The tricky part though is that the length of the …
I'd wager that's the current mentality of most users. Unless they've explored the internals of ProcessPoolExecutor extensively, they'd have no reason to assume it handled worker spawning any differently than ThreadPoolExecutor. This is reinforced by the docs, which suggest that it spawns up to max_workers processes rather than always spawning max_workers processes.
This looks basically good to me, just a couple comments.
* Use try-finally for executor.shutdown() in test_saturation
* Use assertLessEqual
* Simplify semaphore acquire

Co-authored-by: Antoine Pitrou <[email protected]>
Thanks for the review @pitrou, I believe I've addressed all of the comments. Would you care to give it a final look-over? This would be my first self-authored merge.
Thank you @aeros . I will merge now.
Ah, sorry, I hadn't thought about that.
No worries :)
bpo-39207: Spawn workers on demand in ProcessPoolExecutor (pythonGH-19453) was implemented in a way that introduced child process deadlocks in some environments due to mixing threading and fork() in the parent process.
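As background, a minimal standalone illustration (unrelated to the executor code itself, POSIX-only, and only a sketch of the general failure mode) of how holding a lock in one thread while another thread calls fork() leaves the child stuck:

import os
import threading
import time

lock = threading.Lock()

def background():
    # A helper thread takes the lock and holds it for a while.
    with lock:
        time.sleep(2)

threading.Thread(target=background, daemon=True).start()
time.sleep(0.1)                    # make it likely the lock is held at fork time

pid = os.fork()                    # only the forking thread exists in the child
if pid == 0:
    # The child inherits the lock in its locked state, but the thread that
    # would release it was not copied, so this would block forever without
    # the timeout.
    acquired = lock.acquire(timeout=1)
    print("child acquired lock:", acquired)   # False: the would-be deadlock
    os._exit(0)
else:
    os.waitpid(pid, 0)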
Roughly based on 904e34d, but with a few substantial differences.
/cc @pitrou @brianquinlan
https://bugs.python.org/issue39207
Automerge-Triggered-By: @pitrou