quickly return WorkerPool, add workers later #83
Comments
I like the concept of dynamically adding workers into a pool but unfortunately we're restricted by how Distributed.jl works at the moment. Attempting to do this with the existing Distributed.jl would definitely be painful and probably end up being very fragile. The concept however could warrant an iteration to Distributed.jl which could start out as an external package. Some basic thoughts on what changes would be made to the existing Distributed.jl stdlib:
Are you sure we can't dynamically add workers to a pool? It seems like (Though a redesign does sound good too!)
Here's the code that calls the You're expected to add workers to the If someone wants to look into this further that would be great. They may find something I've missed or at worst validate my assessment.
I may have thought of a workaround to this problem. If we define an alternative
Requesting many pods can trigger a scale-up, which takes time.
Currently this is handled with a timeout: any requested pods that do not come up and connect before the timeout are dropped, and `launch` returns with however many pods have connected by then. This can be awkward.

An alternative way to deal with slow spin-up is to return a WorkerPool quickly, perhaps as soon as one worker has connected, and to continue adding workers to that pool after returning.

To be practical, this method should have a worker initialization hook, so that workers only join the WorkerPool after `eval`ing some quoted code in `Main`, typically `using Packages` commands and other definitions.