(Re)implement threading comms #1156
Conversation
Hey @AngelFP! We'd be happy to formally support threading comms, and it's especially great that the implementation was easy since it's restoring code we recently removed :) Thanks! Either you or we can do this, but much of …
Sounds great! Note that the … If you are fine with using this PR, I can add back the relevant tests.
Hmm, I had a look at the old tests just now, and they require these classes that are not used anywhere else (and which I didn't have to restore). Would it be better to define new tests for …?
Sounds good to me! (Although looking at those tests, they aren't as substantial as I remembered; maybe that's okay since the comms are tested everywhere else.)
We can support this and see how well it works, but I would say use with caution. Threads share an environment, including the current working directory. Environment variables like CUDA_VISIBLE_DEVICES will be shared if set, unless we move that into a sub-process. These kinds of issues are why we did not use the threading interface in the past.
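A minimal sketch (not libEnsemble code) of the caveat above: environment variables and the working directory are process-wide state, so a change made by one worker thread is immediately visible to every other thread in the process.

```python
import os
import threading

seen = {}

def worker_a():
    # Intended only for this worker, but this mutates process-wide state.
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"

def worker_b():
    # Runs afterwards in the same process and observes worker_a's setting.
    seen["b"] = os.environ.get("CUDA_VISIBLE_DEVICES")

for fn in (worker_a, worker_b):
    t = threading.Thread(target=fn)
    t.start()
    t.join()

print(seen["b"])  # "0"
```

With sub-processes each worker would get its own copy of the environment, which is why per-worker GPU assignment via CUDA_VISIBLE_DEVICES normally relies on process isolation.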
Codecov Report

```diff
@@            Coverage Diff            @@
##           develop    #1156    +/-  ##
===========================================
- Coverage    71.51%   71.33%   -0.19%
===========================================
  Files           88       88
  Lines         7987     8065     +78
  Branches      1427     1433      +6
===========================================
+ Hits          5712     5753     +41
- Misses        2022     2055     +33
- Partials       253      257      +4
```
... and 3 files with indirect coverage changes
Makes sense. The idea from the optimas side would be to enable threading only for light function evaluations that do not require the allocation of resources (num_procs or num_gpus) nor the creation of simulation directories. So, basically, only for user-defined functions evaluated with the …
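A sketch (illustrative only, not optimas code) of the kind of evaluation threading would target: a cheap, pure-Python user-defined function with no process or GPU allocation and no simulation directories, where a thread pool is sufficient.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(params):
    # A light user-defined objective: no num_procs/num_gpus, no run directory.
    x = params["x"]
    return {"f": (x - 1.0) ** 2}

points = [{"x": x} for x in (0.0, 0.5, 1.0, 2.0)]

# Threads are fine here because evaluate only does in-process computation.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate, points))

print(results[2])  # {'f': 0.0}
```

Heavier evaluations that set per-worker environment variables or change directory would still need process-based workers for the isolation reasons discussed above.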
As discussed in past meetings, we are currently trying to support running Optimas on Jupyter notebooks to allow for a more interactive workflow. However, when testing this a couple of days ago I realized that this is currently not possible due to the use of `multiprocessing` for the local communications. Basically, `multiprocessing` and IPython/Jupyter don't play well together (especially on Windows and Mac, which use `spawn`), and I couldn't get optimas to run without significant workarounds.

The ideal solution would be for libEnsemble to support a `threading` local mode. I gave this a try and found that there was an early implementation of this in the `main` branch (removed in #1123). I updated the `QCommThread` class to be more similar to `QCommProcess` and exposed a new `local_threading` option in `libe_specs['comms']`. Using this option enables optimas to run in a notebook without any modifications or workarounds by the user.

There are still some issues in the current PR. For example, I had to set `log_comm=False` in the threading backend to prevent an infinite "loop" of logs that kept filling up the outbox. I suspect this arises because the workers now share the same memory. The shared memory might also lead to issues with how resources are currently assigned to workers.

Having this threading backend would be great, and the suggested implementation already solves the issues that we need to (urgently) address in optimas. Do you think it is possible to fully support this mode in libEnsemble?
As a bonus, I think this would enable an alternative solution to optimas-org/optimas#134, where we could define an option such as `libe_specs["run_persis_gen_on_thread"] = True` to run the generator on a thread (that is, it would be able to share the memory with the manager).