In the latest version of QM (21.06.04) I manually added `export OMP_NUM_THREADS=1` in the `~/.bashrc`. This was to remedy issues we have found with parallel (MPI) runs of the simulation codes (at least Quantum ESPRESSO and SIESTA); these work fine in isolation, but have issues in the full build (with all codes installed).
To finalise this, we still need to establish:
- whether this is sufficient to solve the problem and/or whether it has any negative impact on other codes;
- where this should be injected into the build automation (possibly as a task in `marvel-nccr.simulationbase`); see the sketch below.
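For illustration, this is roughly what the injection could look like as a plain shell step (a sketch only; the actual task would presumably live in the `marvel-nccr.simulationbase` role, and the exact mechanism shown here is an assumption):

```bash
# Sketch: idempotently add the setting to the user's ~/.bashrc.
# The grep guard avoids appending the line twice on repeated provisioning runs.
grep -qxF 'export OMP_NUM_THREADS=1' "$HOME/.bashrc" \
    || echo 'export OMP_NUM_THREADS=1' >> "$HOME/.bashrc"
```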
From email discussion:
@giovannipizzi:
A bit more debugging, and possibly the answer to the problem:
I tried to increase the number of CPUs available to the VM and discovered that in the problematic one the actual CPU usage (even when running with `mpirun -np 2`) was 400% rather than 200%; this hinted at concurrent usage of both OpenMP and MPI.
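For reference, one way to reproduce this kind of observation (a sketch; `pw.x` is the Quantum ESPRESSO executable, and the input/output file names are placeholders):

```bash
# Launch a 2-rank MPI run in the background, then look at its CPU usage.
mpirun -np 2 pw.x -in scf.in > scf.out &
top -b -n 1 | grep pw.x           # %CPU near 400 (not ~200) suggests extra OpenMP threads
ps -eLf | grep '[p]w.x' | wc -l   # total thread count across the two MPI ranks
```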
It seems that setting `export OMP_NUM_THREADS=1` before running in the problematic VM fixes the issue (@chris or @iurii, can you confirm?). If this is the case, we can set this in the `.bashrc`.
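In other words, the per-run workaround on the problematic VM amounts to something like (executable and input names are placeholders):

```bash
# Limit every process in this shell session to a single OpenMP thread,
# then launch the MPI run as before.
export OMP_NUM_THREADS=1
mpirun -np 2 pw.x -in scf.in > scf.out
```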
What was still puzzling is that setting various values of `OMP_NUM_THREADS` on the fast machine does not change the behaviour.
I then checked the list of linked dynamic libraries for the two executables; the difference was this single line: `libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0`
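A sketch of how such a comparison can be made with `ldd` (the `pw.x` executable and the output file names are placeholders):

```bash
# List the dynamically linked libraries of the executable on each VM,
# keeping only the library names so the two lists can be diffed cleanly.
ldd "$(which pw.x)" | awk '{print $1}' | sort > libs_slow.txt   # on the problematic VM
ldd "$(which pw.x)" | awk '{print $1}' | sort > libs_fast.txt   # on the fast VM
diff libs_slow.txt libs_fast.txt   # expected difference: libopenblas.so.0
```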
I tried to uninstall libopenblas as follows: `sudo apt-get remove libopenblas-base`
and now QE seems to be back to normal speed!
As a double check, I installed the package on the fast machine: `sudo apt-get update && sudo apt-get install libopenblas-base`
and indeed the `mpirun` runs went back to being very slow.
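To see which of the two states a given VM is in, the package status can be queried directly (a sketch):

```bash
# Query dpkg to see whether the OpenBLAS runtime package is currently installed.
dpkg -s libopenblas-base > /dev/null 2>&1 \
    && echo "libopenblas-base is installed" \
    || echo "libopenblas-base is not installed"
```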
CONCLUSION: it seems the problem is the presence of OpenBLAS, which will try to use as many OpenMP threads as possible unless `export OMP_NUM_THREADS=1` is set. @chris, can you confirm this behaviour?
Since OpenBLAS might be needed by some codes, I think the best approach is to add `export OMP_NUM_THREADS=1` in the `.bashrc`.
@chris: once this is confirmed, can you please summarise the results in the appropriate issues, so that we will find this information in a few years when this problem pops up again? :-)
And also see whether you can prepare a "fixed" Quantum Mobile for the future? (For Iurii's school it is probably OK to use the VM that already works, so there is no need for additional tests.)
@chrisjsewell:
OpenBLAS appears to be installed (only) by Fleur.
From https://groups.google.com/g/openblas-users/c/W6ehBvPsKTw/m/N0_nyMyYlS0J:
"When both OPENBLAS_NUM_THREADS and OMP_NUM_THREADS are set, OpenBLAS will use OPENBLAS_NUM_THREADS. If OPENBLAS_NUM_THREADS is unset, OpenBLAS will try to use OMP_NUM_THREADS."
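Given that precedence, a narrower alternative (a sketch only, not something tested in the VM; executable and input names are placeholders) would be to pin just OpenBLAS and leave `OMP_NUM_THREADS` untouched for codes that use OpenMP deliberately:

```bash
# Pin only OpenBLAS to one thread; OPENBLAS_NUM_THREADS overrides OMP_NUM_THREADS.
export OPENBLAS_NUM_THREADS=1
mpirun -np 2 pw.x -in scf.in > scf.out
```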
Nicola:
My understanding is that the MPI parallelization is already hard-coded in optimal ways, and having libraries spin off threads is actually counterproductive and interferes.
No idea whether this is still true.