
GPU enabled Python wheel #1383

Closed
ferdonline opened this issue Jul 27, 2021 · 11 comments · Fixed by #1452

@ferdonline (Member) commented Jul 27, 2021

Overview of the feature

Let the wheel distribution be compiled with GPU support.

This requires investigation on several fronts:

  • Which toolset / container images to build with
  • Which libraries need to be shipped with the wheel (the whole distribution? static NVIDIA libraries?)
@ferdonline (Member Author)

Currently, CoreNeuron has a hard dependency on MPI whenever MPI is enabled (i.e. the dependency is not dynamic), so binaries won't run if MPI is not present on the target system, which breaks the idea of a portable wheel.
Therefore, we will initially build GPU wheels with MPI disabled.
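
A minimal sketch of the CMake options this implies for the GPU wheel build (the same options appear in the full setup.py configuration later in this thread):

    # Sketch: GPU on, MPI fully off for the first GPU wheels
    -DCORENRN_ENABLE_GPU=ON \
    -DNRN_ENABLE_MPI=OFF \
    -DNRN_ENABLE_MPI_DYNAMIC=OFF \
    -DCORENRN_ENABLE_MPI=OFF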

@ferdonline (Member Author)

Initial wheels run fine on CPU but, on GPU, report:

leite@ldir01u13 ~$ nrniv-core -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring/ --gpu
INFO : Using neuron-nightly Package (Developer Version)
Failing in Thread:1
call to cuModuleGetGlobal returned error 500: Not found
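
CUDA error 500 is CUDA_ERROR_NOT_FOUND, returned here by cuModuleGetGlobal when the device image embedded in the binary does not contain the symbol the OpenACC runtime is looking for. A hedged diagnostic sketch, assuming the standard CUDA and NVHPC tools are available on the node:

    # Sketch: inspect the embedded device code and the GPU the runtime sees
    cuobjdump --list-elf ./bin/nrniv-core   # list device ELF images embedded in the binary
    nvaccelinfo                             # NVHPC tool: report the visible GPU and driver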

setup.py is setting the following CMake flags:

cmake3 $HOME/dev/nrn \
    -DNRN_ENABLE_CORENEURON=ON \
    -DNRN_ENABLE_INTERVIEWS=OFF \
    -DIV_ENABLE_X11_DYNAMIC=OFF \
    -DNRN_ENABLE_RX3D=OFF \
    -DNRN_ENABLE_MPI=OFF \
    -DNRN_ENABLE_MPI_DYNAMIC=OFF \
    -DNRN_ENABLE_PYTHON_DYNAMIC=ON \
    -DNRN_ENABLE_MODULE_INSTALL=OFF \
    -DNRN_ENABLE_REL_RPATH=ON \
    -DLINK_AGAINST_PYTHON=OFF \
    -DCORENRN_ENABLE_GPU=ON \
    -DCORENRN_ENABLE_MPI=OFF \
    -DCMAKE_C_COMPILER=nvc \
    -DCMAKE_CXX_COMPILER=nvc++ \
    -DCMAKE_INSTALL_PREFIX=$PWD/install
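
To check whether a build configured like this actually offloads at runtime, NVHPC's OpenACC runtime can trace kernel launches and data transfers through an environment variable; a small sketch (output format varies between NVHPC versions):

    # Sketch: trace OpenACC device activity at runtime (NVHPC)
    export NVCOMPILER_ACC_NOTIFY=3   # bit 1 = kernel launches, bit 2 = data transfers
    ./install/bin/nrniv-core -d tests/integration/ring --gpu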

@olupton (Collaborator) commented Jul 27, 2021

See BlueBrain/CoreNeuron#600 for dynamic MPI support in CoreNEURON.

@ferdonline (Member Author) commented Jul 27, 2021

We have been using a Docker image based on manylinux2014, with the NVIDIA HPC SDK 21.5 and CUDA 11.0 installed via RPMs. Compiling the sample demos and running them remotely worked.
However, the binaries we created always ended up with the aforementioned problem, across different scenarios, even when excluding CUDA code and using the bare CMake installation.

We then started building minimal setups on a fresh system and began getting good results (a shell sketch of the first setup follows the list):

  • manylinux2014 + NVIDIA HPC SDK 21.7 (+ built-in CUDA 11.0)
    Compiled CoreNeuron only, with the flags

      -DCORENRN_ENABLE_GPU=ON \
      -DCORENRN_ENABLE_MPI=OFF \
      -DCMAKE_C_COMPILER=nvc \
      -DCMAKE_CXX_COMPILER=nvc++

    Works ✅

  • Took a CentOS 7 based container image from NVIDIA
    Installed the necessary dev dependencies for NEURON + CoreNeuron
    Installed NEURON + CoreNeuron + GPU using cmake && make && make install
    Copied the install prefix to BB5
    (full steps in the Google Doc)
    Works too ✅
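
A hedged shell sketch of the first setup above; the NVIDIA repository URL, package name, and SDK install path are assumptions based on NVIDIA's published RPM install instructions:

    # Sketch: manylinux2014 container + NVIDIA HPC SDK installed via RPM
    # (repo URL, package name and SDK path are assumptions)
    docker run -it quay.io/pypa/manylinux2014_x86_64 bash
    yum install -y yum-utils
    yum-config-manager --add-repo https://developer.download.nvidia.com/hpc-sdk/rhel/nvhpc.repo
    yum install -y nvhpc-21-7
    export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.7/compilers/bin:$PATH
    nvc++ --version   # sanity check that the compilers are picked up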

@ferdonline (Member Author)

Update.

Good news.

  • Recreated the environment with NVIDIA HPC SDK 21.2, set up to build NEURON with CUDA 11.0
    Copied the cmake_install folder to a GPU node
    The GPU node features CUDA 11 and the nvhpc 21.5 compilers
    nrniv-core runs without problems ✅
    leite@ldir01u13 ~$ module use /gpfs/bbp.cscs.ch/home/olupton/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nvhpc-21.5-ic7c6b/modulefiles
    leite@ldir01u13 ~$ module load nvhpc
    leite@ldir01u13 ~$ module load unstable cuda/11.0.2
    leite@ldir01u13 ~$ cd nrn_test/cmake_install/
    leite@ldir01u13 ~/nrn_test/cmake_install$ ./bin/nrniv-core  -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring --gpu
    Info : 4 GPUs shared by 1 ranks per node
    ...
    psolve |=========================================================| t: 100.00 ETA: 0h00m02s
    Solver Time : 2.12607
    

We can proceed with a wheel :)
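
Once a wheel is produced, a smoke test on a GPU node could look like this sketch (the wheel filename and package name are illustrative, not final):

    # Sketch: install the GPU wheel and rerun the ring test (names illustrative)
    pip install ./wheelhouse/NEURON_gpu_nightly-*-manylinux2014_x86_64.whl
    nrniv-core -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring --gpu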

@olupton (Collaborator) commented Jul 27, 2021

With #1385 we may also be able to use 21.7 (but not 21.3 or 21.5, see https://forums.developer.nvidia.com/t/incorrect-cpu-results-with-pragma-acc-atomic-capture/184453).

@ferdonline (Member Author)

I am currently compiling with 21.2.
The wheel works well on GPU ✅
Testing on a normal (non-GPU) node, however, shows a problem with the serial version:

leite@bbpv2 ~$ module use /gpfs/bbp.cscs.ch/home/olupton/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nvhpc-21.5-ic7c6b/modulefiles
leite@bbpv2 ~$ module load nvhpc-
nvhpc-byo-compiler/21.5  nvhpc-nompi/21.5
leite@bbpv2 ~$ module load nvhpc-nompi/21.5
leite@bbpv2 ~$ nrniv-core -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring/ --gpu
INFO : Using neuron-nightly Package (Developer Version)

 ERROR : Enabled GPU execution but couldn't find NVIDIA GPU!

Aborted
[ ERROR ] Command returned 134.
leite@bbpv2 ~$ nrniv-core -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring/
INFO : Using neuron-nightly Package (Developer Version)

 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2020
 Version : 0.21.0 864b712 (2021-07-23 20:24:45 +0200)

 Additional mechanisms from files
 exp2syn.mod expsyn.mod halfgap.mod hh.mod netstim.mod passive.mod pattern.mod stim.mod svclmp.mod

 Memory (MBs) :             After mk_mech : Max 9.8555, Min 9.8555, Avg 9.8555
 Memory (MBs) :            After MPI_Init : Max 9.9414, Min 9.9414, Avg 9.9414
Segmentation fault
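
To narrow down this serial-mode segfault, a first backtrace could be captured along these lines (a generic sketch; assumes gdb is available on the node):

    # Sketch: get a backtrace from the serial crash
    gdb --args nrniv-core -d ~kumbhar/tmp/CoreNeuron/tests/integration/ring/
    (gdb) run
    (gdb) backtrace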

@ferdonline (Member Author) commented Jul 28, 2021

It would be nice to run with 21.7 libraries. @olupton, do you have them installed as well?

@ferdonline (Member Author)

As raised by Olli, we are hitting BlueBrain/CoreNeuron#599

@alkino (Member) commented Oct 13, 2021

An update on CoreNeuron dynamic MPI: it is now on master: BlueBrain/CoreNeuron@6342df2 (see the sketch below for what this could mean for the wheel configuration).
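
With dynamic MPI merged, the wheel configuration could in principle re-enable MPI. A hedged sketch, assuming CoreNeuron exposes a dynamic-MPI option analogous to NEURON's NRN_ENABLE_MPI_DYNAMIC (the CORENRN_ENABLE_MPI_DYNAMIC name is an assumption):

    # Sketch: re-enable MPI, loaded dynamically at runtime
    # (the CORENRN_ENABLE_MPI_DYNAMIC option name is an assumption)
    -DNRN_ENABLE_MPI=ON \
    -DNRN_ENABLE_MPI_DYNAMIC=ON \
    -DCORENRN_ENABLE_MPI=ON \
    -DCORENRN_ENABLE_MPI_DYNAMIC=ON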

@alexsavulescu linked a pull request Oct 18, 2021 that will close this issue