
Support for building GPU enabled wheel with CoreNEURON #1452

Merged: 48 commits into master, Oct 22, 2021
Conversation

@pramodk (Member) commented Sep 8, 2021

A fully functional, production-quality wheel still requires a bit more effort, but this is getting close. This PR summarises the current status, thanks to @ferdonline!

  • Add a separate Dockerfile that uses the existing neuronsimulator/neuron_wheel
    image but installs the additional NVIDIA HPC toolkit RPMs needed to build the GPU wheel
  • Update the wheel packaging scripts with new options to enable coreneuron+gpu
  • Update setup.py with additional options to enable the CMake coreneuron+gpu options
  • Remove the --bare option from the wheel build script
  • Update the README with the necessary instructions to build the image and create wheels
    (a rough usage sketch follows this list)
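For orientation, a rough sketch of how the workflow could look with these changes. The image tag, mount paths, and the exact arguments of build_wheels.bash are illustrative assumptions rather than the precise interface added by this PR; packaging/python/README.md has the real instructions.

    # Build the wheel-building image: the new Dockerfile layers the NVIDIA HPC
    # toolkit RPMs on top of the existing neuronsimulator/neuron_wheel image
    # (image tag and Dockerfile location are illustrative).
    docker build -t neuron_wheel_gpu -f packaging/python/Dockerfile packaging/python

    # Run the wheel build inside that image, asking for CoreNEURON + GPU support
    # (the coreneuron-gpu option name comes from this PR; other arguments are guesses).
    docker run --rm -v "$PWD":/root/nrn -w /root/nrn neuron_wheel_gpu \
        bash packaging/python/build_wheels.bash linux coreneuron-gpu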

TODOs before merge

cc: @ferdonline @alexsavulescu @iomaganaris

@pramodk pramodk marked this pull request as draft September 8, 2021 12:10
Review threads: packaging/python/README.md (resolved), packaging/python/build_wheels.bash (outdated, resolved)
@codecov-commenter commented Sep 8, 2021

Codecov Report

Merging #1452 (c3b944d) into master (09c4408) will increase coverage by 0.08%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1452      +/-   ##
==========================================
+ Coverage   41.11%   41.20%   +0.08%     
==========================================
  Files         550      550              
  Lines      110261   110262       +1     
==========================================
+ Hits        45336    45431      +95     
+ Misses      64925    64831      -94     
Impacted Files               | Coverage Δ
src/nrniv/nrncore_write.cpp  | 87.75% <100.00%> (+0.12%) ⬆️
src/modlunit/init.cpp        | 100.00% <0.00%> (ø)
src/parallel/bbs.cpp         | 64.55% <0.00%> (+1.68%) ⬆️
src/parallel/bbssrv2mpi.cpp  | 56.14% <0.00%> (+3.74%) ⬆️
src/nrnmpi/bbsmpipack.cpp    | 86.15% <0.00%> (+11.28%) ⬆️
src/parallel/bbsclimpi.cpp   | 50.58% <0.00%> (+22.09%) ⬆️
src/parallel/bbssrvmpi.cpp   | 44.57% <0.00%> (+27.71%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 09c4408...c3b944d.

@alkino (Member) commented Oct 12, 2021

Removing the RPMs: -6 GB
Cleaning the yum cache: -300 MB
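A minimal sketch of that cleanup, assuming the RPMs are downloaded into the image before being installed; the file names are illustrative, and the removal has to happen in the same Docker RUN layer as the install for the image size to actually shrink.

    # install the NVIDIA HPC toolkit RPMs needed for the GPU wheel
    yum install -y nvhpc-*.rpm
    # drop the downloaded RPM files (~6 GB) and the yum cache (~300 MB)
    rm -f nvhpc-*.rpm
    yum clean all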

ferdonline and others added 17 commits October 13, 2021 23:24
 * remove the gpu_wheel directory, move the Dockerfile
   to the top, remove the install scripts
 * remove the old --bare option
 * add coreneuron or coreneuron-gpu as a CLI
   option for the build_wheel.sh script
Update README.md with instructions to build the wheel
  as well as the image
Add new option in setup.py to enable coreneuron+gpu
Cleanup setup.py after rebase
Update coreneuron submodule to master
* do not propagate the NEURON MPI flag if the CoreNEURON
  MPI option is explicitly specified
* nrnivmodl uses CORENRNHOME env variable instead
  of CNRNHOE variable
* binwrapper.py sets CORENRNHOME same as NRNHOME
* update coreneuron to branch BlueBrain/CoreNeuron#634
 * corenrn_embedded_run now accepts the CoreNEURON MPI library to load
 * NEURON decides the name of the MPI library to be loaded based on
   the auto-detection it already has
 * NEURON passes the necessary path for dlopen
 * the rest of the workflow remains the same
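A rough illustration of the run-time wiring those commits describe, assuming the wheel layout shown in the logs below; the paths are placeholders and this is not verbatim from binwrapper.py.

    # binwrapper.py effectively exports the CoreNEURON prefix alongside NRNHOME
    export NRNHOME=/path/to/venv/lib/python3.8/site-packages/neuron/.data
    export CORENRNHOME="$NRNHOME"

    # NEURON picks the MPI flavour (mpich, openmpi, ...) from its existing
    # auto-detection and tells nrniv-core which CoreNEURON MPI shim to dlopen
    nrniv-core -d /path/to/coreneuron/data --mpi \
        --mpi-lib "$NRNHOME/lib/libcorenrnmpi_mpich.so"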
@pramodk pramodk marked this pull request as ready for review October 13, 2021 23:48
@pramodk (Member, Author) commented Oct 13, 2021

Some extra testing is required for the dynamic MPI aspects, but overall this is ready for review.

Edit: the wheel is built with nvhpc/21.2 and trying it on BB5 produces the following (with or without --gpu, but only when --mpi is used):

# allocate BB5 GPU node
# create virtualenv with python3.8

export OMP_NUM_THREADS=1
module load unstable cuda hpe-mpi  nvhpc python
pip install /gpfs/bbp.cscs.ch/home/kumbhar/tmp/wheelhouse/NEURON_nightly-8.0a693-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

nrniv-core -e 1 -d CoreNeuron/tests/integration/ring # WORKS
nrniv-core -e 1 -d CoreNeuron/tests/integration/ring --gpu #WORKS

# BUT --mpi 

# we use the hmpt library because it's mpich compatible
(v38) kumbhar@ldir01u09:~/tmp$ LD_PRELOAD=/gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/externals/2021-01-06/linux-rhel7-x86_64/gcc-9.3.0/hpe-mpi-2.22.hmpt-r52ypu/lib/libmpi.so nrniv-core -e 1 -d CoreNeuron/tests/integration/ring --mpi-lib /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/lib/libcorenrnmpi_mpich.so  --mpi
INFO : Using neuron-nightly Package (Developer Version)
 num_mpi=1
 num_omp_thread=1


 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2020
 Version : 1.0 2ffcc0b (2021-10-14 00:09:37 +0200)

 Additional mechanisms from files
 exp2syn.mod expsyn.mod halfgap.mod hh.mod netstim.mod passive.mod pattern.mod stim.mod svclmp.mod

 Memory (MBs) :             After mk_mech : Max 232.0508, Min 232.0508, Avg 232.0508
 Memory (MBs) :            After MPI_Init : Max 232.0508, Min 232.0508, Avg 232.0508
 Memory (MBs) :          Before nrn_setup : Max 232.2031, Min 232.2031, Avg 232.2031
 Setup Done   : 0.00 seconds
 Model size   : 84.36 kB
 Memory (MBs) :          After nrn_setup  : Max 232.2266, Min 232.2266, Avg 232.2266
GENERAL PARAMETERS
--mpi=true
--mpi-lib=/gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/lib/libcorenrnmpi_mpich.so
--gpu=false
--dt=0.025
--tstop=1

GPU
--nwarp=0
--cell-permute=0
--cuda-interface=false

INPUT PARAMETERS
--voltage=-65
--seed=-1
--datpath=CoreNeuron/tests/integration/ring
--filesdat=files.dat
--pattern=
--report-conf=
--restore=

PARALLEL COMPUTATION PARAMETERS
--threading=false
--skip_mpi_finalize=false

SPIKE EXCHANGE
--ms_phases=2
--ms_subintervals=2
--multisend=false
--spk_compress=0
--binqueue=false

CONFIGURATION
--spikebuf=100000
--prcellgid=-1
--forwardskip=0
--celsius=6.3
--mindelay=1
--report-buffer-size=4

OUTPUT PARAMETERS
--dt_io=0.1
--outpath=.
--checkpoint=

 Start time (t) = 0

 Memory (MBs) :  After mk_spikevec_buffer : Max 232.2266, Min 232.2266, Avg 232.2266
 Memory (MBs) :     After nrn_finitialize : Max 232.2266, Min 232.2266, Avg 232.2266

psolve |=========================================================| t: 1.00   ETA: 0h00m00s

Solver Time : 0.00101947


 Simulation Statistics
 Number of cells: 20
 Number of compartments: 804
 Number of presyns: 21
 Number of input presyns: 0
 Number of synapses: 21
 Number of point processes: 41
 Number of transfer sources: 0
 Number of transfer targets: 0
 Number of spikes: 0
 Number of spikes with non negative gid-s: 0
MPT ERROR: Rank 0(g:0) received signal SIGSEGV(11).
	Process ID: 26651, Host: ldir01u09.bbp.epfl.ch, Program: /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/nrniv-core
	MPT Version: HPE HMPT 2.22  03/31/20 16:17:35

MPT: --------stack traceback-------
MPT: Attaching to program: /proc/26651/exe, process 26651
MPT: [New LWP 26676]
MPT: [New LWP 26675]
MPT: [Thread debugging using libthread_db enabled]
MPT: Using host libthread_db library "/lib64/libthread_db.so.1".
MPT: 0x00007fffeb0ab1d9 in waitpid () from /lib64/libpthread.so.0
MPT: Missing separate debuginfos, use: debuginfo-install bbp-nvidia-driver-470.57.02-2.x86_64 glibc-2.17-324.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64
MPT: (gdb) #0  0x00007fffeb0ab1d9 in waitpid () from /lib64/libpthread.so.0
MPT: #1  0x00007fffed5ea3e6 in mpi_sgi_system (
MPT: #2  MPI_SGI_stacktraceback (
MPT:     header=header@entry=0x7fffffff9710 "MPT ERROR: Rank 0(g:0) received signal SIGSEGV(11).\n\tProcess ID: 26651, Host: ldir01u09.bbp.epfl.ch, Program: /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/nrniv-"...) at sig.c:340
MPT: #3  0x00007fffed5ea5d8 in first_arriver_handler (signo=signo@entry=11,
MPT:     stack_trace_sem=stack_trace_sem@entry=0x7fffe7ba0080) at sig.c:489
MPT: #4  0x00007fffed5ea8b3 in slave_sig_handler (signo=11,
MPT:     siginfo=<optimized out>, extra=<optimized out>) at sig.c:565
MPT: #5  <signal handler called>
MPT: #6  0x00007fffec89ecd2 in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #7  0x00007fffec8a2614 in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #8  0x00007fffec8921bc in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #9  0x00007fffec893cdb in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #10 0x00007fffecb19ccb in __pgi_uacc_cuda_unregister_fat_binary (
MPT:     pgi_cuda_loc=0x7fffed9b4a40 <__PGI_CUDA_LOC>) at ../../src/cuda_init.c:649
MPT: #11 0x00007fffecb19c6a in __pgi_uacc_cuda_unregister_fat_binaries ()
MPT:     at ../../src/cuda_init.c:635
MPT: #12 0x00007fffea12ece9 in __run_exit_handlers () from /lib64/libc.so.6
MPT: #13 0x00007fffea12ed37 in exit () from /lib64/libc.so.6
MPT: #14 0x00007fffea11755c in __libc_start_main () from /lib64/libc.so.6
MPT: #15 0x00000000004122d7 in _start ()
MPT: (gdb) A debugging session is active.
MPT:
MPT: 	Inferior 1 [process 26651] will be detached.
MPT:
MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
MPT: Detaching from program: /proc/26651/exe, process 26651
MPT: [Inferior 1 (process 26651) detached]

I think we have seen this problem before when there is an issue/incompatibility between the CUDA driver and the CUDA version. Next I am going to try using CUDA v11.0 for building the wheel. I think that's the problem (?).
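Assuming that mismatch theory, a couple of quick checks one could run on the node (not part of this PR): compare the driver-supported CUDA version reported by nvidia-smi against the libcudart that auditwheel bundled into the wheel (11.0.221 in the stack trace above). $VENV is a placeholder for the virtualenv prefix.

    # driver version and the highest CUDA runtime it supports
    nvidia-smi
    # which CUDA runtime library the wheel actually ships
    ls "$VENV/lib/python3.8/site-packages/NEURON_nightly.libs" | grep -i cudart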

Importantly, use MPI_SGI_vtune_is_running instead of
MPI_SGI_init to identify the HMPT library because we have
to distinguish between the MPT and HMPT versions.
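For context, a hedged way to see that distinction from the shell (a diagnostic aid, not what the code does internally; the library path is a placeholder): inspect which of the two SGI-specific symbols a given libmpi.so exports.

    # list the dynamic symbols of the HPE MPI library and look for the markers
    nm -D /path/to/hpe-mpi/lib/libmpi.so | grep -E 'MPI_SGI_(init|vtune_is_running)'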
coreneuron submodule update
  - correctly auth & upload
Update binwrapper to be aware of new package names for GPU
 See BlueBrain/mod2c#72
Enable upload of GPU nightly wheel
If we cancel the stuck build then we see:

    + nrniv -python -c 'import neuron; neuron.test(); quit()'
    Warning: no DISPLAY environment variable.
    --No graphics will be displayed.
    NEURON -- VERSION 8.0a-712-g5db9e33e3 HEAD (5db9e33e3) 2021-10-20
    Duke, Yale, and the BlueBrain Project -- Copyright 1984-2021
    See http://neuron.yale.edu/neuron/credits

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'neuron'

And after cancelling, the job continues until it gets killed.
It seems like the system python somehow gets selected by
mistake, but it is not clear how this is possible because we
are in a virtual env and everything seems to be the same.

Try passing -pyexe explicitly
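A minimal sketch of what that could look like in the wheel test step, reusing the command from the log above and assuming the virtualenv's interpreter is first on PATH:

    # point nrniv at the virtualenv's python explicitly instead of relying on detection
    nrniv -pyexe "$(command -v python3)" -python -c 'import neuron; neuron.test(); quit()'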
@alexsavulescu (Member) left a comment
🚀

@pramodk pramodk merged commit e68a7cd into master Oct 22, 2021
@alexsavulescu alexsavulescu deleted the epic/gpu_wheel branch January 18, 2022 19:44
pramodk added a commit that referenced this pull request Mar 16, 2022
`neuron-nightly*` package

As part of #1452, we changed exe wrapper for GPU but
introduced a bug where `neuron` package name was not checked
pramodk added a commit that referenced this pull request Mar 16, 2022
…-nightly*` package (#1706)

As part of #1452, we changed exe wrapper for GPU but introduced a
bug where `neuron` package name was not checked.
@alexsavulescu alexsavulescu mentioned this pull request Mar 22, 2022
15 tasks
@olupton olupton mentioned this pull request Jul 27, 2023
Development

Successfully merging this pull request may close these issues:

  • GPU enabled Python wheel
  • Integrating CoreNEURON under NEURON python wheel
6 participants