
Support for building GPU enabled wheel with CoreNEURON #1452

Merged: 48 commits into master, Oct 22, 2021
Conversation

@pramodk (Member) commented Sep 8, 2021

A fully functional, production-quality wheel still requires a bit more effort, but this is getting close. This PR summarises the current status, thanks to @ferdonline!

  • Add a separate Dockerfile that uses the existing neuronsimulator/neuron_wheel
    image but installs the additional NVIDIA HPC toolkit RPMs needed to build the GPU wheel
  • Update the wheel packaging scripts with new options to enable coreneuron+gpu
  • Update setup.py with additional options to enable the CMake coreneuron+gpu options
  • Remove the --bare option from the wheel build script
  • Update the README with the necessary instructions to build the image and create wheels
    (a rough usage sketch follows this list)
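For orientation, a rough sketch of how the workflow could look with these changes. The image tag, mount paths, and the exact arguments of build_wheels.bash are illustrative assumptions rather than the precise interface added by this PR; packaging/python/README.md has the real instructions.

    # Build the wheel-building image: the new Dockerfile layers the NVIDIA HPC
    # toolkit RPMs on top of the existing neuronsimulator/neuron_wheel image
    # (image tag and Dockerfile location are illustrative).
    docker build -t neuron_wheel_gpu -f packaging/python/Dockerfile packaging/python

    # Run the wheel build inside that image, asking for CoreNEURON + GPU support
    # (the coreneuron-gpu option name comes from this PR; other arguments are guesses).
    docker run --rm -v "$PWD":/root/nrn -w /root/nrn neuron_wheel_gpu \
        bash packaging/python/build_wheels.bash linux coreneuron-gpu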

TODOs before merge

cc: @ferdonline @alexsavulescu @iomaganaris

@pramodk pramodk marked this pull request as draft September 8, 2021 12:10
Review threads: packaging/python/README.md (resolved), packaging/python/build_wheels.bash (outdated, resolved)
@codecov-commenter commented Sep 8, 2021

Codecov Report

Merging #1452 (c3b944d) into master (09c4408) will increase coverage by 0.08%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1452      +/-   ##
==========================================
+ Coverage   41.11%   41.20%   +0.08%     
==========================================
  Files         550      550              
  Lines      110261   110262       +1     
==========================================
+ Hits        45336    45431      +95     
+ Misses      64925    64831      -94     
Impacted Files               | Coverage Δ
src/nrniv/nrncore_write.cpp  | 87.75% <100.00%> (+0.12%) ⬆️
src/modlunit/init.cpp        | 100.00% <0.00%> (ø)
src/parallel/bbs.cpp         | 64.55% <0.00%> (+1.68%) ⬆️
src/parallel/bbssrv2mpi.cpp  | 56.14% <0.00%> (+3.74%) ⬆️
src/nrnmpi/bbsmpipack.cpp    | 86.15% <0.00%> (+11.28%) ⬆️
src/parallel/bbsclimpi.cpp   | 50.58% <0.00%> (+22.09%) ⬆️
src/parallel/bbssrvmpi.cpp   | 44.57% <0.00%> (+27.71%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 09c4408...c3b944d.

@alkino (Member) commented Oct 12, 2021

Removing the RPMs: -6 GB
Cleaning the yum cache: -300 MB
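A minimal sketch of that cleanup, assuming the RPMs are downloaded into the image before being installed; the file names are illustrative, and the removal has to happen in the same Docker RUN layer as the install for the image size to actually shrink.

    # install the NVIDIA HPC toolkit RPMs needed for the GPU wheel
    yum install -y nvhpc-*.rpm
    # drop the downloaded RPM files (~6 GB) and the yum cache (~300 MB)
    rm -f nvhpc-*.rpm
    yum clean all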

ferdonline and others added 17 commits October 13, 2021 23:24
 * remove the gpu_wheel directory, move the Dockerfile
   to the top, remove the install scripts
 * remove the old --bare option
 * add coreneuron or coreneuron-gpu as a CLI
   option for the build_wheel.sh script
Update README.md with instructions to build the wheel
  as well as the image
Add new option in setup.py to enable coreneuron+gpu
Cleanup setup.py after rebase
Update coreneuron submodule to master
* do not propagate the NEURON MPI flag if the CoreNEURON
  MPI option is explicitly specified
* nrnivmodl uses CORENRNHOME env variable instead
  of CNRNHOE variable
* binwrapper.py sets CORENRNHOME same as NRNHOME
* update coreneuron to branch BlueBrain/CoreNeuron#634
 * corenrn_embedded_run now accepts the CoreNEURON MPI library to load
 * NEURON decides the name of the MPI library to be loaded based on
   the auto-detection it already has
 * NEURON passes the necessary path for dlopen
 * the rest of the workflow remains the same
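A rough illustration of the run-time wiring those commits describe, assuming the wheel layout shown in the logs below; the paths are placeholders and this is not verbatim from binwrapper.py.

    # binwrapper.py effectively exports the CoreNEURON prefix alongside NRNHOME
    export NRNHOME=/path/to/venv/lib/python3.8/site-packages/neuron/.data
    export CORENRNHOME="$NRNHOME"

    # NEURON picks the MPI flavour (mpich, openmpi, ...) from its existing
    # auto-detection and tells nrniv-core which CoreNEURON MPI shim to dlopen
    nrniv-core -d /path/to/coreneuron/data --mpi \
        --mpi-lib "$NRNHOME/lib/libcorenrnmpi_mpich.so"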
@pramodk pramodk marked this pull request as ready for review October 13, 2021 23:48
@pramodk (Member, Author) commented Oct 13, 2021

Some extra testing is required for the dynamic MPI aspects, but overall this is ready for review.

Edit: the wheel is built with nvhpc/21.2 and trying it on BB5 produces the following (with or without --gpu, but only when --mpi is used):

# allocate BB5 GPU node
# create virtualenv with python3.8

export OMP_NUM_THREADS=1
module load unstable cuda hpe-mpi  nvhpc python
pip install /gpfs/bbp.cscs.ch/home/kumbhar/tmp/wheelhouse/NEURON_nightly-8.0a693-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

nrniv-core -e 1 -d CoreNeuron/tests/integration/ring # WORKS
nrniv-core -e 1 -d CoreNeuron/tests/integration/ring --gpu #WORKS

# BUT --mpi 

# we use the hmpt library because it's mpich compatible
(v38) kumbhar@ldir01u09:~/tmp$ LD_PRELOAD=/gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/externals/2021-01-06/linux-rhel7-x86_64/gcc-9.3.0/hpe-mpi-2.22.hmpt-r52ypu/lib/libmpi.so nrniv-core -e 1 -d CoreNeuron/tests/integration/ring --mpi-lib /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/lib/libcorenrnmpi_mpich.so  --mpi
INFO : Using neuron-nightly Package (Developer Version)
 num_mpi=1
 num_omp_thread=1


 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2020
 Version : 1.0 2ffcc0b (2021-10-14 00:09:37 +0200)

 Additional mechanisms from files
 exp2syn.mod expsyn.mod halfgap.mod hh.mod netstim.mod passive.mod pattern.mod stim.mod svclmp.mod

 Memory (MBs) :             After mk_mech : Max 232.0508, Min 232.0508, Avg 232.0508
 Memory (MBs) :            After MPI_Init : Max 232.0508, Min 232.0508, Avg 232.0508
 Memory (MBs) :          Before nrn_setup : Max 232.2031, Min 232.2031, Avg 232.2031
 Setup Done   : 0.00 seconds
 Model size   : 84.36 kB
 Memory (MBs) :          After nrn_setup  : Max 232.2266, Min 232.2266, Avg 232.2266
GENERAL PARAMETERS
--mpi=true
--mpi-lib=/gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/lib/libcorenrnmpi_mpich.so
--gpu=false
--dt=0.025
--tstop=1

GPU
--nwarp=0
--cell-permute=0
--cuda-interface=false

INPUT PARAMETERS
--voltage=-65
--seed=-1
--datpath=CoreNeuron/tests/integration/ring
--filesdat=files.dat
--pattern=
--report-conf=
--restore=

PARALLEL COMPUTATION PARAMETERS
--threading=false
--skip_mpi_finalize=false

SPIKE EXCHANGE
--ms_phases=2
--ms_subintervals=2
--multisend=false
--spk_compress=0
--binqueue=false

CONFIGURATION
--spikebuf=100000
--prcellgid=-1
--forwardskip=0
--celsius=6.3
--mindelay=1
--report-buffer-size=4

OUTPUT PARAMETERS
--dt_io=0.1
--outpath=.
--checkpoint=

 Start time (t) = 0

 Memory (MBs) :  After mk_spikevec_buffer : Max 232.2266, Min 232.2266, Avg 232.2266
 Memory (MBs) :     After nrn_finitialize : Max 232.2266, Min 232.2266, Avg 232.2266

psolve |=========================================================| t: 1.00   ETA: 0h00m00s

Solver Time : 0.00101947


 Simulation Statistics
 Number of cells: 20
 Number of compartments: 804
 Number of presyns: 21
 Number of input presyns: 0
 Number of synapses: 21
 Number of point processes: 41
 Number of transfer sources: 0
 Number of transfer targets: 0
 Number of spikes: 0
 Number of spikes with non negative gid-s: 0
MPT ERROR: Rank 0(g:0) received signal SIGSEGV(11).
	Process ID: 26651, Host: ldir01u09.bbp.epfl.ch, Program: /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/nrniv-core
	MPT Version: HPE HMPT 2.22  03/31/20 16:17:35

MPT: --------stack traceback-------
MPT: Attaching to program: /proc/26651/exe, process 26651
MPT: [New LWP 26676]
MPT: [New LWP 26675]
MPT: [Thread debugging using libthread_db enabled]
MPT: Using host libthread_db library "/lib64/libthread_db.so.1".
MPT: 0x00007fffeb0ab1d9 in waitpid () from /lib64/libpthread.so.0
MPT: Missing separate debuginfos, use: debuginfo-install bbp-nvidia-driver-470.57.02-2.x86_64 glibc-2.17-324.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64
MPT: (gdb) #0  0x00007fffeb0ab1d9 in waitpid () from /lib64/libpthread.so.0
MPT: #1  0x00007fffed5ea3e6 in mpi_sgi_system (
MPT: #2  MPI_SGI_stacktraceback (
MPT:     header=header@entry=0x7fffffff9710 "MPT ERROR: Rank 0(g:0) received signal SIGSEGV(11).\n\tProcess ID: 26651, Host: ldir01u09.bbp.epfl.ch, Program: /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/nrniv-"...) at sig.c:340
MPT: #3  0x00007fffed5ea5d8 in first_arriver_handler (signo=signo@entry=11,
MPT:     stack_trace_sem=stack_trace_sem@entry=0x7fffe7ba0080) at sig.c:489
MPT: #4  0x00007fffed5ea8b3 in slave_sig_handler (signo=11,
MPT:     siginfo=<optimized out>, extra=<optimized out>) at sig.c:565
MPT: #5  <signal handler called>
MPT: #6  0x00007fffec89ecd2 in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #7  0x00007fffec8a2614 in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #8  0x00007fffec8921bc in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #9  0x00007fffec893cdb in ?? ()
MPT:    from /gpfs/bbp.cscs.ch/home/kumbhar/tmp/v38/lib/python3.8/site-packages/neuron/.data/bin/../../../NEURON_nightly.libs/libcudart-3f3c6934.so.11.0.221
MPT: #10 0x00007fffecb19ccb in __pgi_uacc_cuda_unregister_fat_binary (
MPT:     pgi_cuda_loc=0x7fffed9b4a40 <__PGI_CUDA_LOC>) at ../../src/cuda_init.c:649
MPT: #11 0x00007fffecb19c6a in __pgi_uacc_cuda_unregister_fat_binaries ()
MPT:     at ../../src/cuda_init.c:635
MPT: #12 0x00007fffea12ece9 in __run_exit_handlers () from /lib64/libc.so.6
MPT: #13 0x00007fffea12ed37 in exit () from /lib64/libc.so.6
MPT: #14 0x00007fffea11755c in __libc_start_main () from /lib64/libc.so.6
MPT: #15 0x00000000004122d7 in _start ()
MPT: (gdb) A debugging session is active.
MPT:
MPT: 	Inferior 1 [process 26651] will be detached.
MPT:
MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
MPT: Detaching from program: /proc/26651/exe, process 26651
MPT: [Inferior 1 (process 26651) detached]

I think we have seen this problem before when there is an issue/incompatibility between the CUDA driver and the CUDA version. Next I am going to try using CUDA v11.0 for building the wheel. I think that's the problem (?).
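Assuming that mismatch theory, a couple of quick checks one could run on the node (not part of this PR): compare the driver-supported CUDA version reported by nvidia-smi against the libcudart that auditwheel bundled into the wheel (11.0.221 in the stack trace above). $VENV is a placeholder for the virtualenv prefix.

    # driver version and the highest CUDA runtime it supports
    nvidia-smi
    # which CUDA runtime library the wheel actually ships
    ls "$VENV/lib/python3.8/site-packages/NEURON_nightly.libs" | grep -i cudart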

Importantly, use MPI_SGI_vtune_is_running instead of
MPI_SGI_init to identify the HMPT library because we have
to distinguish between the MPT and HMPT versions.
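For context, a hedged way to see that distinction from the shell (a diagnostic aid, not what the code does internally; the library path is a placeholder): inspect which of the two SGI-specific symbols a given libmpi.so exports.

    # list the dynamic symbols of the HPE MPI library and look for the markers
    nm -D /path/to/hpe-mpi/lib/libmpi.so | grep -E 'MPI_SGI_(init|vtune_is_running)'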
coreneuron submodule update
  - correctly auth & upload
Update binwrapper to be aware of new package names for GPU
 See BlueBrain/mod2c#72
Enable upload of GPU nightly wheel
If we cancel the stuck build then we see:

    + nrniv -python -c 'import neuron; neuron.test(); quit()'
    Warning: no DISPLAY environment variable.
    --No graphics will be displayed.
    NEURON -- VERSION 8.0a-712-g5db9e33e3 HEAD (5db9e33e3) 2021-10-20
    Duke, Yale, and the BlueBrain Project -- Copyright 1984-2021
    See http://neuron.yale.edu/neuron/credits

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'neuron'

And after cancelling, the job continues until it gets killed.
It seems like the system python somehow gets selected by
mistake, but it is not clear how this is possible because we
are in a virtual env and everything seems to be the same.

Try passing -pyexe explicitly
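A minimal sketch of what that could look like in the wheel test step, reusing the command from the log above and assuming the virtualenv's interpreter is first on PATH:

    # point nrniv at the virtualenv's python explicitly instead of relying on detection
    nrniv -pyexe "$(command -v python3)" -python -c 'import neuron; neuron.test(); quit()'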
@alexsavulescu (Member) left a comment
🚀

@pramodk pramodk merged commit e68a7cd into master Oct 22, 2021
@alexsavulescu alexsavulescu deleted the epic/gpu_wheel branch January 18, 2022 19:44
pramodk added a commit that referenced this pull request Mar 16, 2022
`neuron-nightly*` package

As part of #1452, we changed exe wrapper for GPU but
introduced a bug where `neuron` package name was not checked
pramodk added a commit that referenced this pull request Mar 16, 2022
…-nightly*` package (#1706)

As part of #1452, we changed exe wrapper for GPU but introduced a
bug where `neuron` package name was not checked.
@alexsavulescu alexsavulescu mentioned this pull request Mar 22, 2022
15 tasks
@olupton olupton mentioned this pull request Jul 27, 2023
Development

Successfully merging this pull request may close these issues:

  • GPU enabled Python wheel
  • Integrating CoreNEURON under NEURON python wheel
6 participants