
Reduce redundant GPU allocations #2393

Merged: 57 commits from mem/trim-gpu-allocations into arbor-sim:master on Oct 23, 2024

Conversation

@thorstenhater (Contributor) commented on Aug 26, 2024

Introduction

  • Do not allocate and fill storage for resetting ion concentrations if they are never changed.
    Reasoning: if concentrations are never changed, we never reset them and thus do not need
    to store the reset values (see the sketch after this list).
  • Remove a duplicate of the CV area in solver and shared state and pass a reference during solving
  • Remove redundant solution from GPU solver
  • Do not store / allocate diffusion values if not needed.
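
A minimal sketch of the idea behind the first bullet, in host-side C++; the type and flag names (`ion_flags`, `write_Xi`, ...) are illustrative assumptions and not Arbor's actual `shared_state` API. Reset buffers are allocated only when a mechanism can actually change the corresponding value, so read-only ions carry no extra per-CV storage.

```cpp
// Hypothetical sketch: names are illustrative, not Arbor's actual API.
#include <cstddef>
#include <vector>

struct ion_flags {
    bool write_Xi      = false;  // does any mechanism write the internal concentration?
    bool write_Xo      = false;  // does any mechanism write the external concentration?
    bool use_diffusion = false;  // is diffusion enabled for this ion?
};

struct ion_state {
    std::vector<double> Xi, Xo;              // concentrations, always present
    std::vector<double> reset_Xi, reset_Xo;  // reset values, only if the concentrations can change
    std::vector<double> Xd;                  // diffusive concentration, only if diffusion is used

    ion_state(std::size_t n_cv, const ion_flags& f, double xi0, double xo0):
        Xi(n_cv, xi0), Xo(n_cv, xo0)
    {
        // Allocate the reset copies only when the values can actually change;
        // constant concentrations never need to be restored.
        if (f.write_Xi)      reset_Xi.assign(n_cv, xi0);
        if (f.write_Xo)      reset_Xo.assign(n_cv, xo0);
        if (f.use_diffusion) Xd.assign(n_cv, xi0);
    }

    void reset() {
        // A missing buffer means the value never changed and needs no reset.
        if (!reset_Xi.empty()) Xi = reset_Xi;
        if (!reset_Xo.empty()) Xo = reset_Xo;
    }
};
```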

This saves, per CV:

  • 1 x 8B for cv_area unconditionally
  • 1 x 8B for Xd for each ion for which no diffusion is in use (majority of cases)
  • 2 x 8B for Xi for each ion (reset and init) if not written (reasonably often)
  • 2 x 8B for Xo for each ion (reset and init) if not written (majority of cases)
  • 1 x 8B for eX reset for each ion if not read (majority)
  • 1 x 8B for eX for each ion if not read (rarely)

In my standard benchmark, busyring with complex cells, this saves about 18% of the total GPU
allocation for the cell data (shared_state).
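
As a rough, illustrative tally (my own arithmetic, not a number taken from the PR): for a model with a single ion species where diffusion is unused, Xi and Xo are never written, and the eX reset value is not needed, the items above add up to

$$
\underbrace{8}_{\texttt{cv\_area}}
+ \underbrace{8}_{X_d}
+ \underbrace{2\times 8}_{X_i\ \text{reset/init}}
+ \underbrace{2\times 8}_{X_o\ \text{reset/init}}
+ \underbrace{8}_{e_X\ \text{reset}}
= 56\ \text{B per CV}.
$$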

This PR has become a bit of a mixed bag and also fixes a few additional things that came up during testing:

  • a bug in event handling on GPU
  • pybind11 stub generation

@jlubo (Collaborator) commented on Oct 10, 2024

This is nice!

The previously high memory consumption prevented my large network application with multi-compartment neurons from running on common GPUs (see here for the code of the single-compartment case; the code for the multi-compartment case is not public yet). The present PR reduces the GPU memory consumption for that application from well over 8 GB to roughly 100 MB.

Also, the full range of tests for my network application passes (for both the single-compartment and the multi-compartment case).

@thorstenhater (Author) commented

You can also try #2394, which takes the concept one step further.

@thorstenhater (Author) commented

Spack is failing because pybind11-stubgen has not been updated there. I made the corresponding PR here: spack/spack#47162

@boeschf (Contributor) left a comment


The conditional initialization and allocation of the different ion state arrays is not very easy to follow. There must be a better way to do this, but that's maybe a topic for another PR.

Review comments (all outdated/resolved) were left on:

  • spack/package.py
  • arbor/backends/multicore/shared_state.cpp
  • arbor/backends/gpu/shared_state.cpp
@boeschf (Contributor) commented on Oct 23, 2024

Can we add a Spack variant that disables the pybind11 stubgen? This would be consistent with the CMake build and would allow the Spack CI to still pass.

@boeschf (Contributor) left a comment


Thanks, looks good to me.

@thorstenhater merged commit b8b768d into arbor-sim:master on Oct 23, 2024
28 of 29 checks passed
@thorstenhater deleted the mem/trim-gpu-allocations branch on October 23, 2024 at 17:38