Basic GPU Support #118

Merged
merged 43 commits into from
Aug 21, 2023

nic-barbara
Member

Adding basic GPU support in accordance with #105.

@nic-barbara nic-barbara linked an issue Aug 14, 2023 that may be closed by this pull request
7 tasks
@nic-barbara nic-barbara marked this pull request as draft August 14, 2023 05:29
@nic-barbara
Member Author

The main change is removing scalar indexing when constructing parameterisations. Scalar indexing does not run on the GPU: each element access goes through the CPU, which can lead to very expensive data transfer back and forth between CPU and GPU.
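As a rough illustration of the difference (a minimal sketch, not code from this PR): the function names below are hypothetical, and plain CPU arrays are used so it runs without a GPU. With CUDA.jl, setting CUDA.allowscalar(false) should make the loop version throw an error when given a CuArray, while the broadcast version needs no element-by-element access.

```julia
using LinearAlgebra

# Hypothetical example: build a diagonal matrix with entries x[i]^2 + 1.

# Loop version: reads x and writes A one element at a time (scalar indexing).
function diag_param_loop(x::AbstractVector)
    n = length(x)
    A = zeros(eltype(x), n, n)
    for i in 1:n
        A[i, i] = x[i]^2 + 1
    end
    return A
end

# Vectorised version: a single broadcast over x, wrapped in Diagonal.
# No individual elements are indexed.
diag_param_vec(x::AbstractVector) = Diagonal(x .^ 2 .+ 1)

x = Float32[1, 2, 3]
A_loop = diag_param_loop(x)
A_vec = diag_param_vec(x)
```

Both versions give the same matrix on the CPU; the second is the kind of rewrite that avoids scalar indexing.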

@nic-barbara
Member Author

There's a scalar indexing warning popping up on the backward passes of DiffREN for LipschitzRENParams and GeneralRENParams only. The rest are all fine. Next step is to look for what these two have in common.

@nic-barbara
Member Author

This seemed like a good time to change the LBDN back-end a little. The number of hidden units in each layer, nh, is now stored as an NTuple rather than a Vector. This means it's immutable, and won't get loaded onto the GPU when we don't actually want it to be. I've kept the option of specifying nh as a vector in the constructor though, both for backwards compatibility and ease of use.
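A minimal sketch of this pattern, assuming a simplified stand-in type (LayerSizes is hypothetical and is not the package's actual parameter struct): the field is an NTuple, and an extra constructor converts a vector so existing calls keep working.

```julia
# Hypothetical stand-in for a parameter struct that stores layer sizes.
struct LayerSizes{N}
    nh::NTuple{N,Int}   # immutable, plain-data tuple of integers
end

# Convenience constructor: accept a vector and convert it to a tuple.
LayerSizes(nh::AbstractVector{<:Integer}) = LayerSizes(Tuple(Int.(nh)))

from_vector = LayerSizes([32, 16])   # vector input, as before
from_tuple = LayerSizes((32, 16))    # tuple input
```

Either call stores the same tuple, so code that reads nh does not need to care which form the caller used.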

@nic-barbara nic-barbara self-assigned this Aug 15, 2023
@nic-barbara
Member Author

There was a rather frustrating bug when evaluating the RENs on a GPU. In src/Base/acyclic_ren_solver.jl, we used similar(z_eq, 1, size(b,2)) to initialise the product term Di_zi. similar() returns an uninitialised array. On the CPU, the memory often happens to hold extremely small values that are effectively zero, but this is not guaranteed. On the GPU, similar() can return arrays containing basically anything, including NaN.

This was causing the REN to produce different outputs when evaluated multiple times with the same inputs. Sometimes, the outputs would be NaN. Constructing Di_zi = typeof(b)(zeros(Float32, 1, size(b,2))) seems to work much better.
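For reference, one common alternative idiom (a sketch only, not what this PR does) is to keep similar() for the array type and element type, then overwrite the contents explicitly with fill!. The helper name zero_row_like is hypothetical; plain CPU arrays are used here so the example runs without a GPU.

```julia
# Hypothetical helper: a 1 x size(b, 2) row of zeros with the same
# array type and element type as b. similar() only allocates; fill!
# then overwrites whatever happened to be in that memory.
zero_row_like(b::AbstractMatrix) = fill!(similar(b, 1, size(b, 2)), zero(eltype(b)))

b = rand(Float32, 3, 4)
Di_zi = zero_row_like(b)
```

This avoids hard-coding Float32 and never reads the uninitialised values.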

@nic-barbara nic-barbara marked this pull request as ready for review August 18, 2023 08:49
Member

@jclinton830 jclinton830 left a comment

Looks good!

@nic-barbara nic-barbara merged commit a4c3770 into main Aug 21, 2023
3 checks passed
@nic-barbara nic-barbara deleted the feature/gpu-support branch August 21, 2023 21:55