Basic GPU Support #118

Merged Aug 21, 2023 (43 commits)

Commits
9e6d254
Initial commit with minor edits
nic-barbara Jul 25, 2023
e3287f7
Merge branch 'main' of github.com:acfr/RobustNeuralNetworks.jl into f…
nic-barbara Aug 3, 2023
d48cbbc
Left a note
nic-barbara Aug 3, 2023
93d2ca3
Merged in main
nic-barbara Aug 4, 2023
053ff38
Merge branch 'feature/gpu-support' of github.com:acfr/RobustNeuralNet…
nic-barbara Aug 4, 2023
ddd645d
Bugfixes in test script
nic-barbara Aug 4, 2023
68bb696
Generalised functions for CuArrays, added test script
nic-barbara Aug 4, 2023
0627e10
Added GPU support for non-trainable structs
nic-barbara Aug 4, 2023
664fb0b
Switched to proper (hopefully?) usage of @functor
nic-barbara Aug 8, 2023
83ecbf6
Removed scalar indexing from REN construction to avoid GPU warnings
nic-barbara Aug 14, 2023
fa719df
Merge branch 'feature/gpu-support' of github.com:acfr/RobustNeuralNet…
nic-barbara Aug 14, 2023
0e0e61a
Removed scalar indexing from REN rrules to allow GPU support
nic-barbara Aug 14, 2023
cea70e3
BREAKING: Changed passivity parameter to non-keyword argument
nic-barbara Aug 14, 2023
dc2d341
Fixed bug with numerical conditioning on GPU when nu=ny
nic-barbara Aug 14, 2023
8b6a0e4
Simplified code calling identity matrix to make compatible with GPU
nic-barbara Aug 14, 2023
2dc965b
Added test script for all RENs on GPU
nic-barbara Aug 14, 2023
22c2472
Minor edits
nic-barbara Aug 14, 2023
567f801
Removed scalar GPU indexing warning at cost of adding inv(A)
nic-barbara Aug 14, 2023
f5775d4
Minor edits
nic-barbara Aug 14, 2023
0af7067
Made tests more efficient
nic-barbara Aug 15, 2023
e826a73
Added simple test for forward pass of LBDN models
nic-barbara Aug 15, 2023
8ceb6b3
Generalising LBDNs to work with CUDA arrays
nic-barbara Aug 15, 2023
ad273c3
Changed LBDN to store number of hidden layers in a tuple
nic-barbara Aug 15, 2023
748e0fc
Minor updates to documentation
nic-barbara Aug 15, 2023
1faf5e5
Removed scalar indexing in LBDN usage
nic-barbara Aug 15, 2023
e77ce3e
Added test files for LBDN, currently broken DiffLBDN backwards pass o…
nic-barbara Aug 15, 2023
d5bfd02
Fixed bug with scalar indexing in GPU backprop
nic-barbara Aug 16, 2023
138420c
Moved GPU scripts to tests folder
nic-barbara Aug 16, 2023
30f5c3f
Changed my mind...
nic-barbara Aug 16, 2023
d0d5bd4
Tested sandwich layer on GPU
nic-barbara Aug 16, 2023
e63d82c
Changed MNIST example to GPU
nic-barbara Aug 16, 2023
19441d6
Bug in REN makes model evaluation non-deterministic on GPU
nic-barbara Aug 16, 2023
a1808ee
Working on GPU observer example (work-in-progress)
nic-barbara Aug 16, 2023
be01897
Wrote testing scripts to help debug GPU training
nic-barbara Aug 17, 2023
3aefe6b
Minor changes
nic-barbara Aug 18, 2023
6b0f73e
Added another test script for debugging
nic-barbara Aug 18, 2023
754d096
Added equality checking for ExplicitRENParams
nic-barbara Aug 18, 2023
cfa467c
Minor edits
nic-barbara Aug 18, 2023
52aac15
Further isolated the error
nic-barbara Aug 18, 2023
8dd7ca4
Tidying up
nic-barbara Aug 18, 2023
97ee129
Fixed bug by removing similar() for D11
nic-barbara Aug 18, 2023
ae1f245
Pruned testing files, fixed observer demo
nic-barbara Aug 18, 2023
690ab8d
Merge pull request #121 from acfr/feature/gpu-support-debug
nic-barbara Aug 18, 2023
9 changes: 9 additions & 0 deletions examples/GPU/README.md
@@ -0,0 +1,9 @@
## Testing `RobustNeuralNetworks.jl` on a GPU

There is currently full support for using models from `RobustNeuralNetworks.jl` on a GPU. However, performance could be improved with further code optimisation, and we do not yet have CI testing on the GPU.

The scripts in this directory serve two purposes:
- They provide a means of benchmarking model performance on a GPU
- They act as unit tests to verify the models can be trained on a GPU

There is an [open issue](https://github.com/acfr/RobustNeuralNetworks.jl/issues/119) on improving the speed of our models on GPUs. Any and all contributions are welcome.
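
For reference, here is a minimal sketch of the pattern these scripts follow (assuming a working CUDA.jl installation and the constructor's default keyword arguments; the model sizes are arbitrary): a model is moved onto the GPU with Flux's `gpu` device function, then evaluated on device-resident data.

```julia
using CUDA, Flux, RobustNeuralNetworks

# Direct parameterisation of a dense LBDN, moved onto the GPU
model_ps = DenseLBDNParams{Float32}(2, [10, 5], 4, 10) |> gpu

# Build the explicit model and evaluate it on GPU data
model = LBDN(model_ps)
u = CUDA.randn(Float32, 2, 4)   # nu × batches input matrix
y = model(u)
```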
59 changes: 59 additions & 0 deletions examples/GPU/test_lbdn.jl
@@ -0,0 +1,59 @@
# This file is a part of RobustNeuralNetworks.jl. License is MIT: https://github.com/acfr/RobustNeuralNetworks.jl/blob/main/LICENSE

cd(@__DIR__)
using Pkg
Pkg.activate("../")

using BenchmarkTools
using CUDA
using Flux
using Random
using RobustNeuralNetworks

rng = Xoshiro(42)

function test_lbdn_device(device; nu=2, nh=[10, 5], ny=4, γ=10, nl=tanh,
                          batches=4, is_diff=false, do_time=true, T=Float32)

    # Build model
    model = DenseLBDNParams{T}(nu, nh, ny, γ; nl, rng) |> device
    is_diff && (model = DiffLBDN(model))

    # Create dummy data
    us = randn(rng, T, nu, batches) |> device
    ys = randn(rng, T, ny, batches) |> device

    # Dummy loss function
    function loss(model, u, y)
        m = is_diff ? model : LBDN(model)
        return Flux.mse(m(u), y)
    end
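    # Note: with is_diff=false, the LBDN is rebuilt from its direct
    # parameterisation inside the loss so that gradients flow through the
    # explicit-parameter construction; DiffLBDN performs this re-computation
    # internally on every forward pass instead.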

    # Run once to check it works, then time it
    print("Forwards: ")
    l = loss(model, us, ys)
    do_time && (@btime $loss($model, $us, $ys))

    print("Reverse: ")
    g = gradient(loss, model, us, ys)
    do_time && (@btime $gradient($loss, $model, $us, $ys))

    return l, g
end

function test_lbdns(device)

    d = device === cpu ? "CPU" : "GPU"
    println("\nTesting LBDNs on ", d, ":")
    println("--------------------\n")

    println("Dense LBDN:\n")
    test_lbdn_device(device)
    println("\nDense DiffLBDN:\n")
    test_lbdn_device(device; is_diff=true)

    return nothing
end

test_lbdns(cpu)
test_lbdns(gpu)
93 changes: 93 additions & 0 deletions examples/GPU/test_ren.jl
@@ -0,0 +1,93 @@
# This file is a part of RobustNeuralNetworks.jl. License is MIT: https://github.com/acfr/RobustNeuralNetworks.jl/blob/main/LICENSE

cd(@__DIR__)
using Pkg
Pkg.activate("../")

using BenchmarkTools
using CUDA
using Flux
using Random
using RobustNeuralNetworks

rng = Xoshiro(42)

function test_ren_device(device, construct, args...; nu=4, nx=5, nv=10, ny=4,
                         nl=tanh, batches=4, tmax=3, is_diff=false, T=Float32,
                         do_time=true)

    # Build the REN
    model = construct{T}(nu, nx, nv, ny, args...; nl, rng) |> device
    is_diff && (model = DiffREN(model))

    # Create dummy data
    us = [randn(rng, T, nu, batches) for _ in 1:tmax] |> device
    ys = [randn(rng, T, ny, batches) for _ in 1:tmax] |> device
    x0 = init_states(model, batches) |> device

    # Dummy loss function
    function loss(model, x, us, ys)
        m = is_diff ? model : REN(model)
        J = 0
        for t in 1:tmax
            x, y = m(x, us[t])
            J += Flux.mse(y, ys[t])
        end
        return J
    end

    # Run once to check it works, then time it
    print("Forwards: ")
    l = loss(model, x0, us, ys)
    do_time && (@btime $loss($model, $x0, $us, $ys))

    print("Reverse: ")
    g = gradient(loss, model, x0, us, ys)
    do_time && (@btime $gradient($loss, $model, $x0, $us, $ys))

    return l, g
end

# Test all types and combinations
γ = 10
ν = 10

nu, nx, nv, ny = 4, 5, 10, 4
X = randn(rng, ny, ny)
Y = randn(rng, nu, nu)
S = randn(rng, nu, ny)

Q = -X'*X
R = S * (Q \ S') + Y'*Y
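# Note: Q = -X'X is negative semi-definite by construction, and R satisfies
# R - S * (Q \ S') = Y'Y ⪰ 0, so (Q, S, R) should be a feasible choice of
# constraint matrices for GeneralRENParams.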

function test_rens(device)

    d = device === cpu ? "CPU" : "GPU"
    println("\nTesting RENs on ", d, ":")
    println("--------------------\n")

    println("Contracting REN:\n")
    test_ren_device(device, ContractingRENParams)
    println("\nContracting DiffREN:\n")
    test_ren_device(device, ContractingRENParams; is_diff=true)

    println("\nPassive REN:\n")
    test_ren_device(device, PassiveRENParams, ν)
    println("\nPassive DiffREN:\n")
    test_ren_device(device, PassiveRENParams, ν; is_diff=true)

    println("\nLipschitz REN:\n")
    test_ren_device(device, LipschitzRENParams, γ)
    println("\nLipschitz DiffREN:\n")
    test_ren_device(device, LipschitzRENParams, γ; is_diff=true)

    println("\nGeneral REN:\n")
    test_ren_device(device, GeneralRENParams, Q, S, R)
    println("\nGeneral DiffREN:\n")
    test_ren_device(device, GeneralRENParams, Q, S, R; is_diff=true)

    return nothing
end

test_rens(cpu)
test_rens(gpu)
64 changes: 64 additions & 0 deletions examples/GPU/test_sandwich.jl
@@ -0,0 +1,64 @@
# This file is a part of RobustNeuralNetworks.jl. License is MIT: https://github.com/acfr/RobustNeuralNetworks.jl/blob/main/LICENSE

cd(@__DIR__)
using Pkg
Pkg.activate("../")

using BenchmarkTools
using CUDA
using Flux
using Random
using RobustNeuralNetworks

rng = Xoshiro(42)

function test_sandwich_device(device; batches=400, do_time=true, T=Float32)

    # Model parameters
    nu = 2
    nh = [10, 5]
    ny = 4
    γ = 10
    nl = tanh

    # Build model
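    # √γ is applied twice in the chain below (at the input and mid-chain), so
    # the chain's overall Lipschitz bound is √γ⋅√γ = γ, since each SandwichFC
    # layer is 1-Lipschitz.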
    model = Flux.Chain(
        (x) -> (T(√γ) * x),     # convert √γ to T to avoid promoting the data to Float64
        SandwichFC(nu => nh[1], nl; T, rng),
        SandwichFC(nh[1] => nh[2], nl; T, rng),
        (x) -> (T(√γ) * x),
        SandwichFC(nh[2] => ny; output_layer=true, T, rng),
    ) |> device

    # Create dummy data
    us = randn(rng, T, nu, batches) |> device
    ys = randn(rng, T, ny, batches) |> device

    # Dummy loss function
    loss(model, u, y) = Flux.mse(model(u), y)

    # Run once to check it works, then time it
    print("Forwards: ")
    l = loss(model, us, ys)
    do_time && (@btime $loss($model, $us, $ys))

    print("Reverse: ")
    g = gradient(loss, model, us, ys)
    do_time && (@btime $gradient($loss, $model, $us, $ys))

    return l, g
end

function test_sandwich(device)

    d = device === cpu ? "CPU" : "GPU"
    println("\nTesting Sandwich on ", d, ":")
    println("--------------------\n")

    test_sandwich_device(device)

    return nothing
end

test_sandwich(cpu)
test_sandwich(gpu)
2 changes: 2 additions & 0 deletions examples/Project.toml
@@ -1,6 +1,7 @@
[deps]
BSON = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
ControlSystemsBase = "aaaaaaaa-a6ca-5380-bf3e-84a91bcd477e"
@@ -17,3 +18,4 @@
Revise = "295af30f-e4ad-537b-8983-00126c2a3abe"
RobustNeuralNetworks = "a1f18e6b-8af1-433f-a85d-2e1ee636a2b8"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
cuDNN = "02a925ec-e4fe-4b08-9a7e-0d78e3d38ccd"
Binary file modified examples/results/lbdn-mnist/dense_mnist.bson
Binary file modified examples/results/lbdn-mnist/lbdn_mnist.bson