
Running into a weird issue with the getting started example #323

Closed
sharabhshukla opened this issue Apr 11, 2024 · 2 comments

@sharabhshukla

I am running a simple example shared in a Julia forum here:

using ExaModels, MadNLP, CUDA

function luksan_vlcek_model(
    N;
    T = Float64,        # precision
    backend = nothing,  # GPU backends, e.g., CUDABackend()
    kwargs...           # solver options
)

    c = ExaCore(T, backend)
    x = variable(c, N; start = (mod(i, 2) == 1 ? -1.2 : 1.0 for i = 1:N))

    constraint(
        c,
        3x[i+1]^3 + 2 * x[i+2] - 5 + sin(x[i+1] - x[i+2]) * sin(x[i+1] + x[i+2]) + 4x[i+1] -
        x[i] * exp(x[i] - x[i+1]) - 3 for i = 1:N-2
    )
    objective(c, 100 * (x[i-1]^2 - x[i])^2 + (x[i-1] - 1)^2 for i = 2:N)

    return ExaModel(c; kwargs...)

end;

m = luksan_vlcek_model(10000; backend = CUDBackend())

result = madnlp(m; tol = 1e-8)

I am running into the following error:

ubuntu@gpu-cuda-server:~/julianlpcuda$ julia test_madnlp.jl
┌ Warning: ExaCore(T, backend) is deprecated. Use ExaCore(T; backend = backend) instead
└ @ ExaModels ~/.julia/packages/ExaModels/PyPin/src/nlp.jl:145
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations do not execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use allowscalar or @allowscalar
to enable scalar iteration globally or for the operations in question.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] errorscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
[3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
[4] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
[5] getindex
@ ~/.julia/packages/GPUArrays/OKkAu/src/host/indexing.jl:48 [inlined]
[6] macro expansion
@ ~/.julia/packages/MadNLP/RRGPv/src/matrixtools.jl:131 [inlined]
[7] macro expansion
@ ./simdloop.jl:77 [inlined]
[8] force_lower_triangular!(I::CuArray{Int32, 1, CUDA.Mem.DeviceBuffer}, J::CuArray{Int32, 1, CUDA.Mem.DeviceBuffer})
@ MadNLP ~/.julia/packages/MadNLP/RRGPv/src/matrixtools.jl:130
[9] create_kkt_system(::Type{MadNLP.SparseKKTSystem}, cb::MadNLP.SparseCallback{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ExaModel{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ExaModelsKernelAbstractions.KAExtension{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, Nothing, CuArray{Tuple{Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CUDABackend}, ExaModels.Objective{ExaModels.ObjectiveNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Node1{typeof(abs2), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}}, ExaModels.Var{ExaModels.ParSource}}}}, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}, Int64}}}, ExaModels.Compressor{Tuple{Int64, Int64, Int64}}, ExaModels.Compressor{NTuple{4, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}}, ExaModels.Constraint{ExaModels.ConstraintNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node2{typeof(^), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, Int64}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, Int64}, ExaModels.Node2{typeof(), ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(+), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node2{typeof(), ExaModels.Var{ExaModels.ParSource}, ExaModels.Node1{typeof(exp), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.ParSource}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, Int64}, ExaModels.Compressor{NTuple{10, Int64}}, ExaModels.Compressor{NTuple{17, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, Int64}}, MadNLP.MakeParameter{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}}, MadNLP.EnforceEquality}, ind_cons::@NamedTuple{ind_eq::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_ineq::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_fixed::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_lb::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_ub::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_llb::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, ind_uub::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}}, linear_solver::Type{UmfpackSolver}; opt_linear_solver::MadNLP.UmfpackOptions, hessian_approximation::Type)
@ MadNLP ~/.julia/packages/MadNLP/RRGPv/src/KKT/sparse.jl:218
[10] MadNLPSolver(nlp::ExaModel{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ExaModelsKernelAbstractions.KAExtension{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, Nothing, CuArray{Tuple{Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CUDABackend}, ExaModels.Objective{ExaModels.ObjectiveNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Node1{typeof(abs2), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}}, ExaModels.Var{ExaModels.ParSource}}}}, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}, Int64}}}, ExaModels.Compressor{Tuple{Int64, Int64, Int64}}, ExaModels.Compressor{NTuple{4, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}}, ExaModels.Constraint{ExaModels.ConstraintNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node2{typeof(^), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, Int64}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, Int64}, ExaModels.Node2{typeof(), ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(+), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node2{typeof(), ExaModels.Var{ExaModels.ParSource}, ExaModels.Node1{typeof(exp), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.ParSource}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, Int64}, ExaModels.Compressor{NTuple{10, Int64}}, ExaModels.Compressor{NTuple{17, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, Int64}}; kwargs::@kwargs{tol::Float64})
@ MadNLP ~/.julia/packages/MadNLP/RRGPv/src/IPM/IPM.jl:155
[11] MadNLPSolver
@ ~/.julia/packages/MadNLP/RRGPv/src/IPM/IPM.jl:115 [inlined]
[12] madnlp(model::ExaModel{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ExaModelsKernelAbstractions.KAExtension{Float64, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, Nothing, CuArray{Tuple{Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CUDABackend}, ExaModels.Objective{ExaModels.ObjectiveNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Node1{typeof(abs2), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}}, ExaModels.Var{ExaModels.ParSource}}}}, ExaModels.Node1{typeof(abs2), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(-), ExaModels.ParSource, Int64}}, Int64}}}, ExaModels.Compressor{Tuple{Int64, Int64, Int64}}, ExaModels.Compressor{NTuple{4, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}}, ExaModels.Constraint{ExaModels.ConstraintNull, ExaModels.SIMDFunction{ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(-), ExaModels.Node2{typeof(+), ExaModels.Node2{typeof(), Int64, ExaModels.Node2{typeof(^), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, Int64}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, Int64}, ExaModels.Node2{typeof(), ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node1{typeof(sin), ExaModels.Node2{typeof(+), ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, ExaModels.Node2{typeof(), Int64, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}, ExaModels.Node2{typeof(), ExaModels.Var{ExaModels.ParSource}, ExaModels.Node1{typeof(exp), ExaModels.Node2{typeof(-), ExaModels.Var{ExaModels.ParSource}, ExaModels.Var{ExaModels.Node2{typeof(+), ExaModels.ParSource, Int64}}}}}}, Int64}, ExaModels.Compressor{NTuple{10, Int64}}, ExaModels.Compressor{NTuple{17, Int64}}}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, Int64}}; kwargs::@kwargs{tol::Float64})
@ MadNLP ~/.julia/packages/MadNLP/RRGPv/src/IPM/solver.jl:10
[13] top-level scope
@ ~/julianlpcuda/test_madnlp.jl:24
in expression starting at /home/ubuntu/julianlpcuda/test_madnlp.jl:24

I have a VM with an RTX A6000 GPU and installed all the drivers that should be needed. Please point out anything that seems to be missing and might be causing this error.

julia> CUDA.versioninfo()
CUDA runtime 12.3, artifact installation
CUDA driver 12.3
NVIDIA driver 535.161.8, originally for CUDA 12.2

CUDA libraries:

  • CUBLAS: 12.3.4
  • CURAND: 10.3.4
  • CUFFT: 11.0.12
  • CUSOLVER: 11.5.4
  • CUSPARSE: 12.2.0
  • CUPTI: 21.0.0
  • NVML: 12.0.0+535.161.8

Julia packages:

  • CUDA: 5.2.0
  • CUDA_Driver_jll: 0.7.0+1
  • CUDA_Runtime_jll: 0.11.1+0

Toolchain:

  • Julia: 1.10.2
  • LLVM: 15.0.7

1 device:
0: NVIDIA RTX A6000 (sm_86, 47.532 GiB / 47.988 GiB available)

@sshin23
Member

sshin23 commented Apr 12, 2024

Oops, my bad. MadNLPGPU needs to be imported. Could you retry with using MadNLPGPU?
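Something along these lines should work (a minimal sketch, assuming MadNLPGPU is added to the environment alongside MadNLP and CUDA, and keeping the luksan_vlcek_model function from the snippet above):

using ExaModels, MadNLP, MadNLPGPU, CUDA  # MadNLPGPU enables MadNLP's GPU support

m = luksan_vlcek_model(10000; backend = CUDABackend())  # CUDABackend comes from CUDA.jl
result = madnlp(m; tol = 1e-8)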

@sharabhshukla
Author

@sshin23, indeed, thanks. It was missing MadNLPGPU and it works now. There was also one small typo: it should be CUDABackend instead of CUDBackend. Other than that it works like a charm; I changed N from 10000 to 100000 and it worked just as well.

Also, I have some questions about using this with the usual JuMP interface; can I open a discussion about it? The docs aren't that comprehensive at the moment.
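For anyone landing here, a rough sketch of the plain JuMP route (an assumption on my part: this uses MadNLP's MOI Optimizer directly and solves on the CPU, without going through ExaModels or the GPU backend):

using JuMP, MadNLP

# Hand a small JuMP model to MadNLP through its MOI wrapper.
model = Model(() -> MadNLP.Optimizer(print_level = MadNLP.INFO, tol = 1e-8))
@variable(model, x[1:2], start = 0.0)
@objective(model, Min, (x[1] - 1)^2 + (x[2] - 2)^2)
@constraint(model, x[1] + x[2] <= 2)
optimize!(model)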
