Fast vertical profile diagnostic #186
Since mean() works on the GPU, it's not difficult to implement, right? I feel that developing the abstraction so that it's trivial to implement kernels is the way to go. We'll benefit on many fronts. The FourierFlows strategy is to provide a tool that writes output from arbitrary functions (each paired with a symbol that gives the output a name). Then developers can add functions for common outputs (like TKE, dissipation, covariances), and users can flexibly pick and choose what they need, or write new functions if they need to. It could also make sense to have a tool that takes kernels as input for our case.
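A minimal sketch of the "named output functions" idea described above, written in plain Julia. The names `outputs`, `compute_outputs`, and the toy `model` are hypothetical and are not the FourierFlows or Oceananigans API; the sketch only illustrates pairing a symbol with a function of the model.

```julia
using Statistics

# Hypothetical registry: each output is a function of the model,
# paired with a Symbol that gives the output a name.
outputs = Dict{Symbol, Function}(
    :mean_T => model -> mean(model.T),
    :var_T  => model -> var(model.T),
)

# Evaluate every registered output and collect the results by name.
compute_outputs(outputs, model) = Dict(name => f(model) for (name, f) in outputs)

model = (T = [1.0, 2.0, 3.0, 4.0],)   # toy stand-in for a real model
results = compute_outputs(outputs, model)
```

Users could add their own entries to such a dictionary without touching the code that writes the output, which is the flexibility the comment is after.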
This is becoming a high-priority issue now. We don't have to worry about using a second core to compute diagnostics yet, but we should develop some reusable kernels to compute, e.g., vertical profiles, covariances, etc. Unfortunately taking the horizontal average of a field via [...] We must use a non-contiguous view to ignore halos, as including halos in the averaging would produce a wrong answer. This is what I've been doing for now, which works okay (no scalar ops) but not great:

using Statistics  # provides mean

# Returns a closure that computes the horizontal average of `field`.
function horizontal_avg(model, field)
    function havg(model)
        f = Array(ardata(field))  # copy the field data into a CPU Array
        return mean(f, dims=[1, 2])[:]
    end
    return havg
end

# Returns a closure that computes the horizontal average of the product f1 * f2.
function horizontal_avg(model, f1, f2)
    function havg(model)
        prod = Array(ardata(f1)) .* Array(ardata(f2))
        return mean(prod, dims=[1, 2])[:]
    end
    return havg
end

and some timings from a 256^3 simulation
Shouldn't be hard to code up some simple kernels that do this quickly with minimal allocations.
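As a rough illustration of such a kernel, here is a CPU-only sketch that averages over interior points of a halo-padded array and writes into a preallocated profile. The function name `horizontal_mean!` and the single halo width `H` on every side are assumptions made for this example; a GPU version would need an actual CUDA kernel with a parallel reduction, which is not shown.

```julia
# Hypothetical helper: horizontal mean over interior points only.
# `parent` is a 3D array padded with a halo of width H on every side;
# `profile` is a preallocated vector with one entry per interior level.
function horizontal_mean!(profile, parent, H)
    Nx, Ny, Nz = size(parent) .- 2H
    @inbounds for k in 1:Nz
        s = zero(eltype(profile))
        # i is the innermost loop so memory is traversed in column-major order
        for j in 1:Ny, i in 1:Nx
            s += parent[i + H, j + H, k + H]
        end
        profile[k] = s / (Nx * Ny)
    end
    return profile
end

# Usage: interior values equal the level index, halos hold a large dummy value.
H = 1
parent = fill(1000.0, 6, 6, 5)
for k in 1:3
    parent[2:5, 2:5, k + 1] .= k
end
profile = horizontal_mean!(zeros(3), parent, H)
```

The only allocation is the output vector, which can be reused across calls.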
This script: https://github.com/glwagner/ColumnModelOptimizationProject/blob/master/les/simple_flux.jl shows how to use the JLD2OutputWriter to calculate horizontal averages efficiently on the GPU. Edit: I missed your point about not including the halos.
Can you use mean on a view into the parent array on the GPU?
Note also that due to our current convention for fields and indexing, the "correct" horizontal average depends on both the field type and the boundary conditions.
Unfortunately no. It's still a non-contiguous view.
julia> using Statistics, CuArrays; C = rand(128, 128, 128) |> CuArray; @time CuArrays.@sync mean(C, dims=[1, 2]);
0.000575 seconds (221 allocations: 8.797 KiB)
julia> using Statistics, CuArrays; C = rand(128, 128, 128) |> CuArray; @time CuArrays.@sync mean(view(C, 2:127, 2:127, 2:127), dims=[1, 2]);
18.857220 seconds (10.15 M allocations: 342.416 MiB, 1.11% gc time)
A 33,000x slowdown lol.
Ah, good point, that would complicate things...
Interesting. One solution is to compute the mean including the halos, and then adjust the result (on the CPU), taking into account the horizontal boundary conditions + halo values.
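One way to read the suggestion above, sketched on the CPU with plain arrays: reduce over the full contiguous array (halos included), then subtract the halo contribution for each level. The helper name `interior_mean_from_full` and the uniform halo width `H` are hypothetical. This version reads the halo values directly; using the boundary conditions to infer them instead (for example, periodic halos that duplicate interior values) is not shown.

```julia
# Hypothetical helper: interior horizontal mean recovered from full-array sums.
function interior_mean_from_full(parent, H)
    Tx, Ty, Tz = size(parent)
    Nx, Ny, Nz = Tx - 2H, Ty - 2H, Tz - 2H
    # Contiguous reduction over the whole array, halos included.
    full = vec(sum(parent, dims=(1, 2)))
    profile = zeros(eltype(parent), Nz)
    for k in 1:Nz
        kk = k + H
        # Sum of the four horizontal halo strips at this level (corners counted once).
        halo = sum(view(parent, 1:H, :, kk)) +
               sum(view(parent, Tx-H+1:Tx, :, kk)) +
               sum(view(parent, H+1:Tx-H, 1:H, kk)) +
               sum(view(parent, H+1:Tx-H, Ty-H+1:Ty, kk))
        profile[k] = (full[kk] - halo) / (Nx * Ny)
    end
    return profile
end

# Usage on a small padded array with halo width 2.
H = 2
parent = reshape(collect(1.0:8*8*7), 8, 8, 7)
profile = interior_mean_from_full(parent, H)
```

On a GPU, the idea would be that only the full reduction runs on the device and only the thin halo strips need to be inspected on the CPU; whether that is actually faster would need to be measured.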
Hmmm, I wonder if it might be easier to contribute a version of [...]
@maleadt @vchuravy is it possible to efficiently calculate [...]
Nevermind, I think it's an open issue in CuArrays.jl: JuliaGPU/CuArrays.jl#68
We may want vertical profiles of many variables, e.g. u, v, w, T. Would be nice to have a diagnostic that does this efficiently, especially if we have very frequent diagnostics.
If it's literally every iteration then a CUDA kernel might be the way to go. But if it's like every 20-100+ iterations then it might be faster to copy stuff to the CPU and do a lot of extended on-the-fly analysis there (similar to what we do with asynchronous NetCDF output).
Not sure if the same diagnostic can handle products of fields, e.g. w'T'. That could be another diagnostic?
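For the product case, a small sketch of a horizontally averaged covariance profile, computed as the mean of the product minus the product of the means at each level. The helper name `horizontal_covariance` is hypothetical, and the sketch assumes both inputs are halo-free arrays defined at the same grid points; on a staggered grid one of the fields would first need interpolating, which is not handled here.

```julia
using Statistics

# Hypothetical helper: profile of <w'T'> = <wT> - <w><T>,
# where <.> is the horizontal mean at each vertical level.
function horizontal_covariance(w, T)
    wbar = mean(w, dims=(1, 2))
    Tbar = mean(T, dims=(1, 2))
    return vec(mean(w .* T, dims=(1, 2)) .- wbar .* Tbar)
end

# Usage: T is a linear function of w, so the covariance is twice the
# (uncorrected) horizontal variance of w at each level.
w = reshape(collect(1.0:8.0), 2, 2, 2)
T = 2 .* w .+ 1
wT = horizontal_covariance(w, T)
```

Because it takes two fields and returns one profile, this could share the same reduction machinery as the single-field average while remaining a separate diagnostic.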
cc @sandreza