Skip to content
Simon Byrne edited this page May 13, 2021 · 7 revisions

Design

Aims

  • separate data layout from interface
  • make it easier to use smaller pieces of code
  • make the code / physics easier to understand
  • separate vertical and horizontal where possible:
    • horizontal:
      • will be spectral element
      • use quad elements with LGL quadrature
      • duplicate values at colocated nodes (supporting both DG and CG)
      • either regular grid (LES box), cubed sphere (GCM) or potentially arbitrary conformal mesh (e.g. for ocean and land)
        • support for loading an arbitrary 2D grid from standard mesh formats (GMSH)
      • can be distributed
    • vertical
      • nodes will be vertical (i.e. radial on a sphere) aligned
      • should support a variety of discretizations (e.g. spectral element, FV, staggered grids)
      • support both duplicated or shared nodes for CG
      • implicit solves in the vertical
  • Support for 2D horizontal surfaces (e.g. boundary conditions, barotropic modes in ocean models, etc.)
  • Make interface more functional:
    • i.e. users should write functions that receive and return immutable objects, rather than in-place modification

First steps plan

  1. Data backends for 3D/2D/column/pancake
  • StructArray or roll our own?
  • Unit tests + GPU
  1. Horizontal grid structure
  • topology
  • metrics
  • DSS
  1. Horizontal operators
  • horiz gradient
  • horiz divergence
  • broadcasting
  1. Basic examples
  • Poisson equation / Heat equation
    • regular 2D mesh
    • irregular 2D mesh
  • Wave equation
  1. Column grid structure
  • 1D tests of column
  • 3D tests of both combined

Basic considerations

  • We only need to consider 2D topologies

    • i.e. connectivity is the same for all elements in a stack
  • Aim for deep atmosphere

    • optional support for shallow atmosphere shouldn't be too difficult: modify metric terms and Coriolis terms
  • Remainder model

    • becomes a splitting of the vertical
    • needs to only be vertical-only

Interfaces

  • column(values, i, j, h)

    • view that can be indexed by [k, v], returning a struct
    • used for everything vertical (divergence, gradient, integration, implicit solves, search up/down/bisect, interpolated view)
    • for 3d fields, this will act like a function
  • pancake(values, k, v, h)

  • view that can be indexed by [i,j], returning a struct

  • face(pancake, face#) => view indexed by [i] (for horz num fluxes)

  • opposing(face) will give the face opposing the element

  • elements (read only)

    • vizualization, VTK export
  • key assumptions:

    • columns never access horizontal data other than gradients, etc, which can be computed before being passed in
    • horizontal terms only act via derivatives (divergence / gradients / curl)
  • broadcasting (incl. 2D horz -> 3D fields), lazy variant

    • reduction (integrate 3D -> 2D)

potential layouts

(some we will want, flexibility to add more)

  • GPU: thread model 1 thread per (i,j) within an element

  • CPU: ?

    • 1 thread per pancake
    • 1 thread per column
  • i,j horizontal node indices

  • k vertical node index

  • field: field index (to be figured out)

  • v: vertical element index

  • h: horizontal element index (i.e. uniquely identifies an element stack) (a) integer for arbitrary mesh - how to handle real vs ghost elements? (b) (hx,hy) for regular grid (LES) (c) (hx,hy,hface) for cubed sphere

TODO

  • How to handle (expensive) column physics on duplicated columns?

  • primary -> secondary sharing mechanism:

  • duplicate work

  • duplicate across processes

  • ultimately depends on threading model

  • How to handle columns that use different vertical staggering?

  • Interpolation:

    1. what is h?
    • regular grid/cubed sphere we can compute a direct lookup
    • arbitrary mesh build some quad tree
    1. what is v?
    • linear lookup (ideally)
    1. compute coordinate in reference element
    2. barycentric interpolation
      • weighted average of all nodes in an element
      • if horizontal-only, can do it via a pancake

Communication

  • only certain fields require communication (e.g. tendencies, grads in DG)
  • can reuse communication buffers for efficiency
  • load data directly from recv buffers
  • data buffer

DG

opposing face 4 possible faces * 2 orientations

for a ghost element has only one face store whole faces contiguously (duplicate shared nodes if necessary)

  • elemtoelem[f,h] => if > 0, real element, if <0 then its the ghost face #
  • elemtoface[g,h] => if real element, 1:4 positive orienataion, -4:-1 flipped orientation; ghost face 1/-1

CG

store columns contiguous, based on the logical order for the sender

  • don't duplicate nodes unique numbering of columns on a given rank
  • if positive, real column, use linear indexing c = NqNq(h-1) + Nq*(j-1) + i
  • if < 0, ghost column, look up in the receive buffer

Stuff

  • Data: a value stored at each node

  • Grid: geometric information

    • physical location
    • metric terms
    • face normals (need for DG/boundaries); can be computed from metrics if required
    • stores its information using a Data object
  • Topology (move out of Grid)

    • connection between horizontal
      • DG opposing horizontal faces and orientation
      • CG which horizontal nodes are collacated (0 for internal, 1 for non-vertex faces, 2+ for vertices)
    • communication / rank partitioning
  • Field: Data + Grid

    • map
    • vertical reduce (integral)
    • broadcast + lazy broadcast (e.g. combine 3D field, 2D field, scalar)
    • grad, div, curl (split into horizontal/vertical)
    • use a StructArray (or something that acts like an array of structs)
      • main limitation: can only be something take spatial derivatives of
      • requires scalar multiplication + addition

Fields

idea field element type can be one of:

  • real scalars (floating point numbers)
  • vector coefficient types (basically coeffients of a vector space)
    • CartesianVector(x1,x2,x3)
    • SphericalCartesian(u,v,w)
    • Contravariant/Covariant wrt the reference element
    • Contravariant1/Contravariant2/Contravariant3 projections onto contravariant
    • to convert between these, you need geometry information SphericalCartesian(CartesianVector(1,2,3), geom)
  • tensors?
  • composite object (tuples or named tuples of the above)
    • support for Tuples/NamedTuples
    • add support for custom structs?
    • should not support vector coefficients to avoid e.g. adding two SphericalCartesian objects at different locations

Composable operations

Can we compose kernels by combining multiple operations, such as broadcasting and tensor product operations.

  1. gradient:

    1. extract/ compute X to shared memory
    2. synchronize
    3. tensor product
      A1 = (D ⊗ I) * X
      A2 = (I ⊗ D) * X
      
    4. combine
      ∂ξ∂x .* Covariant(A1,A2)
      
  2. strong divergence

    1. map to shared memory
      C1 = J .* contravariant1(V)
      C2 = J .* contravariant1(V)
      
    2. synchronize
    3. tensor on each matrix
      D1 = (D ⊗ I) * C1
      D2 = (I ⊗ D) * C2
      
    4. combine
      (D1 .+ D2) ./ J
      
  3. filtering / interpolation

    1. compute/load to shared memory
    2. synchronize
    3. tensor dimension 1 to shared meory
      A = (F ⊗ I) * X
      
    4. synchronize
    5. tensor dimension 2
      Y = (I ⊗ F) * A
      

Next steps

  1. add (optional) surface geometry data structure to mesh

    • IFH data structure: separate internal/boundary
    • DG we need at all faces
    • CG we only need at boundary faces
  2. horizontal boundary conditions for CG/DG

    • effectively we need a way to prescribe boundary fluxes
  3. Additional operators

  • Curl
  • Hyperdiffusion
    • scalars, vectors, tensors
  1. high-level CG/DG divergence function: would specify

    • input fields
    • flux function
    • numerical flux function for DG
    • boundary fluxes
    • additional options
      • integration mesh for overintegration
      • hyperdiffusion
  2. Mesh warping (dependent on 1)

  • flat horizontal to check metric terms
    • e.g. apply a sinusoid to x1
  • Cubed-sphere mesh
    • equiangular
    • S2 geometry
  1. Vertical column standalone (dependent 1) (Charlie, Dennis)

    • add Column data structure VF to DataLayouts
    • finite difference mesh
      • heights of edges
      • heights of centers (can be computed)
      • number of ghost points top & bottom
      • handle flipped indexing ocean/land top down, atmos bottom up
    • fields stored at centers or faces
    • stencils for interpolation / gradients
    • diffusion example
    • aim for Ekman layer, stable boundary layer (Ignacio, Yair)
  2. Boundary conditions for the vertical

    • Identify people (Charlie, ?)
    • Higher-level interface
    • Identify required functionality
  3. Integrated vertical/horizontal model

    • hybrid mesh
    • implicit solver for multiple columns
  4. Performance and Parallelism

  • Profiling/track allocations

  • CPU threading

    • may need to disable GC
    • volume: can @threads for h = 1:Nh
    • faces: will need to avoid race conditions
      • partition into non-overlapping face groups
      • duplicate work (matching current ClimateMachine model)
  • GPU support

    • CG Bicleky jet (Sriharsha)
  • Distributed Memory / MPI

    • what is the object that determines the distribution mechanism

      • topology at the lowest level
    • only need to worry about horizontal

    • Ghost exchange

      • DG ghost faces
      • CG ghost columns
      • (optional) Reusing shared buffers (e.g. for multistage RK)
      • (optional) Index directly into receive buffer
    • Allreduce to compute norms

    • Broadcast

Test cases to add

  • vector invariant bickley jet
  • wave / advection-diffusion
  • isentropic vortex
  • non Cartesian vectors

Initial GPU kernel development:

- Get bickley jet working with CG
    - Need specalized operators for DSS, divergence kernels
- Get bickley jet working with DG
    - Need specialized operators for DG weak divergence, numerical fluxes