Relationship between libraries
The cube-explorer project will integrate functionality from a number of libraries, including cartopy, iris, matplotlib and HoloViews. Determining the correct level of abstraction at which these libraries interact is the main purpose of the initial phase of this project. We therefore first summarize how HoloViews usually wraps and plots external data types, and then go into some concrete proposals on what functionality should be exposed where.
The HoloViews library usually provides very thin wrappers around some data structure. A HoloViews Element type wraps the data, determines the dimensionality of the data and then provides an API to work with the data, visualize it and transform it in different ways.
The core API required for plottable Elements is as follows:
- The constructor has to infer the dimensions from the data if none are supplied; otherwise it needs to confirm that the supplied dimensions exist.
- The `dimension_values` method provides a backend-agnostic API to access an array of values associated with a dimension.
- The `range` method computes the minimum and maximum of each dimension in an Element.
- The `__len__` method returns the number of samples in the object.
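The core API listed above can be illustrated with a minimal, pure-Python sketch. `ToyElement` and its columnar dictionary storage are hypothetical stand-ins chosen for illustration; this is not HoloViews code, only a toy showing the shape of the four requirements.

```python
class ToyElement:
    """Hypothetical stand-in for a plottable Element (not HoloViews code)."""

    def __init__(self, data, dimensions=None):
        # Assumed storage: a mapping of dimension name -> sequence of values.
        self.data = {name: list(values) for name, values in data.items()}
        if dimensions is None:
            # Infer the dimensions from the data when none are supplied.
            dimensions = list(self.data)
        else:
            # Otherwise confirm that every supplied dimension exists.
            missing = [d for d in dimensions if d not in self.data]
            if missing:
                raise ValueError(f"Dimensions not found in data: {missing}")
        self.dimensions = list(dimensions)

    def dimension_values(self, dimension):
        """Return the values along one dimension, whatever the storage is."""
        if dimension not in self.dimensions:
            raise KeyError(dimension)
        return list(self.data[dimension])

    def range(self, dimension):
        """Return the (minimum, maximum) of one dimension."""
        values = self.dimension_values(dimension)
        return (min(values), max(values))

    def __len__(self):
        """Return the number of samples in the object."""
        if not self.dimensions:
            return 0
        return len(self.data[self.dimensions[0]])


element = ToyElement({
    "longitude": [-10.0, 0.0, 10.0],
    "temperature": [280.5, 284.1, 279.9],
})
print(element.dimensions, len(element), element.range("temperature"))
```

A plotting class only needs these four entry points, which is why the wrapper around the underlying data structure can stay thin.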
Additionally some Elements offer the following API:
- `reindex`: Allows shuffling the dimensions of the data.
- `__getitem__` and `select` allow slicing the object either through dimension value slices/indices or a boolean array.
- `groupby` allows grouping the values in an Element by one or more dimensions, returning a container with the subgroups.
- `aggregate`/`reduce` allow aggregating over specific dimensions in an Element using numpy functions.
- `add_dimension`: Adds a new dimension to an existing Element.
- `sample` allows for regular subsampling into a Table.
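To make the semantics of some of these optional methods concrete, here is a small pure-Python sketch. `ToyTable`, its row-tuple storage and the exact method signatures are assumptions made for illustration and do not reproduce the HoloViews signatures; only `reindex`, `select`, `groupby` and `aggregate` are covered.

```python
from collections import defaultdict
from statistics import mean


class ToyTable:
    """Hypothetical columnar Element; each row is a tuple ordered like `dimensions`."""

    def __init__(self, dimensions, rows):
        self.dimensions = list(dimensions)
        self.rows = [tuple(row) for row in rows]

    def reindex(self, dimensions):
        """Reorder (or drop) dimensions."""
        idx = [self.dimensions.index(d) for d in dimensions]
        return ToyTable(dimensions, [tuple(row[i] for i in idx) for row in self.rows])

    def select(self, **selection):
        """Keep rows matching a value or a half-open (low, high) range per dimension."""
        def matches(row):
            for dim, spec in selection.items():
                value = row[self.dimensions.index(dim)]
                if isinstance(spec, tuple):
                    low, high = spec
                    if not (low <= value < high):
                        return False
                elif value != spec:
                    return False
            return True
        return ToyTable(self.dimensions, [row for row in self.rows if matches(row)])

    def groupby(self, dimension):
        """Return a container (here a dict) mapping each key to its subgroup."""
        i = self.dimensions.index(dimension)
        remaining = [d for d in self.dimensions if d != dimension]
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[i]].append(row[:i] + row[i + 1:])
        return {key: ToyTable(remaining, rows) for key, rows in groups.items()}

    def aggregate(self, by, value, function):
        """Collapse to one row per unique `by` key, applying `function` to `value`."""
        i, j = self.dimensions.index(by), self.dimensions.index(value)
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[i]].append(row[j])
        return ToyTable([by, value], [(key, function(vals)) for key, vals in groups.items()])


table = ToyTable(
    ["time", "latitude", "temperature"],
    [(0, 10.0, 281.0), (0, 20.0, 279.0), (1, 10.0, 283.0), (1, 20.0, 277.0)],
)
by_time = table.groupby("time")
mean_temp = table.aggregate("time", "temperature", mean)
print(sorted(by_time), mean_temp.rows)
```

Every method returns a new object of the same kind (or a container of them), which is what lets such operations be chained and composed.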
In terms of plotting code, HoloViews already implements plots for a wide range of Element types but does not natively support plotting onto projections. Existing plot types could be reused, and they are easily subclassed to support new behaviors.
Iris is a library for working with n-dimensional datasets with optional associated coordinate systems. It provides functionality to store and load data, metadata and coordinates in a Cube. The Cube itself offers a broad API to slice, aggregate and sample the data.
HoloViews should avoid reimplementing any of the slicing, aggregating or loading machinery, instead letting the Cube handle that. Similarly the plotting functions provided by Iris can be reused directly by HoloViews without having to reimplement the coordinate and coordinate reference system handling.
Cartopy provides an extension for other plotting libraries to support working with custom geographic projections. It integrates with matplotlib using the transform and projection arguments, which specify the coordinate system the data is situated in and the coordinate system the data is plotted in respectively.
The relationship between HoloViews and Cartopy is less clear: the HoloViews matplotlib plotting backend could theoretically handle both projections and transforms directly. However, the transform (or data coordinate system) should be tightly coupled to the actual data, so custom Element types that explicitly know about coordinate reference systems are likely preferable.
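The coupling argued for here can be sketched without any plotting dependency. In the toy below the data coordinate system travels with the Element, while the projection is chosen only at plot time. `ToyPoints`, `plot_keywords` and the use of plain strings in place of cartopy CRS objects are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyPoints:
    """Hypothetical Element that knows which coordinate system its data is in."""
    longitudes: Tuple[float, ...]
    latitudes: Tuple[float, ...]
    # Stand-in for a CRS object; None means "no coordinate system attached".
    crs: Optional[str] = None


def plot_keywords(element, projection):
    """Split the two roles: the axes get the projection, the artist gets the transform."""
    axes_kwargs = {"projection": projection}       # coordinate system to draw in
    artist_kwargs = {}
    if element.crs is not None:
        artist_kwargs["transform"] = element.crs   # coordinate system of the data
    return axes_kwargs, artist_kwargs


cities = ToyPoints((-0.1, 139.7), (51.5, 35.7), crs="PlateCarree")
axes_kwargs, artist_kwargs = plot_keywords(cities, projection="Robinson")
print(axes_kwargs, artist_kwargs)
```

Changing the projection never touches the Element, whereas the transform cannot be chosen without knowing the data; that asymmetry is the reason to store it on the Element.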
Since HoloViews provides a high-level wrapper around both the data structures and the plotting code, one of the core principles is that as much of the implementation as possible should be pushed upstream. This means that HoloViews should use iris plotting functions wherever it can, to avoid reimplementing the plotting code, and should avoid reimplementing the functionality the Cube already provides.
HoloViews provides a set of custom Element and plot types centered around wrapping and displaying Cube Elements. When working with cube-based data, wrap it in a Cube Element type; otherwise use standard Element types. The interface of the Cube types at first covers just the minimum requirements but can in future be expanded to allow operations on the data, including slicing and aggregation.
Pros:
- Good separation of concerns: HoloViews does not need to know anything about projections or transforms.
- Allows developing the interfaces first and then thinking about integrating them in a general way in HoloViews or Iris as appropriate.
- Easiest to implement.
Cons:
- Requires custom versions of most Element and plot types to support coordinate reference systems properly.
- Elements with and without coordinate systems cannot be mixed in an Overlay.
Extensions:
- After developing a cube-based interface in the cube-explorer project, develop a general interface in HoloViews to support ND data arrays, of which the Iris cube interface becomes one particular example (xarray being another example).
- Integrate the interface with Iris such that Cubes have a method to easily convert to HoloViews objects (as an example see the imagen project).
HoloViews provides custom Element types but we try to support custom projections within HoloViews itself as much as possible.
Pros:
- Does not require reimplementing custom plot classes in cube-explorer project.
- Would allow for regular and custom Element types to be mixed in an Overlay.
Cons:
- HoloViews itself would need to become more aware of both projections and transforms.
- Would require reimplementing some chunks of Iris plotting functions
One- and two-dimensional iris Cubes could be hooked up to a display hook directly, without wrapping them in a HoloViews object.
Pros:
- No wrapping allows methods on cubes to be used directly providing a more familiar interface to iris users.
Cons:
- Cannot easily be composed because HoloViews expects a certain API from its Elements. Cubes would have to mirror a significant portion of the HoloViews API, i.e. all the methods on `Dimensioned` objects.
Notes: 1D or 2D iris Cubes could still be made directly visible with a display hook that automatically wraps them in a corresponding Element type; higher-dimensional cubes would continue to just display a regular repr.
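The dispatch described in this note might look roughly like the following sketch. `FakeCube`, `Wrapped`, the `display` function and the particular mapping from dimensionality to Element type are hypothetical; a real hook would be registered with the notebook display machinery and would wrap actual iris Cubes.

```python
class FakeCube:
    """Hypothetical stand-in for an iris Cube; only carries a label and `ndim`."""

    def __init__(self, label, ndim):
        self.label = label
        self.ndim = ndim

    def __repr__(self):
        return f"<FakeCube {self.label!r} ndim={self.ndim}>"


class Wrapped:
    """Hypothetical stand-in for the Element a cube gets wrapped in."""

    def __init__(self, element_type, cube):
        self.element_type = element_type
        self.cube = cube


# Assumed mapping from cube dimensionality to an Element type name.
ELEMENT_FOR_NDIM = {1: "Curve", 2: "Image"}


def display(cube):
    """Wrap 1D and 2D cubes automatically; fall back to the plain repr otherwise."""
    element_type = ELEMENT_FOR_NDIM.get(cube.ndim)
    if element_type is None:
        return repr(cube)
    return Wrapped(element_type, cube)


profile = display(FakeCube("air_temperature", 1))
volume = display(FakeCube("air_temperature", 3))
print(profile.element_type, volume)
```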
Largely identical to Proposal I, but implemented as a general HoloViews data interface. HoloViews now allows both column-based and grid-based data to be exposed via interface classes. The Iris CubeInterface should just mirror the general API (and GridColumns in particular), leaving some methods unimplemented for now. As the project evolves we can consider exposing slicing, aggregation and sampling methods.
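The interface-class pattern can be sketched as a small registry that dispatches on the type of the wrapped data. Everything below is an illustrative toy: `Interface.register`, `for_data`, `DictInterface` and `FakeCube` are hypothetical names, and the method set is an assumption rather than the actual HoloViews interface API.

```python
class Interface:
    """Hypothetical base class listing what each data backend may be asked for."""

    registry = {}  # maps a data type to the interface class that handles it

    @classmethod
    def register(cls, datatype):
        def decorator(interface_cls):
            cls.registry[datatype] = interface_cls
            return interface_cls
        return decorator

    @classmethod
    def for_data(cls, data):
        for datatype, interface_cls in cls.registry.items():
            if isinstance(data, datatype):
                return interface_cls
        raise TypeError(f"No interface registered for {type(data).__name__}")

    @classmethod
    def values(cls, data, dimension):
        raise NotImplementedError(f"{cls.__name__}.values")

    @classmethod
    def length(cls, data):
        raise NotImplementedError(f"{cls.__name__}.length")

    @classmethod
    def aggregate(cls, data, dimension, function):
        raise NotImplementedError(f"{cls.__name__}.aggregate")


@Interface.register(dict)
class DictInterface(Interface):
    """Column-based backend: a dict of dimension name -> values."""

    @classmethod
    def values(cls, data, dimension):
        return list(data[dimension])

    @classmethod
    def length(cls, data):
        return len(next(iter(data.values()), []))

    @classmethod
    def aggregate(cls, data, dimension, function):
        return function(data[dimension])


class FakeCube:
    """Hypothetical stand-in for an iris Cube holding named coordinate values."""

    def __init__(self, coords):
        self.coords = coords


@Interface.register(FakeCube)
class CubeInterface(Interface):
    """Mirrors the general API; `aggregate` is deliberately left unimplemented."""

    @classmethod
    def values(cls, data, dimension):
        return list(data.coords[dimension])

    @classmethod
    def length(cls, data):
        return len(next(iter(data.coords.values()), []))


cube = FakeCube({"latitude": [10.0, 20.0, 30.0]})
interface = Interface.for_data(cube)
print(interface.__name__, interface.values(cube, "latitude"), interface.length(cube))
```

Because Element types only ever talk to the interface, a cube-backed object can be used wherever a dict-backed one can, and unimplemented operations fail loudly until they are exposed.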
Pros:
- Transparently allows using Cube data in non-geographic Element types.
- Conversions to default Element types immediately supported via the `.to` method.
- Can be extended to expose operations on the data on demand.
Cons:
- Same as Proposal I