Multiple representations of the same data #116

jerstlouis · 2020-03-27T16:49:54Z

At the Toulouse TC, the concept of multiple representations of the same data (identified by a single collection) was proposed (e.g. here: #62 (comment)) and welcomed by many. As an example, the same set of observations could be available as Features, as a Coverage, or through the SensorThings API.

Another example is 3D buildings data, which could be available as Features (e.g. based on the CityGML data model), as 3D meshes (whether organized in a bounding volume hierarchy as in 3D Tiles and i3s, or following a fixed tiling scheme as in CDB), or as a point cloud (e.g. as LAS, or as 3D Tiles .pnts or i3s). A service could provide one or more of these representations, and a client could choose the one it supports best or the one that is appropriate for a specific task. For example, a 2D-only client could show buildings in 2D, while a 3D client pointed to the same dataset would display them in 3D.

I believe this concept is highly valuable; however, Chapter 9 (Collections) does not currently describe or reflect these more recent developments.

cportele · 2020-03-28T09:45:08Z

I don't think there is anything in the current Common that excludes that, maybe except for the fuzziness in the terminology in the current draft. This comment is about terminology, too, but with architectural implications.

Since a key point of the OGC API standards is alignment with the Web, we should be consistent with Web terminology. Different "representations" of the same data/resource are selected using content negotiation (different media type, different language, etc.). See HTTP.

At least at first, this seems to be different from "representation" as used in this issue, since there is an underlying assumption that the different representations use different URIs, i.e. are different resources. However, what we have been doing in OGC API Features is splitting the collection into two resources, at least for feature collections. /collections/{collectionId} is the Feature Collection, but it does not embed the features in its representations (JSON, XML, HTML) and instead has typed links to each representation of the features at /collections/{collectionId}/items (which is a sub-resource). See also #105. Conceptually the two resources might also be seen as a single resource.

The current design with the split resources was selected for practical reasons during the March 2018 WFS3/STAC hackathon (see the issue). Before that we simply had the landing page (which included the list of collections and links to them), the feature collection at /{collectionId} and the feature at /{collectionId}/{featureId} based on the WFS 2.x REST binding and the work in the Geonovum testbed.

Suppose we ignore, for a moment, the practical reasons for the change mentioned in the issue and assume that we had stuck with the original path structure. Then we could have requested, for example, the following representations from /my-observations:

application/ogcapi-metadata+json (or similar, we are just using application/json for now): information about the collection
application/geo+json: the collection as features in GeoJSON
application/gml+xml: the collection as features in GML
image/tiff; application=geotiff: the collection as a GeoTIFF coverage
text/html: the collection so that a human can understand it (and crawlers)
etc.
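To make the single-resource scenario concrete, here is a rough sketch of how a server could pick one of the representations listed above from a client's Accept header. It is only an illustration: the `AVAILABLE` table, `parse_accept` and `negotiate` are hypothetical names, the matching is deliberately simplified (it ignores the media-range specificity rules of the HTTP specification and any parameter other than `q`), and nothing here is taken from an OGC API draft.

```python
# Representations that a hypothetical /my-observations resource could offer,
# based on the list above. The descriptions are illustrative only.
AVAILABLE = {
    "application/json": "information about the collection",
    "application/geo+json": "the collection as features in GeoJSON",
    "application/gml+xml": "the collection as features in GML",
    "image/tiff": "the collection as a GeoTIFF coverage",
    "text/html": "the collection for humans and crawlers",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_type, q))
    # sorted() is stable, so ranges with equal q keep the client's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def negotiate(header, available=AVAILABLE):
    """Pick the media type to serve, or None (a server would answer 406)."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            # Fall back to the first representation the server lists.
            return next(iter(available))
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "image/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return None
```

With this sketch, a request carrying `Accept: image/tiff; application=geotiff` would get the coverage representation and `Accept: application/geo+json` the feature representation, both from the same URI.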

In that sense, the use of the term "representation" is correct, but since we are using a different resource structure we need to be careful how we use the term. Maybe we should distinguish "views" (feature view, coverage view, tile/container view, etc) and then distinguish the different representations for each data view resource consistent with the HTTP spec?

Note that the practical reasons that led to the current resource structure are still valid. In fact, there would be even more reasons for collections with multiple views since, for example, paging/the limit parameter makes sense for feature content types, but less for coverage content types.

dblodgett-usgs · 2020-03-28T12:11:37Z

Great summary @cportele --

@jerstlouis You say:

the concept of multiple representations of the same data (identified by a single collection)

(emphasis on single collection mine)

I don't think the idea that a collection is analogous to a dataset has been established other than in the zeitgeist. Not saying it's totally wrong -- I see the logic -- but it has not been decided.

If an OGC-API instance were to provide access to one abstract dataset, then each OGC API-standard it conforms to would logically provide a different view of that dataset.

For now, since OGC API-Features provides access to one dataset (see Note in 7.1), I think this is a non-issue. In a sense, each collection is already a thematic view of a dataset. As more OGC API-standards are finalized, maybe we get to a place where we need to deal with this complexity in them, but I don't think handling this complexity right now is justified.

We certainly may find ourselves in need of a way to support multiple datasets, but I think there is plenty of (more valuable) scope to take on in defining building blocks focused on distribution of a single dataset. I use distribution intentionally -- as I think it's really important to remember that there will be other distributions of any dataset, i.e. an OGC API instance with multiple views of a dataset is just one distribution. Having an OGC API endpoint encompass multiple datasets would just get annoying and would lead to a bunch of hacks whenever other distributions of a dataset need to be listed.

jerstlouis · 2020-03-28T13:31:02Z

@cportele @dblodgett-usgs This issue is submitted in regard to what Chapter 9 - Collections currently says, where /collections/{collectionID} is still Common (Collections), and where one such collection available as vector features at /collections/{collectionID}/items could be alternatively available as a coverage (at /collections/{collectionID}/coverage), but another collection might not. It would thus not be correct to say that /collections/{collectionID} is (only) the Feature collection.

By representation, I meant the underlying data model, which I believe differs significantly from the media type selected through content negotiation (the Accept header etc.). This probably does need a better term to differentiate it from that other meaning, but I am not convinced that 'view' is the best term. Examples of such data models would be:

Vector features
Gridded raster coverage
3D Meshes

In addition, the data could be available in multiple specifications following the same data model (e.g. i3s and 3D Tiles are two community standards providing 3D Meshes). In addition to this, specific resources could be available as different media types (e.g. GeoTIFF, JPEG2K).

This issue is mostly about the discussion in Toulouse in the Coverage SWG, where it was proposed that a collection could be available as multiple representations of the same data, something that is not explicitly mentioned in Chapter 9. This came up in the 3D Pilot, where it was pointed out that nothing in there says that all resources after the /collections/{collectionID}/ portion of the path are representations or views of a specific data source (although one could infer that from Features). A Collection available using multiple specifications (representation / view) would have different resources available for each, e.g. /items for vector features, and a number of resources for coverages under /coverage, as in the current OGC API - Coverages draft.

I really disagree with 'tiles' being considered a separate view. Tiles make it possible to split data of whichever data model for caching & optimization purposes; tiling is not itself a specific representation of the data (i.e. the data is still raster, or vector, or a 3D mesh). A non-developer user really should not have to understand or have visibility into Tiles unless they really insist on looking under the hood and learning the mystical arts of TileMatrixSets etc. Tiles is a building block that can be used together with multiple data models. You can still make a service serving only tiles, but if you do so you are not embracing the concept of an integrated and unified OGC API. E.g. if you serve vector data, you would normally also offer items. If you serve raster data, you would offer e.g. a PNG output by BBOX/resolution (I am arguing that this is not OGC API - Maps if the service is not doing or pretending to be doing server-side rendering), and potentially also a more advanced OGC API - Coverage interface. I have also been suggesting that tiles can be used within a process daisy chain without the client having visibility into it, even if the client did not explicitly request tiles from the first hop along the chain.

dblodgett-usgs · 2020-03-28T14:28:45Z

@jerstlouis Do you have a proposal here that doesn't include the assumption that collections are analogous to a dataset? If not, I think we should table this issue until we've settled the numerous open issues around collections.

jerstlouis · 2020-03-28T14:54:03Z

@dblodgett-usgs As said above, this issue is filed in relation to the current Common Draft, which does assume that up to /collections/{collectionId} we are still in 'Common'. I agree we need to settle the numerous issues around collections, but I think the aspects identified here are important to consider while settling those issues.

joanma747 · 2020-03-29T09:25:57Z

I fully support @cportele's idea and would include the term "resource view" to express that a resource can be retrieved using "related resources" (often sub-resources in the path) that provide other views of the same geospatial resource (e.g. features, maps, tiles, ...) or descriptions of the geospatial resource (metadata, schemas, ...).

jerstlouis · 2020-03-29T14:49:53Z

@joanma747 I see all those examples you mentioned as distinct from different 'underlying representations' (which might be what we want to call views).
E.g. if the underlying representation is simple vector features (points, lines & polygons):

You can render this on a map, but you are just using the vector features representation (and could be layering this with other data sources on your map)
You could be retrieving these vector features as tiles
You could be accessing the metadata or schema for those vector features.

With the multiple underlying representations concept, you could access the data either as a feature collection or as a coverage, or as CityGML features or 3D Tiles meshes, and the client would not really know what the underlying representation is (e.g. whether the source of truth is vector or raster data).

dblodgett-usgs · 2020-06-04T19:46:52Z

We bottomed this out in #140. Need to reflect that outcome in the spec then can close.

cmheazel · 2020-08-14T21:53:06Z

@dblodgett-usgs #140 was closed through a pull request. Were the changes made sufficient to close this issue as well?

dblodgett-usgs · 2020-08-15T11:59:35Z

I think so -- but that's just my personal opinion. I think closing this and encouraging @jerstlouis to open a new issue in the context of the emerging part-2 would be the right path for this if you think it needs to be discussed more.

jerstlouis · 2020-08-24T14:00:15Z

@cmheazel @dblodgett-usgs
The Part 2 specifications should probably state explicitly, in an informative manner, that this approach is possible, i.e. that the same data can be offered through multiple "views" or "access mechanisms" using more than one OGC API (e.g. Features & Tiles, or Features & Coverages -- accessing the data as features, vector tiles, coverage tiles or as a coverage).

This is what this issue is about, so I would argue for simply making the change before closing it.

cmheazel · 2020-09-11T12:02:19Z

@jerstlouis I will make the recommended updates then close.

cmheazel · 2021-06-04T18:22:47Z

Given that we are developing standards for modular APIs, it is possible for an API implementation to conform to both API-Features and API-Coverages. If both Feature and Coverage views of a collection are supported, then /collections/{collectionId} will not be a unique path. So how does the client know which /collections/{collectionId} is a Feature Collection and which is a Coverage?

jerstlouis · 2021-06-04T18:31:23Z

@cmheazel The resolution of #140 is that the very same collection at /collections/mycollection can potentially be accessed both as a Coverage (/collections/mycollection/coverage) and as a Feature Collection (/collections/mycollection/items), i.e. using both OGC API - Features and OGC API - Coverages (and OGC API - Tiles, and OGC API - Maps, and OGC API - GeoVolumes...). The collection will have links with the appropriate relation type for each access mechanism supported by that particular collection.
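As a rough illustration of the link-based discovery described above, a client could inspect the links of the collection description to find out which access mechanisms a given collection supports. The sketch below is only an assumption about how that might look: the `ACCESS_MECHANISMS` table and the `access_mechanisms` helper are hypothetical, the sample collection document is made up, and the coverage relation type URI is used for illustration only (`items` is the relation type used by OGC API - Features).

```python
# Hypothetical table mapping link relation types to access mechanisms.
# "items" is the relation used by OGC API - Features; the coverage
# relation URI below is an assumption made for this example.
ACCESS_MECHANISMS = {
    "items": "OGC API - Features",
    "http://www.opengis.net/def/rel/ogc/1.0/coverage": "OGC API - Coverages",
}


def access_mechanisms(collection):
    """Map each access mechanism advertised by a collection to its href."""
    found = {}
    for link in collection.get("links", []):
        name = ACCESS_MECHANISMS.get(link.get("rel"))
        if name is not None and "href" in link:
            # Keep the first link seen for each access mechanism.
            found.setdefault(name, link["href"])
    return found


# Made-up collection description, as it might be returned from
# /collections/mycollection.
collection = {
    "id": "mycollection",
    "links": [
        {"rel": "self", "href": "/collections/mycollection"},
        {"rel": "items", "href": "/collections/mycollection/items"},
        {
            "rel": "http://www.opengis.net/def/rel/ogc/1.0/coverage",
            "href": "/collections/mycollection/coverage",
        },
    ],
}

for name, href in access_mechanisms(collection).items():
    print(f"{name}: {href}")
```

A collection that is only available as features would simply lack the coverage link, so the client never needs to guess from the path alone.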

cmheazel · 2021-06-15T15:02:18Z

@jerstlouis So to be clear, the resource at /collections/mycollection is independent of the type of collection (feature, coverage, map, etc.). The path /collections/mycollection/type (where type = coverage, items, etc.) returns a specific type of collection. This requires that:

we have a standard taxonomy of OGC resource collection types (coverage, feature, map, etc.)
we have a standard path element assigned to each resource collection type
each path element is assigned to one and only one resource collection type (subtypes are allowed)
we come up with a standard term for the concept of resource collection type
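The "one and only one" requirement above could be sketched as a small registry that refuses to assign a path element to two different resource collection types. This is purely illustrative: `PathSegmentRegistry`, its methods and the type names are hypothetical and do not correspond to any existing OGC register.

```python
class PathSegmentRegistry:
    """Assigns each path element after /collections/{collectionId}/ to one type."""

    def __init__(self):
        self._owners = {}

    def register(self, segment, collection_type):
        """Register a path element; reject it if another type already owns it."""
        owner = self._owners.get(segment)
        if owner is not None and owner != collection_type:
            raise ValueError(
                f"path element '{segment}' is already assigned to '{owner}'"
            )
        self._owners[segment] = collection_type

    def resolve(self, path):
        """Return the type for /collections/{collectionId}/{segment}, or None."""
        parts = [part for part in path.split("/") if part]
        if len(parts) < 3 or parts[0] != "collections":
            return None
        return self._owners.get(parts[2])


# Example registrations using the path elements mentioned in this thread.
registry = PathSegmentRegistry()
registry.register("items", "feature")
registry.register("coverage", "coverage")
```

Here `registry.resolve("/collections/mycollection/items")` returns `"feature"`, while `/collections/mycollection` itself resolves to `None`, reflecting that the collection resource is independent of any particular type.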

jerstlouis · 2021-06-15T15:15:45Z

@cmheazel I prefer to call it access mechanism for a collection, or a view on the collection of data, since a collection could support multiple access mechanisms/views.

I think what is important is that resources defined by OGC API specifications which can be attached to a collection are properly registered to avoid clashes when implementing them together in the same API.

Some resources (e.g. /collections/{collectionId}/metadata) might be useful with more than a single OGC API specification.

The Features Part for Schema will define /schema as well in addition to /items.

EDR already defines multiple queryType resources as /collections/{collectionId}/{queryType} (in a sense, each of these is probably a different access mechanism).

cmheazel · 2021-06-15T19:19:59Z

June 15, 2021 - What comes after the collection ID is a "View" or "access mechanism" for the resource. That is out of scope for Part 2. Resolve the proper term in a sprint - use the most intuitive term.

NOTUC

cmheazel · 2021-06-20T18:01:53Z

Re-wrote Section 2 - Scope
Added Section 7.2 - Views
Removed ATS for the /items path.

cmheazel · 2021-06-28T12:24:52Z

June 26, 2021 - close - NOTUC

joanma747 added the "Resources of Collections type" (issues related to the /collections path) and "Resources types" (issues related to resource types and taxonomy) labels Apr 20, 2020

cmheazel added the "Collections" (applicable to Collections; consider to use Part 2 instead) label May 11, 2020

dblodgett-usgs mentioned this issue May 21, 2020

Collections Discussion #140

Open

cmheazel self-assigned this Jul 27, 2020

cmheazel added the Close label Aug 16, 2020

pvgenuchten mentioned this issue Oct 1, 2020

Conneg by Profile opengeospatial/ogcapi-records#16

Closed

cmheazel added the "Progress: resolution agreed" label and removed the "Close", "Resources of Collections type" and "Resources types" labels Nov 16, 2020

cmheazel added the "help wanted" (extra attention is needed) label Jun 4, 2021

cmheazel added the "Progress: solution merged" label and removed the "Progress: resolution agreed" and "help wanted" labels Jun 20, 2021

cmheazel closed this as completed Jun 28, 2021
