initial dataset record proposal (WIP) #130
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
## Overview | ||
|
||
This document describes the fields needed for an OGC Record to describe a 'dataset'. A dataset is a | ||
"collection of data, published or curated by a single agent, and available for access or download in | ||
one or more serializations or formats" (from [DCAT](https://www.w3.org/TR/vocab-dcat-2/#dcat-scope)). In the | ||
geospatial domain, the items in a dataset are typically defined with the same properties and share higher-level metadata. In GIS, a | ||
dataset typically corresponds to a 'layer', and in the satellite world a dataset would be all the scene captures that | ||
come from the same sensor or constellation. It corresponds directly to what others call a "dataset series" (ESA, ISO 19115), | ||
"collection" (CNES, NASA), and "dataset" (JAXA, DCAT). | ||
|
||
The Dataset Record holds the metadata users need to find the data they are looking for. The data itself may be available as | ||
an OGC API Service, an older OGC W\*S Service, or an actual data file. | ||
|
||
A dataset record is an [OGC Record](ogc-record-geojson-spec.md) and uses exactly the same fields, but makes | ||
more of them required in order to describe more fully the metadata users need to understand the dataset. | ||
|
||
A Record is the GeoJSON equivalent of an [OGC Dataset Collection](https://github.com/cholmes/ogc-collection/blob/main/ogc-collection-spec.md) | ||
(todo: port this to be a proposal in Features API) that includes 'Dataset Fields', and shares almost all of the same fields. | ||
|
||
Dataset Records are represented in JSON format and are very flexible. Any JSON object that contains all the | ||
required fields is a valid Record. | ||
|
||
- Examples: | ||
- See this [example](./examples/record-meetlocaties-example.json) that contains more fields and links. | ||
- JSON Schema: TODO | ||
|
||
|
||
## Dataset Record Fields | ||
|
||
The core Record fields for a 'Dataset Record' remain the same as in the core [OGC Record](ogc-record-geojson-spec.md), with the | ||
exact same Item fields as [specified there](ogc-record-geojson-spec.md#item-fields). (TODO: Link to main spec when Peter's refactor lands) | ||
|
||
### Dataset Record Property Fields | ||
|
||
The property fields are where the Dataset Record has more requirements. It uses all the same core Record definitions, but adds | ||
more requirements and a couple of defaults. | ||
|
||
| Field Name | Required in Core Record | Required in Dataset Record | Description | | ||
|-------------------|-------------------------|----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| type | M | M | Denotes the resource type of the record. For the dataset record this is **required** to be `dataset`. | | ||
| title | M | M | A short descriptive human-readable one-line title for the Collection. | | ||
| description | O | M | Detailed multi-line description to fully explain the Collection. | | ||
| keywords | O | M | List of keywords describing the Collection. | | ||
| keywordsCodespace | O | O (defaults to XXX) | A reference to a controlled vocabulary used for the keywords property. | | ||
| language          | O                       | O (defaults to English)    | The natural language used for textual values (e.g. titles, descriptions) in the collection information.                                                     | ||
| externalId | O | O | Identifier for the Collection that is unique across the provider. | | ||
| publisher | O | M | The entity making the resource available. | | ||
| created | O | M | The date-time the collection represented by this record was created, formatted to [RFC 3339](https://tools.ietf.org/html/rfc3339#section-5.6). | | ||
| updated           | O                       | O                          | The date-time the collection represented by this record was last updated, formatted to [RFC 3339](https://tools.ietf.org/html/rfc3339#section-5.6). | ||
| themes | O | O | A knowledge organization system used to classify the resource. | | ||
is |
||
| formats | O | O | A list of available distributions for the resource. | | ||
how about if there is |
||
| contactPoint | O | M | An entity to contact about the resource. | | ||
Can this just be Also, does it overlap with What's the object definition (name, email, links, ...)
It does seem to overlap with publisher. I think the current best (only?) way to get a sense of the object definitions is to go to the examples. I think the spec would be a lot more usable if it described the fields in much more depth, and included all the info on the object structures, with in-line examples of the relevant snippet. |
||
| license | O | M | A legal document under which the resource is made available. | | ||
| rights | O | O | A statement that concerns all rights not addressed by the license such as a copyright statement. | | ||
| extent | O | M | Spatial and temporal extents. | | ||
| associations | O | M | A list of links for accessing the resource, links to other resources associated with this resource, etc. | | ||
what is the difference between |
||
| crs               | O                       | O (defaults to latlong)    | Coordinate reference system of the data represented by this collection.                                                                                     | ||
from records: what about a more complete |
||
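The "Required in Dataset Record" column above can be read as a simple checklist. The following is a minimal Python sketch of that reading, not part of the proposal: the function names are hypothetical, and it assumes that an empty string, list, or object counts as missing, which the table does not state.

```python
# Property names marked "M" in the "Required in Dataset Record" column.
REQUIRED_DATASET_PROPERTIES = (
    "type", "title", "description", "keywords", "publisher",
    "created", "contactPoint", "license", "extent", "associations",
)


def missing_dataset_properties(record):
    """Return the mandatory property names that are absent or empty."""
    properties = record.get("properties") or {}
    # Assumption: empty values are treated the same as absent ones.
    return [
        name for name in REQUIRED_DATASET_PROPERTIES
        if properties.get(name) in (None, "", [], {})
    ]


def is_dataset_record(record):
    """True if all mandatory properties are present and type is 'dataset'."""
    properties = record.get("properties") or {}
    return (
        properties.get("type") == "dataset"
        and not missing_dataset_properties(record)
    )


if __name__ == "__main__":
    partial = {"type": "Feature", "properties": {"type": "dataset", "title": "Demo"}}
    print(missing_dataset_properties(partial))
    print(is_dataset_record(partial))
```

A record that carries only `type` and `title` would be reported as missing the remaining eight mandatory properties, so a check like this could run before a record is published to a catalog.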
|
||
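The `created` and `updated` fields are specified as RFC 3339 date-times, so a date-only value such as `2021-02-05` would not qualify. A rough Python sketch of such a check follows; the helper name is hypothetical, and the check is deliberately loose: it does not range-check the UTC offset and it rejects leap seconds (`:60`), which RFC 3339 permits.

```python
import re
from datetime import datetime

# Shape of an RFC 3339 date-time: date, "T", time, optional fraction, zone.
_RFC3339_DATETIME = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt](\d{2}:\d{2}:\d{2})(\.\d+)?([Zz]|[+-]\d{2}:\d{2})"
)


def is_rfc3339_datetime(value):
    """Return True if value looks like an RFC 3339 date-time string."""
    if not isinstance(value, str):
        return False
    match = _RFC3339_DATETIME.fullmatch(value)
    if match is None:
        return False
    try:
        # Reject impossible calendar dates and times such as month 13.
        datetime.strptime(
            f"{match.group(1)}T{match.group(2)}", "%Y-%m-%dT%H:%M:%S"
        )
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("2018-02-05T08:15:56Z", "2021-02-05"):
        print(candidate, is_rfc3339_datetime(candidate))
```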
### Associations | ||
|
||
At least one association is required. This should link to the actual dataset. It could link to OGC API (or OGC W\*S) interfaces to the data, it | ||
could link directly to the source format file for the dataset, or ideally it is a combination: several OGC services and a link to the source data. | ||
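The "at least one association" rule could be checked along these lines. This is only a sketch: the function name is hypothetical, the requirement that every association carries an `href` is an assumption based on the example record in this pull request, and no `rel` values are enforced because the common rel types are still to be defined.

```python
def association_problems(properties):
    """Return a list of human-readable problems with 'associations'."""
    associations = properties.get("associations")
    if not isinstance(associations, list) or not associations:
        return ["at least one association is required"]
    problems = []
    for index, link in enumerate(associations):
        # Assumption: each association is an object with a non-empty href.
        if not isinstance(link, dict) or not link.get("href"):
            problems.append(f"associations[{index}] has no href")
    return problems


if __name__ == "__main__":
    # Shape taken from the example record; the URL is a placeholder.
    properties = {
        "associations": [
            {"href": "https://example.com/wms", "rel": "item", "type": "OGC:WMS"},
            {"rel": "item", "type": "OGC:WFS"},
        ]
    }
    print(association_problems(properties))
```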
|
||
TODO: Flesh out common rel types for ogc api links, source data file links, etc. | ||
|
||
### Dataset definition ideas (Work in Progress) | ||
|
||
From STAC: a set of assets that are defined with the same properties and share higher level metadata. In the satellite world these would typically all come from the same sensor or constellation. It corresponds directly to what others call a "dataset series" (ESA, ISO 19115), "collection" (CNES, NASA), and "dataset" (JAXA, DCAT). So if all your Items have the same properties, they probably belong in the same Collection. | ||
|
||
We should also reference vector dataset ideas: how a dataset maps to a 'layer', how it can be a coverage, etc. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
{ | ||
"id": "de0b0f94-aadb-4db4-a1b5-9a656810682c", | ||
"type": "Feature", | ||
"created": "2021-02-05", | ||
"updated": "2021-02-21T00:14:33Z", | ||
"geometry": { | ||
"type": "Polygon", | ||
"coordinates": [ | ||
[ | ||
[ | ||
4.52767, | ||
51.58673 | ||
], | ||
[ | ||
4.52767, | ||
52.0393 | ||
], | ||
[ | ||
6.18931, | ||
52.0393 | ||
], | ||
[ | ||
6.18931, | ||
51.58673 | ||
], | ||
[ | ||
4.52767, | ||
51.58673 | ||
] | ||
] | ||
] | ||
}, | ||
"properties": { | ||
"created": "2018-02-05T08:15:56Z", | ||
"updated": "2021-02-21T00:14:33Z", | ||
"type": "dataset", | ||
"title": "Meetlocaties waterkwantiteit Waterschap Rivierenland", | ||
"description": "Binnen het waterschap worden veel oppervlaktewaterpeilen en grondwaterstanden gemeten. De kaart toont waar metingen plaatsvinden en wat voor type meting plaats vindt. Voor meetgegevens of overige informatie over waterkwantiteit kunt u contact opnemen met het waterschap via [email protected]", | ||
"contactPoint": "Waterschap Rivierenland, [email protected]", | ||
"associations": [ | ||
{ | ||
"href": "https://kaarten.wsrl.nl/arcgis/services/Kaarten/Meetlocaties_Waterkwantiteit_WMS_WFS_OD/MapServer/WMSServer?request=GetCapabilities&service=WMS", | ||
"rel": "item", | ||
"type": "OGC:WMS" | ||
}, | ||
{ | ||
"href": "https://kaarten.wsrl.nl/arcgis/services/Kaarten/Meetlocaties_Waterkwantiteit_WMS_WFS_OD/MapServer/WFSServer?request=GetCapabilities&service=WFS", | ||
"rel": "item", | ||
"type": "OGC:WFS" | ||
} | ||
], | ||
"externalId": "meetlocaties-waterschap-rivierenland", | ||
"themes": [ | ||
{ | ||
"concepts": [ | ||
"waterkwaniteit peil grondwaterstand peilbuis waterpeil peilbesluit waterschap rivierenland" | ||
], | ||
"scheme": null | ||
} | ||
], | ||
"extent": { | ||
"spatial": { | ||
"bbox": [ | ||
[ | ||
[ | ||
4.52767, | ||
51.58673, | ||
6.18931, | ||
52.0393 | ||
] | ||
] | ||
], | ||
"crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84" | ||
}, | ||
"temporal": { | ||
"interval": [ | ||
null, | ||
null | ||
], | ||
"trs": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian" | ||
} | ||
} | ||
}, | ||
"links": [ | ||
{ | ||
"rel": "alternate", | ||
"type": "text/html", | ||
"title": "This document as HTML", | ||
"href": "./meetlocaties.html" | ||
}, | ||
{ | ||
"rel": "alternate", | ||
"type": "application/json", | ||
"title": "This document as an OGC Collection", | ||
"href": "./collection-meetlocaties-example.json" | ||
} | ||
|
||
] | ||
} |
This is a comment regarding the draft Records API, but is there a reason this can't be `id`? The prefix `external` is confusing unless it has some other meaning that's not stated. And I suggest it should be `M`.

Fully agree a core ID should be mandatory. I think that since record inherits from Feature it has an ID field. But it would probably be worth being explicit about that in this table, along with 'links'.