
Inspire Download Service As Atom

josegar74 edited this page Jan 27, 2015 · 7 revisions
Date: 3 November, 2014
Contacts: Paul van Genuchten, Jose García
Status: Motion passed - Done
Release: 3.0
Resources: Available
Ticket: #666
Source code: PR #667
Funding: GeoNovum, Nordic countries

Overview

INSPIRE requires either a WFS service or OpenSearch (an Atom feed with a link to the download) for downloading datasets. More information can be found in the Technical Guidance for Download Services, version 3.1: http://inspire.jrc.ec.europa.eu/documents/Network_Services/Technical_Guidance_Download_Services_v3.1.pdf

The current OpenSearch implementation in GeoNetwork will be extended to support all features of the INSPIRE download specification. In the current implementation GeoNetwork creates RSS feeds describing the available downloads in the catalogue. This implementation can be extended to provide the same information in the Atom XML format described by INSPIRE.

However, some commented that from a legal point of view GeoNetwork should not create these documents: the provider is the legal owner of the download service and should therefore provide the Atom documents.

We therefore propose a setting in a config-override that selects one of the following use cases:

  • Atom documents are generated from metadata content
  • Atom documents are linked at external sources
  • A mixed option that links to the external document if one is available, and otherwise creates the Atom document dynamically
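The three use cases can be sketched as a small resolver. This is only an illustration: the mode names, the `resolve_atom_url` function and its parameters are hypothetical and not part of GeoNetwork; the `rss.detail` path is the one suggested later in this proposal.

```python
from enum import Enum
from typing import Optional


class AtomMode(Enum):
    """Hypothetical values for the proposed config-override setting."""
    GENERATED = "generated"  # always build the Atom document from metadata
    EXTERNAL = "external"    # always link to the provider's Atom document
    MIXED = "mixed"          # external link if present, else generate


def resolve_atom_url(mode: AtomMode, uuid: str, external_url: Optional[str],
                     base: str = "/geonetwork/srv/eng/rss.detail") -> Optional[str]:
    """Return the Atom URL a record should point to for the given mode."""
    generated = f"{base}?uuid={uuid}"
    if mode is AtomMode.GENERATED:
        return generated
    if mode is AtomMode.EXTERNAL:
        # May be None when the provider has not supplied a feed.
        return external_url
    # MIXED: prefer the provider's document, fall back to the generated one.
    return external_url or generated
```

In the mixed mode a record without an external link falls back to the generated document, so `resolve_atom_url(AtomMode.MIXED, "abc", None)` yields the `rss.detail` URL for `abc`.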

In the second option (external Atom documents) there is still a need to create an OpenSearch interface on the external Atom documents, because quite a few data providers had difficulties providing an OpenSearch interface on these documents themselves. They requested that GeoNetwork be extended to provide the OpenSearch functionality on top of the external Atom documents. The suggested implementation is to harvest the external Atom documents and make them searchable.

Shared functionality for the implementations will have these features:

  • Atom search queries the standard Lucene index (limited by configuration to records that comply with the INSPIRE standard) and presents the results in an Atom document. From this document the individual Atom documents can be accessed.

  • If an iso19119 metadata record identifier is provided in the URL, the search will be limited to this document (the download service) plus all datasets related to it, as specified by the INSPIRE OpenSearch specification

  • For each iso19119 record an OpenSearch Description document should be available listing all the dataset-identifiers available in the Atom feed.

  • "Describe Spatial Data Set"-operation will provide a single Atom document for a dataset (inputs are identifier, language)

  • "Get Spatial Data Set"-operation will provide an attached spatial datafile (inputs are identifier, language, crs)
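As a rough sketch of how one search endpoint could distinguish the operations listed above, the following classifies a request by its query parameters. The parameter names are taken from the OpenSearch templates in this proposal; the function name, and the rule that the presence of `crs` selects "Get Spatial Data Set", are assumptions made for illustration only.

```python
from urllib.parse import parse_qs, urlsplit


def classify_request(url: str) -> dict:
    """Guess which download-service operation a request URL is asking for."""
    # parse_qs returns lists and drops blank values; keep the first value.
    params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
    code = params.get("spatial_dataset_identifier_code")
    namespace = params.get("spatial_dataset_identifier_namespace")

    if not code or not namespace:
        operation = "search"                     # plain index query
    elif "crs" in params:
        operation = "get_spatial_data_set"       # identifier, language, crs
    else:
        operation = "describe_spatial_data_set"  # identifier, language

    return {
        "operation": operation,
        "code": code,
        "namespace": namespace,
        "language": params.get("language"),
        "crs": params.get("crs"),
    }
```

A request carrying only `q=` is treated as a plain search, which matches the shared Atom search behaviour described in the first bullet.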

Technical Details:

In http://trac.osgeo.org/geonetwork/ticket/333 some work was done to introduce OpenSearch. This proposal adds some extra fields (and functionality) to the existing implementation, and/or is implemented as a series of overrides so as not to make the current implementation too complex.

/geonetwork/srv/dut/portal.opensearch This URL opens the OpenSearch Description document. Some extra fields should be added. A filter on an iso19119 uuid should be implemented; if such a filter is provided, a list of all dataset identifiers in this service should be displayed.

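<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/" xmlns:inspire_dls="http://inspire.ec.europa.eu/schemas/inspire_dls/1.0">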
        <ShortName>[AtomServiceFeed:feed.title]</ShortName>
        <Description>[AtomServiceFeed:feed.title]</Description>
        <!--URL of this document-->
        <Url type="application/OpenSearchDescription+xml" rel="self" template="http://nationaalgeoregister.nl/opensearch/[ServiceMetadata:fileIdentifier]/OpenSearchDescription.xml"/>
        <!--Generic URL template for browser integration-->
        <Url type="application/atom+xml" rel="results" template="http://nationaalgeoregister.nl/opensearch/[ServiceMetadata:fileIdentifier]/search?q={searchTerms}"/>

<!-- repeat for each Atom dataset feed -->
<!--Describe Spatial Data Set Operation request URL template to be used
        in order to retrieve the Description of Spatial Object Types in a Spatial
        Dataset-->
        <Url type="application/atom+xml" rel="describedby" template="http://nationaalgeoregister.nl/opensearch/[ServiceMetadata:fileIdentifier]/search?spatial_dataset_identifier_code={inspire_dls:spatial_dataset_identifier_code?}&amp;spatial_dataset_identifier_namespace={inspire_dls:spatial_dataset_identifier_namespace?}&amp;language={language?}&amp;q={searchTerms?}"/>

<!-- repeat step for each attached dataset -->
        <!--Get Spatial Data Set Operation request URL template to be used in
        order to retrieve a Spatial Dataset-->
        <!-- For download of GML files, use this template. -->
        <Url type="application/gml+xml;version=3.2" rel="results" template="http://nationaalgeoregister.nl/opensearch/[ServiceMetadata:fileIdentifier]/search?spatial_dataset_identifier_code={inspire_dls:spatial_dataset_identifier_code?}&amp;spatial_dataset_identifier_namespace={inspire_dls:spatial_dataset_identifier_namespace?}&amp;crs={inspire_dls:crs?}&amp;language={language?}&amp;q={searchTerms?}"/>
        <!-- Format differentiation: if multiple formats are supported (for the same CRS), return an Atom feed containing multiple links. In that case use type="application/atom+xml" with rel="results" for multiple downloadable files. -->
        <Url type="application/atom+xml" rel="results" template="http://nationaalgeoregister.nl/opensearch/[ServiceMetadata:fileIdentifier]/search?spatial_dataset_identifier_code={inspire_dls:spatial_dataset_identifier_code?}&amp;spatial_dataset_identifier_namespace={inspire_dls:spatial_dataset_identifier_namespace?}&amp;crs={inspire_dls:crs?}&amp;language={language?}&amp;q={searchTerms?}"/>
<!-- For each Service Feed, the contact details -->
        <Contact>[AtomServiceFeed:author.name]</Contact>
        <Tags>[ServiceMetadata.Keywords]</Tags>
        <LongName>[AtomServiceFeed:feed.subtitle]</LongName>

<!-- repeat for each dataset and crs -->
	<Query role="example" inspire_dls:spatial_dataset_identifier_namespace="[AtomServiceFeed:feed.entry.inspire_dls:spatial_dataset_identifier_namespace]" inspire_dls:spatial_dataset_identifier_code="[AtomServiceFeed:feed.entry.inspire_dls:spatial_dataset_identifier_code]" inspire_dls:crs="[AtomDatasetFeed:feed.entry.category@term]" language="[AtomDatasetFeed:feed.entry.link[rel=”alternate”]@xml:lang]" title="[AtomDatasetFeed:feed.entry.title]" count="1"/>

<!-- per Atom Service feed / Service metadata record combination: -->
        <Developer>[AtomServiceFeed:author.name]</Developer>
        <!--Languages supported by the service. The first language is the default language-->
        <Language>[AtomServiceFeed:feed.title@xml:lang]</Language>
</OpenSearchDescription>
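To illustrate how a client fills the URL templates in such a description document, here is a minimal sketch of OpenSearch template expansion. The function name and the example values are hypothetical. Parameters marked with a trailing `?` are optional and are replaced by an empty string when no value is given. Note that `&amp;` in the XML becomes a plain `&` once the document is parsed.

```python
import re
from urllib.parse import quote

# Matches {name} and {name?}; the trailing "?" marks an optional parameter.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def expand_template(template: str, values: dict) -> str:
    """Fill an OpenSearch URL template with percent-encoded values."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameter without a value
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


template = (
    "http://example.org/search"
    "?spatial_dataset_identifier_code={inspire_dls:spatial_dataset_identifier_code?}"
    "&crs={inspire_dls:crs?}&q={searchTerms}"
)
print(expand_template(template, {
    "inspire_dls:spatial_dataset_identifier_code": "06b6",
    "searchTerms": "namen kaart",
}))
```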

/geonetwork/srv/dut/rss.search?any= Queries the index and shows results. Some extra fields should be implemented. The link should not reference the iso19115 record in GN but an Atom document describing the dataset. The URL for this could look like:

/geonetwork/srv/eng/rss.detail?uuid={uuid}&lang={lang} This could also become an implementation of the "Describe Spatial Data Set" operation. Note, however, that this operation uses the dataset identifier and namespace, not the metadata identifier.

GN will return a document like:

<feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:georss="http://www.georss.org/georss"
    xmlns:inspire_dls="http://inspire.ec.europa.eu/schemas/inspire_dls/1.0">
    <!-- feed title -->
    <title xml:lang="nl">Demonstratie INSPIRE Download Service 3.0, ATOM - Service Feed</title>
    <!-- feed subtitle -->
    <subtitle xml:lang="nl">INSPIRE Download Service van Geonovum als demonstratie van een Download Service met voorgedefinieerde datasets voor Geografische namen en Administratieve eenheden</subtitle>
    <!-- links to metadata and alternative representations -->
    <link href="http://s01.geonovum.site4u.nl/download/metadata_atom_servicefeed.xml" rel="describedby" type="application/vnd.iso.19139+xml"/>
    <link href="http://s01.geonovum.site4u.nl/download/downloadservice_atom_servicefeed.xml" rel="self" type="application/atom+xml"
        hreflang="nl" title="Demonstratie INSPIRE Download Service 3.0 - Service Feed"/>
    <link rel="search" href="http://s01.geonovum.site4u.nl/download/opensearch_description.xml" type="application/opensearchdescription+xml" title="Open Search Description voor Demonstratie INSPIRE Download Service 3.0, ATOM - Service Feed"/>
    <!-- identifier -->
    <id>http://s01.geonovum.site4u.nl/download/downloadservice_atom_servicefeed.xml</id>
    <!-- rights, access restrictions -->
    <rights>geen toegangsbeperkingen</rights>
    <!-- date/time of last update of feed -->
    <updated>2012-06-18T15:35:06Z</updated>
    <!-- author info -->
    <author>
        <name>Geonovum</name>
        <email>[email protected]</email>
    </author>
    <entry>
        <!-- title for pre-defined dataset -->
        <title xml:lang="nl">Geografische namen (DEMO) NamedPlaces - Parent Feed (CRS)</title>

		<!-- Spatial Dataset Unique Resource Identifier for the dataset -->
		<inspire_dls:spatial_dataset_identifier_code>06b6c650-cdb1-11dd-ad8b-0800200c9a79</inspire_dls:spatial_dataset_identifier_code>
		<!-- Geonovum: the namespace for the code, applicable to the dataset. Further details on this will follow. -->
		<inspire_dls:spatial_dataset_identifier_namespace>http://s01.geonovum.site4u.nl/download</inspire_dls:spatial_dataset_identifier_namespace>
        <link href="http://nationaalgeoregister.nl/geonetwork/srv/nl/iso19139.xml?uuid=81ff84ec-42a4-4481-840b-12713bbb5d38" rel="describedby" type="application/xml"/>
        <!-- Link to the Dataset feed -->
        <link href="http://s01.geonovum.site4u.nl/download/downloadservice_atom_dataset1.xml" rel="alternate" type="application/atom+xml"
            hreflang="nl" title="Geografische namen (DEMO) - Download Service voorgedefinieerde dataset"/>            
        <id>http://s01.geonovum.site4u.nl/download/downloadservice_atom_dataset1.xml</id>
	    <updated>2012-06-18T15:35:04Z</updated>
        <!-- Optional: a summary / description -->
		<summary>Download the dataset Geografische namen (DEMO) NamedPlaces, via this feed</summary>
        <!-- The service feed contains the boundingbox, in polygon format -->
        <georss:polygon>50.7539 3.37087 50.7539 7.21097 53.4658 7.21097 53.4658 3.37087 50.7539 3.37087</georss:polygon>
        <!-- For each entry provide CRS information -->
        <category term="http://www.opengis.net/def/crs/EPSG:4258"
            label="ETRS89"/>
        <category term="http://www.opengis.net/def/crs/EPSG:4326"
            label="WGS84"/>
    </entry>
</feed>
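A consumer of such a service feed mainly needs the dataset identifier, its namespace, the link to the dataset feed and the available CRSs. The following sketch reads those from a feed using Python's standard library; the function name is hypothetical and the sample is an abbreviated copy of the entry above.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "inspire_dls": "http://inspire.ec.europa.eu/schemas/inspire_dls/1.0",
}

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:inspire_dls="http://inspire.ec.europa.eu/schemas/inspire_dls/1.0">
  <entry>
    <title xml:lang="nl">Geografische namen (DEMO)</title>
    <inspire_dls:spatial_dataset_identifier_code>06b6c650-cdb1-11dd-ad8b-0800200c9a79</inspire_dls:spatial_dataset_identifier_code>
    <inspire_dls:spatial_dataset_identifier_namespace>http://s01.geonovum.site4u.nl/download</inspire_dls:spatial_dataset_identifier_namespace>
    <link href="http://s01.geonovum.site4u.nl/download/downloadservice_atom_dataset1.xml"
          rel="alternate" type="application/atom+xml"/>
    <category term="http://www.opengis.net/def/crs/EPSG:4258" label="ETRS89"/>
  </entry>
</feed>"""


def read_service_feed(xml_text: str) -> list:
    """Return one dict per feed entry with the fields a client needs."""
    root = ET.fromstring(xml_text)
    datasets = []
    for entry in root.findall("atom:entry", NS):
        alternate = entry.find("atom:link[@rel='alternate']", NS)
        datasets.append({
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "code": entry.findtext(
                "inspire_dls:spatial_dataset_identifier_code", namespaces=NS),
            "namespace": entry.findtext(
                "inspire_dls:spatial_dataset_identifier_namespace", namespaces=NS),
            "dataset_feed": alternate.get("href") if alternate is not None else None,
            "crs": [c.get("term") for c in entry.findall("atom:category", NS)],
        })
    return datasets


for dataset in read_service_feed(SAMPLE):
    print(dataset["code"], dataset["crs"])
```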

Implementation with harvested ATOM will require additional functionality

Collect ATOM

When the data provider supplies its own Atom document, GeoNetwork should not link to the Atom document generated by the catalogue, but to the document supplied by the data provider. To include the Atom contents in the Lucene index, the Atom document needs to be harvested at regular intervals, similar to a WMS capabilities harvest. An Atom harvest would collect the contents of the Atom feed and store them as a field in the metadata table, so that they can be added to the Lucene index.
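The collect step could look roughly like the sketch below: fetch the provider's feed and flatten its text into one string that can be stored alongside the record and indexed. Both function names are hypothetical, and scheduling, error handling and storage in the metadata table are left out.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_atom(url: str, timeout: float = 30.0) -> bytes:
    """Download the external Atom document (no retries or caching here)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def atom_text_for_index(xml_data) -> str:
    """Flatten all text nodes of a feed into one whitespace-joined string."""
    root = ET.fromstring(xml_data)  # accepts bytes or str
    parts = (text.strip() for text in root.itertext())
    return " ".join(part for part in parts if part)
```

A scheduled job would call `atom_text_for_index(fetch_atom(url))` for each record that links to an external feed and store the result in the field that is fed to the index.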

To verify whether a record links to an external Atom document, a protocol value application/atom-xml was added (this value can be overridden).
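A minimal sketch of that check, assuming the protocol value is stored in `gmd:protocol/gco:CharacterString` of a `gmd:CI_OnlineResource` in an ISO 19139 record. The function name is hypothetical, and the `protocol` argument reflects that the value can be overridden.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def external_atom_links(record_xml, protocol: str = "application/atom-xml") -> list:
    """Return the URLs of online resources tagged with the Atom protocol."""
    root = ET.fromstring(record_xml)
    links = []
    for resource in root.iter("{http://www.isotc211.org/2005/gmd}CI_OnlineResource"):
        found = resource.findtext(
            "gmd:protocol/gco:CharacterString", default="", namespaces=NS)
        url = resource.findtext("gmd:linkage/gmd:URL", default="", namespaces=NS)
        if found.strip() == protocol and url.strip():
            links.append(url.strip())
    return links
```

An empty result means the record has no external Atom document, which in the mixed option would trigger dynamic generation.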

Harvest ATOM

A use case to consider is an Atom harvester that can harvest iso19115 and iso19119 metadata from Atom feeds. Comment by Simon: you could even harvest a WFS service and package it with a GeoNetwork-generated Atom document.

Validate Atom

Before Atom feeds can be collected or harvested, GeoNetwork will probably need to include the Atom XSD.

Display Atom Contents

The Atom link can be referenced from the INSPIRE iso19115 and iso19119 records in the catalogue. We might add a suggestion button here to auto-add the GeoNetwork link, or to add a link to your local server:

<srv:connectPoint><gmd:CI_OnlineResource><gmd:linkage><gmd:URL>/geonetwork/srv/eng/rss.detail?uuid=a3d33-...</gmd:URL></gmd:linkage></gmd:CI_OnlineResource></srv:connectPoint>

An example record can be viewed at: http://www.nationaalgeoregister.nl/geonetwork/srv/nl/iso19139.xml?id=448130

Also, if GN finds such an Atom feed URL in the gmd:url field, the metadata record view could fetch the feed contents and present the datasets linked inside the Atom document as hyperlinks.

Link to Inspire thesaurus

A reference should be made from the Atom feed to a SKOS/RDF thesaurus on the JRC website (http://inspire-registry.jrc.ec.europa.eu/registers/FCD). This thesaurus has a format currently not supported by GeoNetwork (each term is at a separate web location; the central document only has a list of links/identifiers). We might be able to support the format with an upgrade of Sesame. Otherwise we should transform the thesaurus to a readable format. A user should include at least one keyword from this thesaurus in each record that should have an Atom document generated by GeoNetwork. Most probably a new version of the discovery service specification will require a link to this thesaurus anyway.

Other challenges when generating INSPIRE-compliant Atom documents

  • The Atom feed should give some indication of the file size of the download. We might be able to find this with a Java function (if the file resides on the GeoNetwork server). This information can also be recorded in iso19115 (transfersize), but that seems to be a total over all files attached to the record.

  • Multilingual support: how to register the language of the external resource (proposal: gmd:online@xlink:role)

  • Projection (CRS) of the download: GeoNetwork doesn't have "epsg:xxxx" in rs_identifier, and the CRS seems to be registered for all gmd_online
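For the file size challenge, Atom links have an advisory `length` attribute (size in bytes) that can carry this information. The sketch below shows the idea in Python, whereas the proposal mentions a Java function. It assumes the file is stored on the server's local file system; the function name and parameters are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def download_link(href: str, local_path: str, mime_type: str) -> ET.Element:
    """Build an Atom link element whose length is the size of a local file."""
    link = ET.Element(f"{{{ATOM}}}link")
    link.set("rel", "alternate")
    link.set("href", href)
    link.set("type", mime_type)
    # Size in bytes, read from the file system (local files only).
    link.set("length", str(os.path.getsize(local_path)))
    return link
```

Reading the size per file avoids the problem noted above that transfersize appears to be a total for the whole record.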

Proposal Type:

  • Type: Inspire download service improvement
  • Module: Inspire

Voting History

  • Vote Proposed: 03/11/2014
  • +1 from Jose, Francois

Participants

  • Paul van Genuchten
  • Steven Smolders
  • Heikki Doeleman
  • Jose Garcia
  • Thijs Brentjens / Ine de Visser