definitions clarifications of conforms to #1130

Closed
bertvannuffelen opened this issue Oct 17, 2019 · 46 comments
Labels
dcat dct:conformsTo due for closing Issue that is going to be closed if there are no objection within 6 days feedback Issues stemming from external feedback to the WG

@bertvannuffelen

Hi,

conforms to appears twice as a property, with the definition "An established standard to which the described resource conforms."

However, it is not clear what conforms to actually refers to: the metadata that is in the catalog, or the information that is described by the metadata.

The usage notes state:

  • cataloged resource: This property SHOULD be used to indicate the model, schema, ontology, view or profile that the cataloged resource content conforms to.
  • catalog record: This property SHOULD be used to indicate the model, schema, ontology, view or profile that the catalog record metadata conforms to.
  • distribution: This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a dataset conforms to. This is (generally) a complementary concern to the media-type or format.

What does that mean for a Catalog, Dataset, or DataService? Are those the rules imposed on the DCAT structure used, e.g. the rules from DCAT-AP (SE)?
Then I come to the distribution case, which is not about the additional constraints on the metadata for the distribution, but about the actual content. E.g. I use the INSPIRE-address model.
That is not the same interpretation of conforms-to.

Suppose I have a harvesting data portal like the European Data Portal; then it would be nice to know which metadata profile the descriptions conform to. Each entity conforms to an EU member state profile. But now I cannot know that for distributions, because there the property describes the ontology used by the actual data.

@rob-metalinkage
Contributor

On a related matter - there is a proposal in issue #1042 (which we just agreed to in the last Conneg-by-AP subgroup meeting) to include an informative appendix in the Conneg-by-AP spec to illustrate how we think DCAT might reference Distributions and DataServices that support Conneg-by-AP.

It would be good to have a critical review of this proposal - using this issue as a motivation for providing clarity.

@riccardoAlbertoni riccardoAlbertoni added this to the DCAT2 ratification milestone Oct 18, 2019
@andrea-perego andrea-perego added the feedback Issues stemming from external feedback to the WG label Oct 20, 2019
@riccardoAlbertoni riccardoAlbertoni added the future-work issue deferred to the next standardization round label Nov 1, 2019
@kcoyle
Contributor

kcoyle commented Nov 1, 2019

@bertvannuffelen I don't know your use case so I can only speak to the intention of the Dublin Core property. The object of the property should always be some standard, schema, or specification that is adhered to by the subject. Obviously this is somewhat open to interpretation, and perhaps the DCAT descriptions are not sufficient to explain what is intended in this context. But I think it is correct to say that it applies to the subject of the triple itself, not a resource that the content of the subject may be describing. So for the DCAT catalog record, the "conforms to" would have as its object the specification of the DCAT catalog record (presumably all or part of the DCAT standard). The distribution should be a data file, and if that data file is defined by a standard or specification it is that standard or specification that should be the object of the property. However, there is no reason why a particular subject cannot have multiple "conforms to" properties if that is what is needed to satisfy certain use cases.

I believe that your last comment:

Suppose I have a harvesting data portal like the European Data Portal; then it would be nice to know which metadata profile the descriptions conform to. Each entity conforms to an EU member state profile. But now I cannot know that for distributions, because there the property describes the ontology used by the actual data.

is what the Content Negotiation by Profile intends to address, although that would require that you use that method for harvesting.

@rob-metalinkage
Contributor

rob-metalinkage commented Nov 1, 2019

This may be a major showstopper for DCAT if @kcoyle's interpretation is correct - it's at odds with the way this could be interpreted: "An established standard to which the described resource conforms."

Each of the DCAT properties tends to describe whether it relates to the dcat class instance or the described resource - and it's a mix.

This appears to be a requirement that hasn't been articulated fully - the ability to identify the DCAT profile in use - and I concur that it's an important one. I think it's important that the DCAT team weighs in here with a consensus position and a proposal to clarify with an example.

@rob-metalinkage rob-metalinkage removed the future-work issue deferred to the next standardization round label Nov 1, 2019
@smrgeoinfo
Contributor

CatalogRecord.conformsTo should identify the 'standard' that the dcat record at CatalogRecord.foaf:PrimaryTopic conforms to (however the metadata producer interprets 'conformance', which hopefully they document somewhere). In the dcat context I would expect this to be some dcat profile.

dcat:Resource.conformsTo should identify the 'standard' that the content of the described resource 'conforms to' (see caveat above). Because there might be multiple distributions, there is some ambiguity here. In the general case I would expect this conformance to be at the level of a conceptual model that all distributions would conform to.

dcat:Distribution.conformsTo should identify the 'standard' that the actual serialization for that distribution 'conforms to'. In general this should be a specification that details syntax conventions particular to a specific serialization of the resource content. If there is a conformsTo on the resource that is being distributed, I would expect the 'standard' for the Distribution to be consistent/compatible with the 'standard' for the Resource.

@kcoyle
Contributor

kcoyle commented Nov 2, 2019

@smrgeoinfo I don't understand this statement:

CatalogRecord.conformsTo should identify the 'standard' that the dcat record at CatalogRecord.foaf:PrimaryTopic conforms to

because I don't see how that could be expressed as an RDF triple. dct:conformsTo can't reify to foaf:primaryTopic via CatalogRecord.foaf:PrimaryTopic. The statement:

ex:catalogRecord1 dct:conformsTo ex:DCATv2

is a statement about the specification of the rules for the catalog record, and only the catalog record. It seems to me that since foaf:primaryTopic has as its object the dcat:Resource that the statements:

ex:resource rdf:type dcat:Resource
ex:resource dct:conformsTo ex:standard1

cover your case. For the other two examples you give I concur.

@makxdekkers
Contributor

@kcoyle As far as I am concerned, the statement ex:catalogRecord1 dct:conformsTo ex:DCATv2 does say something about the set of metadata statements that describe the dataset. That is true for all properties associated with the dcat:CatalogRecord: the title is the title given to the set of metadata statements that describe the dataset, the issued date is the date the metadata for the dataset was entered in the catalogue, and likewise the dct:conformsTo states to which standard the metadata for the dataset conforms. The dcat:CatalogRecord is the metadata for the metadata for the dataset, and all its properties describe the metadata for the dataset.

@makxdekkers
Contributor

The way I see this working is:

If we just have the metadata and the dataset, this is the diagram:
image

The URI identifies the dataset, and the metadata statements describe the dataset identified by the URI.
Now when we add the description of the catalog record, we get:

image

where the metadata for the metadata describes that metadata. In fact, the middle metadata is the catalog record; the dcat:CatalogRecord contains the set of metadata statements that describe the catalog record, the same way that the dcat:Dataset contains the metadata for the dataset, or the dcat:Distribution contains the metadata for the actual digital resource.

@kcoyle
Contributor

kcoyle commented Nov 2, 2019

@makxdekkers What you diagram may be the result of the use of foaf:primaryTopic. I'm not clear on the usage of that, but it seems to say that the catalog record is a document about the resource. If it does, then what Richard says may be the case, however that makes the dct:conformsTo on the dcat:Resource to be redundant with the dct:conformsTo on the dcat:Record, which may be part of my confusion. I was assuming that each entity would conform to a standard for that entity. The definition of dct:conformsTo is: "An established standard to which the described resource conforms" and that is "resource" in the generic sense, not dcat:Resource. Again, I do not know what the reification rules are for foaf:primaryTopic in regards to the entire graph. Here's where one should ping @danbri ;-)

@makxdekkers
Contributor

@kcoyle I don't think this is about reification: the object of foaf:primaryTopic is not the metadata for the dataset, but the dataset. As per the definition at https://w3c.github.io/dxwg/dcat/#Property:record_primary_topic, it is the dcat:Resource (dataset or service) described in the record. So in my diagram, the value of foaf:primaryTopic in the left-hand metadata box is the URI of the dataset, i.e. the URI on the right in the diagram.

@makxdekkers
Contributor

So, the way I see it, the dct:conformsTo in the left-hand metadata says to which standard the metadata in the middle conforms, while the dct:conformsTo in the middle metadata says to which standard the dataset on the right conforms.

@riccardoAlbertoni
Contributor

I agree with the following @makxdekkers's and @smrgeoinfo views:

So, the way I see it, the dct:conformsTo in the left-hand metadata says to which standard the metadata in the middle conforms, while the dct:conformsTo in the middle metadata says to which standard the dataset on the right conforms.

CatalogRecord.conformsTo should identify the 'standard' that the dcat record at CatalogRecord.foaf:PrimaryTopic conforms to (however the metadata producer interprets 'conformance', which hopefully they document somewhere). In the dcat context I would expect this to be some dcat profile.

The DQV example "Express the conformance of a dataset's metadata with a standard" shows how to express compliance to a DCAT profile, and it is coherent with the views expressed above.

@rob-metalinkage: is the DQV example close to the example you are looking for?
In that case, we might add a note in DCAT to refer to it.

@pwin pwin added the future-work issue deferred to the next standardization round label Nov 2, 2019
@pwin
Contributor

pwin commented Nov 2, 2019

I have restored the 'future work' label because we are close to completing this phase of DCAT development (v2). There are several areas of 'future work', so it is clear that this phase of the DCAT development isn't perfect and needs further work. Experience from wider implementation of catalogues compliant with the DCAT v2 standard will give more insight into the issue #1130 problem and lead to a better-quality resolution if, indeed, there is actually an issue here [because the discussion above shows that it is not clear-cut].

@kcoyle
Contributor

kcoyle commented Nov 2, 2019

@riccardoAlbertoni The DQV standard says that the "conforms to" statement relates to the metadata, not to the resource which the metadata describes:

The following example illustrates how a (DCAT) catalog record can be said to be conformant with the GeoDCAT-AP standard itself.

:myDatasetRecord a dcat:CatalogRecord ;
 foaf:primaryTopic :myDataset ;
 dcterms:conformsTo :geoDCAT-AP .

Given that the GeoDCAT-AP is a standard for metadata, not for the resources that are pointed to from a DCAT catalog, then I think this says the opposite of what @makxdekkers and @smrgeoinfo were saying. That DQV statement is consistent with the definition quoted by @bertvannuffelen, which is:

catalog record:
This property SHOULD be used to indicate the model, schema, ontology, view or profile that the catalog record metadata conforms to.

Note that it says "that the catalog record metadata conforms to", not "that the cataloged resource conforms to." So I think that the DCAT definitions are consistent with what is in DQV, but it seems that the interpretation that the three of you are making is different. I honestly don't care which interpretation is given the final approval, but we do need to eliminate the inconsistencies we've surfaced here.

Also, as I pointed out above, if the meaning of:

ex:catrec1 dct:conformsTo ex:standard1
ex:catrec1 foaf:primaryTopic ex:dcatResource

is that ex:catrec1 dct:conformsTo ex:standard1 means that ex:dcatResource conforms to ex:standard1, then there does seem to be a redundancy, because ex:dcatResource dct:conformsTo is effectively stated in two places: in the graph for the Catalog Record and in the graph for the Resource:

ex:catrec1 dct:conformsTo ex:standard1 
ex:catrec1 foaf:primaryTopic ex:dcatResource
ex:dcatResource rdf:type dcat:Resource
ex:dcatResource dct:conformsTo ex:standard1

and one wonders if that redundancy is intentional.
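
The redundancy described above can be made concrete with a minimal Python sketch. It models the four triples as plain tuples and applies a hypothetical "propagation" reading, in which a record's dct:conformsTo is re-applied to the object of foaf:primaryTopic. That reading is an assumption made only for illustration; neither DCAT nor FOAF defines such a rule, and the function name `propagated` is invented here.

```python
CONFORMS_TO = "dct:conformsTo"
PRIMARY_TOPIC = "foaf:primaryTopic"

# The four statements from the comment above, as (subject, predicate, object).
triples = {
    ("ex:catrec1", CONFORMS_TO, "ex:standard1"),
    ("ex:catrec1", PRIMARY_TOPIC, "ex:dcatResource"),
    ("ex:dcatResource", "rdf:type", "dcat:Resource"),
    ("ex:dcatResource", CONFORMS_TO, "ex:standard1"),
}


def propagated(triples):
    """Hypothetical reading: copy each record's conformsTo onto its primary topic."""
    topic_of = {s: o for s, p, o in triples if p == PRIMARY_TOPIC}
    return {
        (topic_of[s], p, o)
        for s, p, o in triples
        if p == CONFORMS_TO and s in topic_of
    }


# Statements that the propagation reading would derive but that are
# already asserted explicitly on the resource.
redundant = propagated(triples) & triples
```

Under this reading, `redundant` contains the explicit statement on ex:dcatResource, which is the duplication in question. Under the reading where dct:conformsTo applies only to its own subject, nothing is derived and no redundancy arises.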

@makxdekkers
Contributor

@kcoyle I think you are misreading this. I think that you view the set of metadata statements that go into the left-hand metadata 'record' as the catalog record. However, as I see it, the catalog record is the set of metadata statements in the middle metadata box, namely the set of metadata statements that describe the dataset. In that sense, dcterms:conformsTo :geoDCAT-AP in the example above says that the set of metadata statements that describe the dataset conforms to the GeoDCAT-AP, which is entirely correct.
In my diagram:
image

Maybe the confusion stems from the foaf:primaryTopic? I see the role of foaf:primaryTopic to provide a shortcut from the metadata about the catalog record to the dataset that the catalog record describes, so the object of foaf:primaryTopic is the same as the subject of the statements associated with the dcat:Dataset.

@riccardoAlbertoni
Contributor

@riccardoAlbertoni The DQV standard says that the "conforms to" statement relates to the metadata, not to the resource which the metadata describes:

The following example illustrates how a (DCAT) catalog record can be said to be conformant with the GeoDCAT-AP standard itself.

:myDatasetRecord a dcat:CatalogRecord ;
 foaf:primaryTopic :myDataset ;
 dcterms:conformsTo :geoDCAT-AP .

Given that the GeoDCAT-AP is a standard for metadata, not for the resources that are pointed to from a DCAT catalog, then I think this says the opposite of what @makxdekkers and @smrgeoinfo were saying.

Sorry, I don't see the contradiction with what @makxdekkers and @smrgeoinfo were saying. But of course, it is possible that I am misreading.

Where you apply the conformsTo depends on what kind of standards you are considering. The rule of thumb should be something like

  • If it is a metadata standard or a profile of a metadata standard, then you have to apply the dct:conformsTo to the dcat:CatalogRecord, which is the place where you speak about the metadata of the metadata of a dataset.
  • If it is a data standard, then you have to apply the dct:conformsTo to the dcat:Dataset, and so on.
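
As a rough sketch of this rule of thumb, the Python function below picks the subject of the dct:conformsTo statement from the kind of standard. The function name, the `kind` labels "metadata" and "data", and the ex: identifiers are all hypothetical; only the two cases listed above are covered.

```python
CONFORMS_TO = "dct:conformsTo"


def conformance_triple(standard, kind, record, dataset):
    """Build a conformsTo triple whose subject depends on the kind of standard.

    kind == "metadata": the standard governs the description, so the
                        subject is the dcat:CatalogRecord.
    kind == "data":     the standard governs the content, so the
                        subject is the dcat:Dataset.
    """
    if kind == "metadata":
        subject = record
    elif kind == "data":
        subject = dataset
    else:
        raise ValueError(f"unknown kind of standard: {kind!r}")
    return (subject, CONFORMS_TO, standard)


# A metadata profile attaches to the record, a data standard to the dataset.
profile_stmt = conformance_triple(
    "ex:geoDCAT-AP", "metadata", "ex:myDatasetRecord", "ex:myDataset"
)
data_stmt = conformance_triple(
    "ex:dataStandard", "data", "ex:myDatasetRecord", "ex:myDataset"
)
```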

I think this is in line with @makxdekkers's statement:

So, the way I see it, the dct:conformsTo in the left-hand metadata says to which standard the metadata in the middle conforms, while the dct:conformsTo in the middle metadata says to which standard the dataset on the right conforms.

@kcoyle
Contributor

kcoyle commented Nov 2, 2019

@riccardoAlbertoni You seem to be saying what I said above, but Richard said:

CatalogRecord.conformsTo should identify the 'standard' that the dcat record at CatalogRecord.foaf:PrimaryTopic conforms to

which sounds to me like it is applying dcat:catalogRecord1 dct:conformsTo to the object of foaf:PrimaryTopic. Could we get this as code, Richard? Because I'm not sure what "the dcat record at CatalogRecord.foaf:PrimaryTopic" means - are you referring to the object of the triple:

CatalogRecord foaf:PrimaryTopic [Richard's dcat record here]

And if Makx is saying what I said at my comment then I don't know why he didn't simply say that he agrees with what I had said. I thought he was making another point.

I said:

ex:catalogRecord1 dct:conformsTo ex:DCATv2

is a statement about the specification of the rules for the catalog record, and only the catalog record.

And Makx replied:

As far as I am concerned, the statement ex:catalogRecord1 dct:conformsTo ex:DCATv2 does say something about the set of metadata statements that describe the dataset.

So, Makx, were you disagreeing with me? Agreeing? What do you mean by "the set of metadata statements that describe the dataset"? Can you give an example in code so that it is clearer?

Let's clarify this with code because this will only be clear if we use precise language, meaning the actual DCAT properties. I get quite confused with things like "left-hand metadata record" and would like to see examples as code rather than high-level diagrams so we can be sure that the diagrams match the properties defined in DCAT and that we're all talking about the same thing.

@makxdekkers
Contributor

@kcoyle Apologies if I misunderstood. I thought you were arguing that ex:catrec1 dct:conformsTo ex:standard1 said something about the dcat:Resource conforming to ex:standard1. If not, I misunderstood.

So maybe we all agree and we are confusing ourselves.

If we go back to the question @bertvannuffelen asked, I think the answer is that

ex:catalogRecordX dct:conformsTo ex:metadataprofileP says that the metadata for the dataset conforms to some metadata profile (e.g. GeoDCAT-AP)

ex:datasetY dct:conformsTo ex:implementingrulesQ says that the dataset conforms to some set of rules (e.g. about how data was collected, what quality process was applied etc.)

ex:distributionZ dct:conformsTo ex:datastructureR says that the data in the distribution conforms to some data structure specification (e.g. SDMX for statistical data)
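
The three statements above can be sketched in Python with triples as plain tuples. The lookup returns only the standards asserted directly on a subject, so nothing is inherited across foaf:primaryTopic or dcat:distribution. The linking triples and the helper `standards_for` are assumptions added for illustration; the ex: names are the hypothetical ones used above.

```python
CONFORMS_TO = "dct:conformsTo"

graph = [
    # The catalog record: its metadata follows a metadata profile.
    ("ex:catalogRecordX", "rdf:type", "dcat:CatalogRecord"),
    ("ex:catalogRecordX", "foaf:primaryTopic", "ex:datasetY"),
    ("ex:catalogRecordX", CONFORMS_TO, "ex:metadataprofileP"),
    # The dataset: its content follows some set of rules.
    ("ex:datasetY", "rdf:type", "dcat:Dataset"),
    ("ex:datasetY", "dcat:distribution", "ex:distributionZ"),
    ("ex:datasetY", CONFORMS_TO, "ex:implementingrulesQ"),
    # The distribution: its serialization follows a data structure spec.
    ("ex:distributionZ", "rdf:type", "dcat:Distribution"),
    ("ex:distributionZ", CONFORMS_TO, "ex:datastructureR"),
]


def standards_for(subject, triples):
    """Return the standards stated directly on `subject`, sorted by name."""
    return sorted(o for s, p, o in triples if s == subject and p == CONFORMS_TO)
```

With this reading, asking for the standards of ex:catalogRecordX yields only the metadata profile, and asking for ex:distributionZ yields only the data structure specification.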

@pwin
Contributor

pwin commented Nov 2, 2019

@makxdekkers - that last set of 3 examples is exactly how I see dct:conformsTo being used

@rob-metalinkage
Contributor

I can live with this interpretation, and if it is communicated then DCAT is not broken, just suboptimal in its explanation.

It does introduce a constraint that all catalog records need to conform to the same set of profiles, and as @kcoyle pointed out, conneg could be used if multiple are possible. Managing heterogeneous catalogs can be considered out of scope or future work.

I also suggest contributing an example for conneg of a DCAT catalog service offering multiple profiles.

@pwin
Contributor

pwin commented Nov 2, 2019

@rob-metalinkage - I think that this sort of issue, as part of 'future work', gets carried forward into the primer doc

@makxdekkers
Contributor

@rob-metalinkage I don't think this means that all catalog records in a catalog need to conform to the same profile. As far as I can see, there is no rule that prohibits two catalog records in the same catalog to conform to different profiles:

ex:catalogRecordX dct:conformsTo ex:metadataprofileP
ex:catalogRecordY dct:conformsTo ex:metadataprofileQ

In practice, catalogs may often contain catalog records that conform to the same profile -- but it is not a requirement.
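
A small Python sketch of such a heterogeneous catalog: grouping records by the profile they declare lets a harvester handle each group with the matching profile. The helper name `records_by_profile` is hypothetical, and the triples are the two statements above modelled as tuples.

```python
from collections import defaultdict

CONFORMS_TO = "dct:conformsTo"

catalog_triples = [
    ("ex:catalogRecordX", CONFORMS_TO, "ex:metadataprofileP"),
    ("ex:catalogRecordY", CONFORMS_TO, "ex:metadataprofileQ"),
]


def records_by_profile(triples):
    """Map each declared profile to the list of records that conform to it."""
    groups = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == CONFORMS_TO:
            groups[obj].append(subject)
    return dict(groups)


profiles = records_by_profile(catalog_triples)
# Two distinct profiles coexist in the same catalog.
```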

@bertvannuffelen
Author

I agree that my examples indicate that addressing DCAT profile conformance might not be obvious and it is hard to estimate its impact. And that there should be more evidence how it might be used in practice.

Leaving it as it is now means also that the decision on how dct:conformsTo is being interpreted will be left to the users of the vocabulary. I think my examples showed that there are different interpretation possibilities.

I follow the approach of @rob-metalinkage of adding more formal guidelines that clarify the impact of specifying DCAT profile requirements.
I believe that the vocabulary should be interpretable by a machine, and for different machines to draw the same conclusion, more information is needed about the scope of the statement dct:conformsTo.

@makxdekkers
Contributor

@bertvannuffelen I certainly understand your point, that the DCAT Recommendation does not say exactly what is and what is not a valid object for dct:conformsTo -- other than that it can be inferred to be a dct:Standard of some sort. If I understand correctly, you would want the object to be something that is machine-actionable, like a SHACL file that a machine can use to perform validation. However, the object of dct:conformsTo may well be some human-readable document; for example, a catalogue in INSPIRE could specify:
ex:cat dct:conformsTo http://data.europa.eu/eli/dir/2007/2/oj and/or
ex:cat dct:conformsTo http://data.europa.eu/eli/reg/2008/1205/oj.

I think you're asking too much from the Recommendation in this respect. It uses the DCMI property and associated class which themselves are defined in rather vague terms. If we really wanted to make it clear what kind of resource should be referenced, maybe we should then define a specific property, e.g. dcat:validationResource as a sub-property of dct:conformsTo. Future work?

For the time being, I would say that the exact usage of the property dct:conformsTo could be specified in an application profile so that for a particular application it is clear what should be expected.

@dr-shorthair
Contributor

Machine actionability all round would be terrific. But the current reality is that there is a huge diversity of expressions of 'standards', and the chances of this converging on a small number of machine-actionable forms are very, very slight. Like @makxdekkers says, this would be a suitable thing for a profile to specify.

@rob-metalinkage
Contributor

I think @bertvannuffelen and @makxdekkers are talking at cross-purposes here - I don't think there is any debate about the rdfs:range of dct:conformsTo - and @dr-shorthair is right in saying that constraining this is a matter for profiles of DCAT.

The semantics issue is about how the property relates to the subject - the rdfs:domain - which dct:conformsTo does not bound.

If the subject is a proxy for the thing (i.e. a non-information resource) - and the metadata graph is an information resource representing the thing (but not the thing itself) - then it seems OK to use the common practice of saying that the statement dct:conformsTo is about the thing being identified - not the representation. In some ways making the CatalogRecord into a separate entity is confusing unless you think of it as one of many possible representations of the underlying thing - its cataloguing metadata.

AFAICT an implementation pattern whereby the dcat:CatalogRecord and dcat:Dataset are both well-known profiles of the identified thing (the real world dataset) and accessible by conneg-by-ap would highlight the subtle differences here, and be consistent with the DCAT spec.

@makxdekkers
Contributor

@rob-metalinkage

I think you make things even more confusing:

If the subject is a proxy for the thing (i.e. a non-information resource) - and the metadata graph is an information resource representing the thing (but not the thing itself) then it seems OK to use the common practice of saying that the statement dct:conformsTo is about the thing being identified - not the representation.

In my opinion, your statement implying that the "thing" is a "non-information resource" is incorrect. The "thing" being described in the metadata statements associated with a dcat:CatalogRecord is the set of metadata statements associated with the dcat:Dataset; see my diagram at #1130 (comment). By the way, as far as I am aware, the terminology of "non-information resource" versus "information resource" was quietly abandoned because it only created more confusion. For example, you seem to imply that a dataset is a non-information resource which I think is contrary to what the people who invented the terminology would say.

I would strongly object to your opinion that it is OK to assume that a statement about a set of metadata statements says something about the thing that the metadata describes. If I say that my name conforms to a naming rule in the country I was born in, i.e. one or more given names and one family name, does that mean I, the person, conform to the naming rule? Of course not.

What matters is the thing the metadata describes, and it doesn't really matter whether that is a physical thing or an idea or a set of metadata statements.

@rob-metalinkage
Contributor

@makxdekkers I have to agree with you that "it only created more confusion": I don't recognise any of my points (or the underlying concerns in httpRange-14) in your re-characterisation of them!

"A dataset" is a conceptual thing, an encoding of a dcat:Dataset entity is an information resource describing it, and a dcat:CatalogRecord is an information resource describing a dcat:Dataset (for example). I don't think you are arguing otherwise.

AFAICT you are not objecting to my opinion - only to your interpretation of my opinion - and I think your interpretation of my opinion is incorrect (that got very "meta"...)

My opinion is that the recommended resolution of the underlying concept of the httpRange-14 issue is valid (I don't agree with TBL's initial oversimplistic premise about # URIs) - and you have not questioned its validity or pointed to an alternative strategy. We can however agree the terminology is confusing. I wish I knew a better way of explaining this, but would rather not proliferate yet more terminology.

It's the URI that matters here. The CatalogRecord needs its own URI - but the Dataset record can be based on some governance process that mints a URI for the conceptual object - or it can exist as a different type of thing - the record of "the dataset" in a particular catalog concept. In the latter case, it is necessary to have an additional layer of identification.

DCAT itself is silent on the uniqueness of URIs for dcat:Dataset objects relative to "the dataset" being described - and I think very different assumptions might be made here, whereas no assumptions can be made safely.

@andrea-perego
Contributor

Reviewing this thread, it seems that some of the issues under discussion have been addressed as explained in #1130 (comment)

As for the ones that may still be open, I propose we address them more specifically by moving the discussion to issues #1211 and #1338 (if in scope) and, if need be, by creating specific issues.

Unless there are any objections, I propose we close this issue.

@andrea-perego andrea-perego added the due for closing Issue that is going to be closed if there are no objection within 6 days label Mar 28, 2021
@riccardoAlbertoni
Contributor

We are closing this issue as proposed above and as a result of tonight's DCAT subgroup meeting
