Dcat issue 1526 bis annette proposal #1579

Merged
merged 8 commits into from
Aug 25, 2023
13 changes: 7 additions & 6 deletions dcat/index.html
@@ -5977,14 +5977,15 @@ <h2>DCAT Profiles</h2>
</section>

<section id="security_and_privacy">
<h2>Security and Privacy</h2>
<h2>Security and Privacy Considerations</h2>
<p>
The DCAT vocabulary supports the attribution of data and metadata to various participants such as resource <a href="#Property:resource_creator">creators</a>, <a href="#Property:resource_publisher">publishers</a> and other parties or agents via <a href="#qualified-forms">qualified relations</a>,
and as such defines terms that may be related to personal information. In addition, it also supports the association of <a href="#Property:resource_rights">rights</a> and <a href="#Property:resource_license">licenses</a> with cataloged Resources and Distributions.
These rights and licenses could potentially include or reference sensitive information such as user and asset identifiers as <a data-cite="?ODRL-VOCAB#privacy-consideration">described</a> in [[!ODRL-VOCAB]]. Implementations that produce, maintain, publish or
consume such vocabulary terms must take steps to ensure security and privacy considerations are addressed at the application level.
The DCAT vocabulary supports datasets that may contain personal or private information. In addition, the metadata expressed with DCAT may itself contain personal or private information, such as resource <a href="#Property:resource_creator">creators</a>, <a href="#Property:resource_publisher">publishers</a> and other parties or agents via <a href="#qualified-forms">qualified relations</a>.
Implementers who produce, maintain, publish or consume such vocabulary terms must take steps to ensure security and privacy considerations are addressed. Sensitive data and metadata must be stored securely and made available only to authorized parties, in accordance with the legal and functional requirements of the type of data involved. Detailing how to secure web content and authenticate users is beyond the scope of DCAT.
</p>
<p>DCAT borrows the property <a href="#Property:distribution_checksum"><code>spdx:checksum</code></a> from [[!SPDX]] to ensure the integrity and authenticity of DCAT distributions. It is worth noting that the associated checksum will not provide the expected security protections if the integrity or authenticity of the DCAT metadata is also not guaranteed. Integrity and authenticity of DCAT metadata depend on the trustworthiness of the source. DCAT providers should address integrity and authenticity at the application level. For example, they should ensure the integrity and authenticity of their API and download endpoints and make DCAT metadata files downloadable from authoritative origins. DCAT does not prescribe the manner of generating the checksum. Publishers should provide the necessary detail for the user to reliably calculate the provided hash from the files supplied. Development of a canonical method for the generation of a checksum overlaps with the scope of the <a href="https://www.w3.org/groups/wg/rch">RDF Dataset Canonicalization and Hash Working Group</a>, on which DCAT can build in the future. Moreover, the use of Verifiable Credentials Data Integrity [[?VC-DATA-INTEGRITY]] can be explored.
<p>Some datasets require assurances of integrity and authenticity (for example, data about software vulnerabilities). For these, checksums can serve as a type of verification.
DCAT borrows the <a href="#Class:Checksum"><code>spdx:Checksum</code></a> class from [[!SPDX]] to ensure the integrity and authenticity of DCAT distributions. Publishers may provide a checksum value (a hash) and the algorithm used to generate the hash for each resource in the distribution. A checksum must, however, be provided via a route that is separate from the data it sums. It may be included in metadata that is provided with the data (e.g., a tarfile that includes a file for the distribution and a file for the metadata that includes a checksum for the distribution file), but if so the checksum, or a checksum for the metadata, must also be provided separately to foil an attacker who would manipulate the checksum along with the data. A checksum provided in DCAT metadata will not provide the expected assurances if the integrity and authenticity of the metadata are not also guaranteed.
</p>
<p>Integrity and authenticity of DCAT data ultimately depend on the trustworthiness of the source. DCAT providers should address integrity and authenticity at the application level and transport level. For example, they should ensure the integrity and authenticity of their API and download endpoints, make DCAT data and metadata files downloadable from authoritative HTTPS origins, and provide any checksums via a separate channel from the data they represent.
</p>
</section>
<section id="accessibility">
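For readers of this change, the consumer side of the checksum guidance might look roughly like the sketch below. It is not part of the DCAT text and DCAT does not prescribe it: the function names `file_sha256` and `matches_published_checksum` are hypothetical, SHA-256 is assumed as the algorithm, and the published hash is assumed to have been obtained over a channel separate from the downloaded file, as the new wording requires.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Return the hex-encoded SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large distributions do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a local file's digest with a checksum obtained out of band.

    `published_hex` is assumed to be the hex string the publisher supplied
    separately from the data; whitespace and letter case are normalized.
    """
    expected = published_hex.strip().lower()
    return file_sha256(path) == expected
```

A match only shows that the file corresponds to the published hash. As the revised section notes, it gives no assurance if the checksum itself could have been altered along with the data.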