Feature Request/Idea: Make sure that for datasets published with CC0 waiver or standard license, metadata exports include the waiver or license #8798
Comments
FWIW: The design discussion around this was specifically to not reference the standard license (or consider the dataset to be using that license) when there are custom terms (any field that appears when you pick custom terms). I think the schema.org example is what was intended, i.e., there is still a For other metadata exports, I think we removed places where CC0 was hardcoded and/or where it wasn't obvious how to map custom terms/the custom license URI into the format. (I think that was noted somewhere as 'future work' but I'm not aware of anyone having taken it up.) W.r.t. specific datasets, in cases where some files are restricted/embargoed and the terms of use/other custom terms are just about the restricted files, it may work to just put those terms in the Terms of Access for Restricted Files field - you could then have a CC0 dataset with those files having the additional Terms of Access. For cases like the one you have here, which has only restricted files, I'm not sure that would actually help - the metadata is already open, so saying the dataset is CC0 when you can't actually get any files under CC0 terms may not make sense. (FWIW: At QDR, we opted for a 'QDR Standard license' that codifies that documentation files are CC0 and other files (that are restricted or embargoed) are subject to QDR's terms.) |
Thanks @qqmyers. I think it might be helpful if I look at cases where a CC0 waiver was applied and text was entered in one of the fields in the "Dataset Terms" accordion, but there are no restricted files. This is kind of just a to-do note for myself. (Of course others who read this issue and would like to review and contribute other use cases should feel free.) One of my concerns is that if people are looking for datasets with CC0 waivers, maybe using search facets (#9060), or in some other way looking for datasets with no or very few use restrictions, datasets like https://doi.org/10.7910/DVN/WHNXKY won't show up. I think the same might be true for datasets with a mix of standard licenses referenced in a "custom license". Will people want to search for datasets whose files have a CC0 waiver, even if some of those datasets' files have other licenses? Would they be able to? Like you said, what's been entered in the Special Permissions field could be moved to the "Terms of Access for Restricted Files" field. Does that mean that any information about data access should be put in the "Terms of Access for Restricted Files" field or one of the other fields in the "Restricted Files + Terms of Access" accordion? Is the Dataverse software's model then that depositors who want to add information about data use should add it in fields in the "Dataset Terms" accordion, and that information about data access should go in the "Restricted Files + Terms of Access" fields? The "Special Permissions" field is one of the fields in the "Dataset Terms" accordion and that field's description is "Determine if any special permissions are required to access a resource (e.g., if form is a needed and where to access the form)". That seems more about data access. The same seems true for the "Confidentiality Declaration" field. To maintain the model, should those fields be moved to the group of fields in the "Terms of Access for Restricted Files" accordion? 
What about the fields in that "Terms of Access for Restricted Files" accordion that mention both data use and access? My other concern is something I've also heard from others in the community, that in 5.10+ installations, when depositors choose a standard license from the new dropdown, there's no longer a place for them to put information that they would normally put in fields in the "Dataset Terms" accordion because those fields disappear. I think @philippconzett made a related comment in some channel recently (a GitHub issue, Slack, Google Group?), though I can't find it now. And I've also heard about related complications from the Harvard Dataverse repo's curation team. I think the community should invest time in reviewing how the functionality is working, and I hope these questions could help provide some direction. |
Julian, thanks for raising this issue. You're right, I have commented on this earlier, but I can't find my comment either. The comment was about the following: When researchers have reused data from other sources and want to publish a dataset that is derived from or builds on these other sources, it is good practice to describe the sources and the Terms of Use or licenses under which they were (re)used. For CC BY licenses, this is even a legal requirement. We're still running version 5.6 of Dataverse, so when choosing a license other than CC0, we add the Terms of Use or the standard license to the Terms of Use field in the Terms of Use tab. In the same field, we add a description of the Terms of Use or licenses under which the different sources were used. See, e.g., this dataset: https://doi.org/10.18710/VMUP44. Once we implement standard license support, we won't be able to add any text to the Terms of Use field anymore. So, the question is where to put this information. The metadata field Data Sources could be an alternative, which we have already used in the Terms of Use clean-up to prepare the implementation of standard license support. But as you say, the community should invest time in agreeing on a best practice for this. |
Investigate, prioritize and size* |
also related: #8796 |
2024/05/08
|
2024/06/20
|
Depends on: #10632. It must be merged first. |
Overview of the Feature Request
Make sure that for datasets published with a CC0 waiver or a standard license, their metadata exports include the waiver or license.
What kind of user is the feature intended for?
Curator, Guest
What inspired the request?
While reviewing the metadata exports of datasets published in repositories running a Dataverse software version that includes the "multiple license" update, I noticed that for certain datasets published before that update, the metadata exports don't include the CC0 waiver or other standard license.
For example, after the multiple license update, datasets that had been published before that update with a CC0 waiver and with metadata entered in one or more "Terms of Use" fields were updated so that their Terms of Use field includes the text "This dataset is made available under a Creative Commons CC0 license with the following additional/modified terms and conditions":
But the OpenAIRE and Schema.org metadata exports of those datasets do not include information about the CC0 waiver:
For comparison, see the OpenAIRE and Schema.org metadata exports of a dataset with a CC0 waiver and nothing entered in any of the "Terms of Use" fields:
OpenAIRE export of dataset with CC0 waiver and no metadata entered in one or more "Terms of Use" fields
Schema.org export of dataset with CC0 waiver and no metadata entered in one or more "Terms of Use" fields
What existing behavior do you want changed?
For datasets that were published with a CC0 waiver (or possibly a CC-BY license for some Dataverse repositories whose default license was CC-BY), include in their metadata exports information about those waivers or licenses. This will become more important as license metadata is included in more exports (e.g. being sent to DataCite (#5889)) and is indexed and made searchable (e.g. #7482).
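As a rough illustration of the check described above, the following sketch tests whether a Schema.org JSON-LD export carries a `license` property. It is not taken from the Dataverse code base: the function name `find_license` and the two sample exports are hypothetical, and the sketch assumes only that the license, when present, appears under the Schema.org `license` key either as a URL string or as an object with a `url` or `text` member.

```python
import json


def find_license(schema_org_export: str):
    """Return the license URL or text found in a Schema.org JSON-LD export, or None.

    Assumption: the license, if exported at all, sits under the top-level
    "license" key as a string or as an object with "url" or "text".
    """
    doc = json.loads(schema_org_export)
    value = doc.get("license")
    if isinstance(value, str):
        return value or None
    if isinstance(value, dict):
        return value.get("url") or value.get("text") or None
    return None


# Hypothetical, trimmed-down exports for illustration only.
with_license = (
    '{"@type": "Dataset", "name": "Example A", '
    '"license": "http://creativecommons.org/publicdomain/zero/1.0"}'
)
without_license = '{"@type": "Dataset", "name": "Example B"}'

for label, export in [("A", with_license), ("B", without_license)]:
    found = find_license(export)
    print(label, found if found else "no license in export")
```

A curator could run a check like this over saved exports to list the datasets whose exports lack license information, which is the situation this issue describes for datasets with a CC0 waiver plus text in the "Terms of Use" fields.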