Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IQSS/8533 Update Internal Semantic Mappings #8592

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Apr 9, 2022

What this PR does / why we need it: As discussed in the issues, the default semantic URI assigned to metadatablock terms (datasetFieldType), when no term specific URI is defined for it in the metadatablock, was accidentally dependent on the displayed title of the block, meaning that the identifier changes if the display name is changed. This update fixes that and also simplifies the generation of default term URIs, now relying on the fact that all child terms, like any term, must have a unique name. The mapping is now just to append the datasetFieldType name to the URI assigned to the block. Tests, guides, examples also updated.

Which issue(s) this PR closes:

Special notes for your reviewer: This should be a permanent fix and allow #8454 to be reapplied without breaking tests, etc.
It does include updates to the DatasetFieldType class to enforce uniqueness of the name and uri in the db. This was assumed, and hardcoded in some places already (e.g. uses of the DatasetFieldType.findByName query call getSingleResult and would fail if there were more than one.) I also added a flyway script to add these constraints to existing dbs as well. These constraints aren't required for this PR, but thought they'd be useful and are related. I also did some refactoring to shift the code managing namespace/uri generation to the MetadataBlock and DatasetFieldType classes. There shouldn't be code elsewhere assigning those. That said, if these methods should be in the related service beans, let me know (they're relatively simple and based on the variables in the entities).

Suggestions on how to test this: The automated tests cover the round-trip of export/recreate of datasets via the semantic api. Manually testing the inline examples and example files changed in the PR would make sense.

Does this PR introduce a user interface change? If mockups are available, please link/include them here: No

Is there a release notes update needed for this change?: yes - this is a breaking change for the 'experimental' semantic api, although a fairly minor one (it affects terms that have not been mapped to external vocabularies). It does mean that json-ld retrieved via the semantic api from older versions won't correctly import into new versions without some minor modifications to match the new naming convention.

Additional documentation:

Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay with this.
It does make sense. I'm fine with the methods mentioned in the PR intro being where they are.
I agree it's a good idea to add the constraints to DatasetFieldType. (Are we actually aware of any real life way/reason for an installation to end up with multiple/duplicated entries in normal operation? Or is this just in case? - just curious).

@landreev landreev removed their assignment May 10, 2022
@qqmyers
Copy link
Member Author

qqmyers commented May 11, 2022

FWIW: I think the constraints are just for the hypothetical case right now - it's possible that someone has seen a DatasetFieldType.findByName query error from it and we haven't identified the root cause, but I'm not aware of any existing cases. That said, anyone who creates a new metadata block with something generic like 'subject', 'state', 'series', 'software', etc. as a field title could hit it.

@kcondon kcondon self-assigned this May 11, 2022
@mreekie
Copy link

mreekie commented May 11, 2022

Sprint:

@kcondon
Copy link
Contributor

kcondon commented May 12, 2022

Tests:

  1. semantic api endpoints, including sample create dataset jsonld file [x]
  2. migrate dataset endpoint, including sample migrate dataset jsonld file [x]
  3. check export [x]
  4. apply field name-modified citation block to copy of db, check the above tests again [x]
  5. build guides, check altered pages [x]
  6. check release notes for required action on deployment, alert/discuss as needed [x]

@kcondon kcondon merged commit a4712d8 into IQSS:develop May 13, 2022
@pdurbin
Copy link
Member

pdurbin commented May 13, 2022

Then #8127 goes into the next sprint (Needs a new PR)

I just put this in Next Sprint.

pdurbin added a commit to ErykKul/dataverse that referenced this pull request May 18, 2022
@pdurbin pdurbin added this to the 5.11 milestone Jun 2, 2022
qqmyers added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this pull request Jun 3, 2022
qqmyers added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this pull request Sep 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment