Single dates exported from ArchivesSpace do not display #1028

Open
gwiedeman opened this issue Mar 1, 2021 · 7 comments
@gwiedeman
Contributor

By default, ArchivesSpace permits three date types: inclusive, bulk, and single (and you can add more local types). Only inclusive and bulk are valid in EAD2002, so single dates export to EAD without a @type.

During indexing, this method parses unitdates and adds them to normalized_date, which is used for display.

Currently, if a component has an inclusive date and a single date, only the inclusive date will be added to normalized_date and the single date will not display to users.
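
To make the symptom concrete, here is a minimal sketch of a display builder that only looks at typed dates. It is a simplified model of the reported behaviour, not Arclight's actual code: the `UnitDate` struct and `display_typed_only` are hypothetical names, and the sample dates are made up.

```ruby
# Simplified model of the reported behaviour; NOT Arclight's actual code.
# Each UnitDate stands in for one <unitdate>; type is nil when @type is absent.
UnitDate = Struct.new(:text, :type)

# Hypothetical display builder that only considers typed dates.
def display_typed_only(dates)
  inclusive = dates.select { |d| d.type == 'inclusive' }.map(&:text)
  bulk      = dates.select { |d| d.type == 'bulk' }.map(&:text)

  label = inclusive.join(', ')
  label += " (bulk #{bulk.join(', ')})" unless bulk.empty?
  label
end

dates = [
  UnitDate.new('1950-1960', 'inclusive'),
  UnitDate.new('1975', nil) # a "single" date exported without @type
]

# The untyped single date never reaches the display string.
puts display_typed_only(dates)
```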

@seanaery
Contributor

seanaery commented Mar 1, 2021

Thanks for reporting, @gwiedeman. We noticed this issue, too, at Duke and made local modifications (esp. to the normalize method):
https://gitlab.oit.duke.edu/dul-its/dul-arclight/-/blob/develop/lib/dul_arclight/normalized_date.rb
https://gitlab.oit.duke.edu/dul-its/dul-arclight/-/blob/develop/spec/lib/dul_arclight/normalized_date_spec.rb

Seems to be working well for us so far, but I imagine there might be variations throughout the community on how dates are treated.

@gwiedeman
Contributor Author

relates to #1336

@gwiedeman gwiedeman self-assigned this Dec 12, 2023
@gwiedeman
Contributor Author

I have this up in single_dates, and @seanaery's fix ensures that all <unitdate>s display, even when some have @type and some do not. However, this doesn't always preserve the order of the dates that some archivists would expect. For example, the fixture I added has a <unitdate> without a @type followed by an undated <unitdate> with @type="inclusive". Since Arclight currently indexes dates into individual fields by @type, the fix appends all the inclusive dates first, then the dates without @type, and bulk dates last. If all your dates have the same @type, or none have a @type, the order is preserved, but not if they're mixed.
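
The reordering can be sketched as follows. This is an illustration only: `grouped_order` is a hypothetical helper that models "group by @type, then concatenate" in the order described above (inclusive, then untyped, then bulk), and the hashes stand in for parsed <unitdate> elements.

```ruby
# Illustrative only: models "index by @type, then concatenate".
# The grouping order (inclusive, untyped, bulk) follows the description above.
def grouped_order(dates)
  inclusive = dates.select { |d| d[:type] == 'inclusive' }
  untyped   = dates.select { |d| d[:type].nil? }
  bulk      = dates.select { |d| d[:type] == 'bulk' }
  (inclusive + untyped + bulk).map { |d| d[:text] }
end

# Mirrors the fixture: an untyped date followed by an undated inclusive date.
mixed = [
  { text: '1975',    type: nil },
  { text: 'undated', type: 'inclusive' }
]
p grouped_order(mixed) # => ["undated", "1975"]

# With a single @type throughout, source order survives.
same = [
  { text: '1990', type: 'inclusive' },
  { text: '1980', type: 'inclusive' }
]
p grouped_order(same) # => ["1990", "1980"]
```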

Changing this to always preserve date order would require reworking how Arclight handles dates. It is debatable how important the order of unitdates is. They do get exported from ASpace in the same order as in the webapp, but it's unclear whether EAD promises to preserve date order. Many archivists do think date order is meaningful and enter dates in the order they expect to be most intuitive for users, and many would expect that order to be maintained.

Relatedly, per #1336, it might be best to eliminate some redundancy in what we are storing and indexing in Solr. If we are maintaining normalized_date_ssm, then normalized_title_ssm might not be necessary, and we could just join title_ssm and normalized_date_ssm in the view templates. I do think it's probably good to have both individual dates and a display date string in the index.

What I'm thinking is to have a unitdate_ssim field that is a list of all dates in order regardless of @type, and a unitdate_label_ssm field that is a parallel list of labels, like "inclusive" or "bulk", with empty strings ("") when there is no @type. During indexing these would also be joined into normalized_title_ssm, similar to how it works now but with different logic. It seems not ideal to rely on empty strings, but Arclight would probably only use normalized_title_ssm, and this would mostly just be to preserve the individual date data in the index. I might be making this more complicated than it needs to be, and there may be a better way to store this in Solr.
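
A rough sketch of that two-field idea, under stated assumptions: the field names unitdate_ssim and unitdate_label_ssm come from the proposal above, but `date_fields`, `display_string`, and the "bulk ..." join format are hypothetical, and this is plain Ruby rather than a traject configuration. It also assumes the index returns both multivalued fields in the order they were written.

```ruby
# Sketch of the proposed parallel fields; helper names and the join
# format are assumptions, not existing Arclight behaviour.
def date_fields(dates)
  {
    'unitdate_ssim'      => dates.map { |d| d[:text] },
    # nil.to_s is "", which gives the empty-string placeholder for no @type
    'unitdate_label_ssm' => dates.map { |d| d[:type].to_s }
  }
end

# Rebuild a display string in source order by pairing each date with its label.
def display_string(doc)
  pairs = doc['unitdate_ssim'].zip(doc['unitdate_label_ssm'])
  pairs.map { |text, label| label == 'bulk' ? "bulk #{text}" : text }.join(', ')
end

doc = date_fields([
  { text: '1975',      type: nil },
  { text: '1950-1960', type: 'inclusive' },
  { text: '1955-1958', type: 'bulk' }
])

p doc['unitdate_label_ssm'] # => ["", "inclusive", "bulk"]
puts display_string(doc)
```

The empty-string placeholder is what keeps the two lists the same length, so position `i` in one field always describes position `i` in the other.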

We could also implement @seanaery's fix either permanently, and not care about order, or temporarily, just to ensure everything displays for now, and make the ordering work into another ticket.

@seanaery
Contributor

Thanks, @gwiedeman. The two links to Duke code I shared on this thread back in 2021 are now broken, but the code now lives here:
https://gitlab.oit.duke.edu/dul-its/dul-arclight/-/blob/main/lib/traject/dul_arclight/normalized_date.rb
https://gitlab.oit.duke.edu/dul-its/dul-arclight/-/blob/main/spec/lib/traject/dul_arclight/normalized_date_spec.rb

From a non-archivist perspective :-) ... what you propose (splitting to two fields, one for the dates and one for the labels) makes good sense to me. And I am always on board with removing unnecessary redundancy.

In the interest of bringing the community sprint in for a landing, you could PR your current branch to get an immediate fix in for the missing dates bug. Then pursue the enhanced refactor for better sequencing as a followup line of work. So if that effort extends beyond this sprint for whatever reason, we will still have made improvements for the impending release.

@dinahhandel

Unless we can confirm that EAD will preserve unit date order, it seems important from both an archivist and user perspective to maintain the unit date order, even if that means a different effort to change how ArcLight handles dates.

@mmmmcode

I agree that preserving date order is important. While "bulk" dates should appear last, single (no "type" attribute) and "inclusive" dates can appear in any order in a series of dates, e.g. 2002-2003, 2021 vs. 2002, 2020-2021 vs. 2002-2021 (bulk 2020-2021). And I can see "undated"/"n.d." date expressions causing trouble, depending on how/if normalized dates are supplied.

@gwiedeman
Contributor Author

Okay, I'm PR-ing the quick fix for now but also leaving this open. I think in practice exports from ASpace (and most tools) into Arclight will preserve <unitdate> order, but I don't think XML promises that, so some tools might not.

Agreed that bulk dates and undated are usually last. Just preserving the date order should allow archivists to set that order in ASpace or whatever tool they use.
