-
-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
input: Align date-parts and edtf string models #301
Conversation
e8b5c0b
to
5b7239c
Compare
This comment has been minimized.
This comment has been minimized.
So the idea is to modify date-parts to make it better at handling a parsed EDTF string?
As per https://json-schema.org/understanding-json-schema/structuring.html, I think you can use |
Yes; to make them consistent. It's based on the EDTF.js library's modeling, though there what we have as
Thanks, Do you have a view on whether we should? |
Is there any reason this whole file shouldn't go in the CSL Data schema as a variable? Also are users creating CSL JSON Items supposed to use this schema or just processors? |
No; more just that I wanted to see what it would look like without touching
the main schema file.
But if there's support for this, easy enough to add it directly.
|
I just pushed a commit that moves it back into csl-data.json. |
This comment has been minimized.
This comment has been minimized.
schemas/input/csl-data.json
Outdated
}, | ||
"date-range": { | ||
"title": "Date Range", | ||
"description": "An EDTF range is a two-item array. An open end or begin of a range can be represented with a null.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The null option for open currently isn't implemented. Not sure how to do this, or whether this is the best approach.
But If I remove the required "date-parts" on the date object, we can just have an empty object (as below).
I've just pushed this change, but would appreciate feedback. Is this OK? I think it is, so am planning to merge this.
{
"id": "four",
"type": "book",
"title": "The Title",
"issued": [
{ "date-parts": [2000] },
{}
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this or permitting an empty date-parts
array be better? I'm leaning toward this empty object solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK; if I don't hear from any developers, am just going to go with this and figure we can correct it before merging to master.
But I was looking for the most minimal representation of the open item.
I did also consider this:
{
"id": "four",
"type": "book",
"title": "The Title",
"issued": [
{ "date-parts": [2000] },
{ "open": "true" }
]
}
This modifies the structured date object to align with the EDTF model. It moves the qualifiers to the date object, and then defines a date range as an array of these date objects.
This LGTM. We may wish to include some guidance in the processor addendum to the spec about how to process the old date model in CSL 1.1+ (e.g., what to do if |
Co-authored-by: Brenton M. Wiernik <[email protected]>
We'll in general need to figure out that last issue. In a style, for example, what to do with these? issued: 1945-04-05~/1945-04-07 issued: 1945-04-05/1945-04-07~ issued: 1945-04-05~/1945-04-07~ |
I think I should mention this, since I mentioned here: I think date-parts is better represented as an object, as it is in pandoc-citeproc. I can see no advantage at all to the array, given what we're modeling here, other than that it fits with how But unless developers want to argue the case, it seems we should defer to status quo on this. In any case, just to illustrate what I mean, {
"id": "four",
"type": "book",
"title": "The Title",
"issued": {
"year": "1932",
"month": "10",
"day": "22",
"approximate": "true"
}
} But that's an incompatible change of course, and it may not be worth it given the option to deprecate this entirely longer term in favor of the EDTF string. |
Compatibility with edtf.js is a major advantage I think. It's not clear to me how the object format you describe handles date ranges, which I think is the major motivation behind the array structure? |
Easy; the range is still an array. {
"id":"four",
"type":"book",
"title":"The Title",
"issued":[
{
"year":"2012",
"month":"21"
},
{
"year":"2012",
"month":"22"
}
]
} I'm wondering if @inukshuk decided to represent as the |
Oh, I see. You mean that the date itself is an object, whereas the date range is an array. |
Yes, but just to be super clear for anyone coming late to this ... Date is an object in this PR, and date-range is a two-item array of date objects. Currently, and in 1.0, the date-parts property within the date object is also an array. This is what I was focusing on in my latest comments; that this property was not necessary (at least not now), and in hindsight, not the best decision. |
@bdarcus the CSL Overall, I'd say the |
Thanks much @inukshuk. FYI, I just pushed a change (that will fail CI) to remove I can remove the commit if objections. |
This removes the date-parts array and adds year, month, day integer properties.
schemas/input/csl-data.json
Outdated
"type": "book", | ||
"title": "The Title", | ||
"issued": { | ||
"date-parts": [2000, 3, 10] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove date-parts:
"date-parts": [2000, 3, 10] | |
"year": "2000", | |
"months": "3", | |
"day": "10" |
schemas/input/csl-data.json
Outdated
"id": "range-1", | ||
"type": "book", | ||
"title": "The Title", | ||
"issued": [{ "date-parts": [2000] }, { "date-parts": [2001] }] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove date-parts:
"issued": [{ "date-parts": [2000] }, { "date-parts": [2001] }] | |
"issued": [{ "year": "2000" }, { "year": "2001" }] |
schemas/input/csl-data.json
Outdated
"id": "range-2", | ||
"type": "book", | ||
"title": "The Title", | ||
"issued": [{ "date-parts": [2000] }, {}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove date-parts:
"issued": [{ "date-parts": [2000] }, {}] | |
"issued": [{ "year": "2000" }, {}] |
So should we just GTM? |
So long as give processor instructions for handling the old model, LGTM. |
Now tests for a date object without the required year of undefined properties.
This is basically ready, but the CI script is killing me. The test on push passes, but the test on pull doesn't. |
The error at https://github.com/citation-style-language/schema/pull/301/checks?check_run_id=859242336#step:5:38 makes me think the CI run is for the old commit where the code has Perhaps try rerunning CI... maybe some github actions problem where it ran on an old commit |
This turned out to be the issue. I had reran the job, and it had still failed. In any case, thank you! |
Aside: the json schema landscape is pretty frustrating. Things that pass as
valid with some tools, don't with others. So it's hard to know what's
correct.
|
This modifies the structured date object to align with the EDTF model.
It moves the qualifiers to the date object, and then defines a date
range as an array of these date objects.
The result is that dates can either be represented as an EDTF string,
or as a structured object date, or a date range array.
The intention is this structured variant will go away in time, so that
in the future the only option will be the EDTF string.
Closes #300
In CSL JSON 1.0, dates are either a two-level nested array ...
... or a
raw
string, with is a standard EDTF-like date (note, however, that the example isn't compliant because months are not two-digits, as they are in ISO 8601).In CSL JSON 1.1, we move
raw
to an explicit EDTF string option on the property itself,So in this option, the above becomes:
This PR defines:
Also, removes
date-parts
.A single date would be:
... and the first range example becomes this ("approximate" is the EDTF language what we have called "circa"):
Note: open question is how to handle the mismatch between circa on 1.0 and this.
And we can then do things like season ranges (not formally supported by EDTF, but easy for us to add):
This would seem a better long-term that would allow us to painlessly drop the object/array representation at some point in the future.