TG2 -INVALID DATE FORMAT #210
Comments
@marcelooyaneder Thanks for raising the issue. We have been working off of the following concepts: DarwinCore eventDate is expected to contain a date in ISO 8601-1 format. ISO 8601-1 allows for specific dates (1880-01-05), dates with reduced precision (1880), and date ranges (1880-01-01/1880-12-31). An extension to ISO allows for explicit uncertainty in dates (1880-??-??), but that isn't within the scope of ISO 8601-1, and thus is not an expected value for dwc:eventDate.
The definitions for dwc:year, dwc:month, dwc:day, dwc:startDayOfYear, and dwc:endDayOfYear (in particular dwc:day) impose constraints on when values should be present in those terms when dwc:eventDate is a specific date, a date range with a precision of a day or better, or a reduced precision date. Our understanding of those expectations is summarized in a table in a comment: #67 (comment).
The test VALIDATION_EVENT_CONSISTENT (with the present location for human readable documentation and the location for the rationale management living at #67) should be able to identify cases where the information in the various Event terms is inconsistent, and has values filled in where they shouldn't be. The test VALIDATION_EVENTDATE_STANDARD #66 should be able to identify when a dwc:eventDate is incorrectly formatted, separating out the concern of invalid formatting of the dwc:eventDate from inconsistency among the date terms. The test AMENDMENT_EVENT_FROM_EVENTDATE #52 should be able to propose changes in cases like yours, such as filling in dwc:year in the example you give. There are some additional relevant tests, but I think, if I am understanding the issue you are describing correctly, that these three tests do, as currently phrased, separate out the concerns you are raising.
If you have a specific example of values in dwc:eventDate, dwc:year, dwc:month, dwc:day, dwc:startDayOfYear, dwc:endDayOfYear, we could include that as a test case and see if the suite of CORE tests produces appropriately informative results, and separates out the concerns as you are doing.
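To make the three accepted forms concrete, here is a minimal Python sketch that sorts a dwc:eventDate string into "specific", "reduced_precision", "range", or "unrecognized". It is only an illustration, not the BDQ or GBIF implementation: the function name `classify_event_date` and the labels are hypothetical, and it covers only plain calendar dates (ISO 8601-1 also permits times, ordinal dates, and week dates, which this sketch would report as "unrecognized").

```python
import re
from datetime import date

# Hypothetical helper for illustration; not part of any BDQ or GBIF library.
# Matches YYYY, YYYY-MM, or YYYY-MM-DD.
_DATE = r"\d{4}(?:-\d{2}(?:-\d{2})?)?"
_SINGLE = re.compile(rf"^{_DATE}$")
_RANGE = re.compile(rf"^({_DATE})/({_DATE})$")


def _is_real(part: str) -> bool:
    """Check that a YYYY, YYYY-MM, or YYYY-MM-DD string names a real calendar date."""
    fields = [int(p) for p in part.split("-")]
    fields += [1] * (3 - len(fields))  # pad a missing month or day with 1
    try:
        date(*fields)
    except ValueError:
        return False
    return True


def classify_event_date(value: str) -> str:
    """Classify an eventDate string by the form of calendar date it holds."""
    value = value.strip()
    if _SINGLE.match(value) and _is_real(value):
        # Ten characters means the full YYYY-MM-DD form.
        return "specific" if len(value) == 10 else "reduced_precision"
    m = _RANGE.match(value)
    if m and _is_real(m.group(1)) and _is_real(m.group(2)):
        return "range"
    return "unrecognized"


print(classify_event_date("1880-01-05"))             # specific
print(classify_event_date("1880"))                   # reduced_precision
print(classify_event_date("1880-01-01/1880-12-31"))  # range
print(classify_event_date("1880-??-??"))             # unrecognized
```

Note that the uncertain date 1880-??-?? falls through to "unrecognized", mirroring the point above that it comes from an extension rather than from ISO 8601-1 itself.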
Hello Paul, first of all, thank you for your prompt response. Secondly, I understand that the tests provide solutions to these problems, so perhaps this is related to GBIF's IPT. I am new to this community, so I still don't fully understand the relationship they have or if they are related at all. Nevertheless, I am attaching an example case from the dataset I mentioned, where the date is in ISO format. The fields dwc:eventDate, dwc:day, dwc:month are complete, but (again, my mistake) I forgot to fill in the dwc:year field. As you mentioned, the AMENDMENT_EVENT_FROM_EVENTDATE mechanism works, but I am still curious why it is marked as 'Recorded date invalid' when it is in the correct format. Could this be related to the VALIDATION_EVENT_CONSISTENT test? As a side note, I published other datasets with the same workflow, and they don't have this issue. Example case: https://www.gbif.org/occurrence/3970615795
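One way to picture the difference between an empty term and an inconsistent one is the small Python sketch below. It is not the actual VALIDATION_EVENT_CONSISTENT logic: the function `event_terms_consistent` is hypothetical, it handles only a single-day eventDate in YYYY-MM-DD form, and it assumes that an empty term is skipped rather than counted as a conflict. The values in the usage lines are illustrative.

```python
from datetime import date
from typing import Optional


def event_terms_consistent(event_date: str,
                           year: Optional[str] = None,
                           month: Optional[str] = None,
                           day: Optional[str] = None) -> bool:
    """Return True unless a populated term contradicts a single-day eventDate.

    Assumption for this sketch: empty terms are skipped, not treated as conflicts.
    """
    parsed = date.fromisoformat(event_date)  # expects YYYY-MM-DD
    expected = {"year": parsed.year, "month": parsed.month, "day": parsed.day}
    provided = {"year": year, "month": month, "day": day}
    for name, raw in provided.items():
        if raw is None or raw.strip() == "":
            continue  # nothing to compare against
        try:
            if int(raw) != expected[name]:
                return False
        except ValueError:
            return False  # a non-numeric value cannot agree with the date
    return True


# Year left empty, month and day filled in: nothing conflicts.
print(event_terms_consistent("2020-01-15", None, "1", "15"))    # True
# A year that disagrees with eventDate is a real inconsistency.
print(event_terms_consistent("2020-01-15", "2019", "1", "15"))  # False
```

Under that assumption, a missing dwc:year is a completeness concern, separate from both the format of dwc:eventDate and the consistency among the terms.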
@marcelooyaneder sorry for chiming in - I remember seeing a similar issue here: gbif/portal-feedback#4464. It seems to be an issue of GBIF interpretation. Here's the blog post on GBIF flags, if that helps.
@marcelooyaneder sorry to be so long getting back again. I concur with @ymgan: the issue is with GBIF's "Recorded date invalid" flag. Your case illustrates some important principles we've tried to follow in developing the bdq test suite.
Addressing each principle with the specifics of your case:
Here is how the event_date_qc library implementation of the pertinent TIME tests (leaving out the start/endDayOfYear tests and most of the amendments) would address the data example you give above:

| Test | response.status | response.value | response.comment |
| --- | --- | --- | --- |
| MEASURE_EVENTDATE_DURATIONINSECONDS | RUN_HAS_RESULT | 86400 | Provided dwc:eventDate [2020-01-15] represents a period of time with a duration of 86400 seconds |
| VALIDATION_EVENT_TEMPORAL_NOTEMPTY | RUN_HAS_RESULT | COMPLIANT | Some value is present in at least one of the Event temporal terms |
| VALIDATION_EVENTDATE_NOTEMPTY | RUN_HAS_RESULT | COMPLIANT | Some value provided for eventDate. |
| VALIDATION_EVENTDATE_INRANGE | RUN_HAS_RESULT | COMPLIANT | Provided value for dwc:eventDate '2020-01-15' falls entirely within the range 1582-11-15 to 2023-12-31. |
| VALIDATION_DAY_INRANGE | RUN_HAS_RESULT | COMPLIANT | Provided value for dwc:day [15] is in the range 1-28 inclusive. |
| VALIDATION_DAY_STANDARD | RUN_HAS_RESULT | COMPLIANT | Provided value for day '15' is an integer in the range 1 to 31. |
| VALIDATION_MONTH_STANDARD | RUN_HAS_RESULT | COMPLIANT | Provided value for month '1' is an integer in the range 1 to 12. |
| VALIDATION_YEAR_NOTEMPTY | RUN_HAS_RESULT | NOT_COMPLIANT | No value provided for dwc:year. |
| VALIDATION_YEAR_INRANGE | INTERNAL_PREREQUISITES_NOT_MET | null | No value provided for dwc:year. |
| VALIDATION_EVENT_CONSISTENT | RUN_HAS_RESULT | COMPLIANT | Values for provided event terms are consistent with each other. |
| AMENDMENT_EVENT_FROM_EVENTDATE | FILLED_IN | {dwc:startDayOfYear=15, dwc:year=2020, dwc:endDayOfYear=15} | Added year [2020] from eventDate [2020-01-15].\|Added startDayOfYear [15] from eventDate [2020-01-15].\|Added endDayOfYear [15] from eventDate [2020-01-15]. |
| VALIDATION_YEAR_NOTEMPTY (with amendment accepted) | RUN_HAS_RESULT | COMPLIANT | Some value provided for dwc:year. |
| VALIDATION_YEAR_INRANGE (with amendment accepted) | RUN_HAS_RESULT | COMPLIANT | Provided value for dwc:year '2020' is an integer in the range 1582 to 2023 (current year). |
That is saying that your dwc:eventDate value is fine, that the provided dwc:month and dwc:day values are fine, that they are consistent with dwc:eventDate, and that you could improve the data by providing values for dwc:year, dwc:startDayOfYear and dwc:endDayOfYear in this case. Use caution in populating dwc:year, dwc:month, dwc:day, dwc:startDayOfYear, and dwc:endDayOfYear if your dwc:eventDate is either a date with coarser precision than one day, or a range that spans more than one day, as some or all of these should not be populated in these cases.
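For readers who want to see what such a fill-in amounts to, the following Python sketch derives proposals for empty terms from a single-day dwc:eventDate. It is an illustration under stated assumptions, not the event_date_qc code: the function `fill_from_event_date` and the dictionary layout are hypothetical, and it deliberately handles only the YYYY-MM-DD case, since the caution above applies to coarser dates and multi-day ranges.

```python
from datetime import date


def fill_from_event_date(record: dict) -> dict:
    """Propose values for empty Event terms from a single-day dwc:eventDate.

    Hypothetical helper: populated terms are never overwritten, and only
    YYYY-MM-DD values are handled (date.fromisoformat raises ValueError
    on anything it cannot parse).
    """
    parsed = date.fromisoformat(record["dwc:eventDate"])
    day_of_year = parsed.timetuple().tm_yday
    derived = {
        "dwc:year": parsed.year,
        "dwc:month": parsed.month,
        "dwc:day": parsed.day,
        "dwc:startDayOfYear": day_of_year,
        "dwc:endDayOfYear": day_of_year,
    }
    # Keep only the terms that are missing or blank in the record.
    return {
        term: str(value)
        for term, value in derived.items()
        if not str(record.get(term) or "").strip()
    }


record = {"dwc:eventDate": "2020-01-15", "dwc:year": "",
          "dwc:month": "1", "dwc:day": "15"}
print(fill_from_event_date(record))
# {'dwc:year': '2020', 'dwc:startDayOfYear': '15', 'dwc:endDayOfYear': '15'}
```

Returning proposals instead of editing the record in place leaves the decision to accept an amendment with the data holder, in the spirit of the "with amendment accepted" rows above.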
…xing cases where comments were blank and tests were evaluating not null instead of not empty.
Hello, I hope you are doing well. Some time ago, I published a database on GBIF, and by mistake, I forgot to fill in the 'year' field. However, the 'eventDate' field complies with the ISO date requirements. I noticed something interesting: the database marks it as an 'invalid date format', even though the format is correct. Instead, it should create a new label, for example, 'incomplete date', since the 'eventDate' field allows publishing incomplete dates (https://dwc.tdwg.org/list/#dwc_eventDate). It might be important to identify this difference.
Greetings, I hope you are all well. I wanted to tell you about something that happened a while ago. I published a database on GBIF and made the mistake of forgetting to fill in the 'year' field, although the 'eventDate' field meets the ISO requirements for dates. I noticed something interesting: the database flags this field as 'invalid date format', even though the format is correct. Instead, perhaps a new flag should be created, for example 'incomplete date', since the 'eventDate' field allows publishing dates that are not fully specified (https://dwc.tdwg.org/list/#dwc_eventDate). I think it would be important to identify this difference.