-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix index read error #583
Fix index read error #583
Conversation
Upon transaction, flakes with time types (`xsd:time`, `xsd:date`, `xsd:dateTime`) have their values coerced to `java.time`/`js/Date` objects. When writing index files to disk, we need to write out readable strings for these objects, and then re-coerce them into time objects upon load. Previously we were writing files with unreadable tagged literals in them. Note: This writes out time strings with maximal precision, allowing us to reuse our existing datatype coercion code when loading (time strings with trailing zeros trimmed do not always satisfy our regexes). This makes for a simpler solution, but it does mean potential wasted space in our index files. We can likely optimize this in the future by having more clever formatters and/or having deserialization-specific coercion logic that recognizes truncate d values.
Fixes #580 The built-in java parsers will do the correct thing for subsecond values. We don't need to be doing any manual string-inspecting apart from determining whether an offset is present.
Previously, this value was being written as an `xsd:dateTime`, but that is not a valid `xsd:dateTime` value: https://books.xmlschemata.org/relaxng/ch19-77049.html Apart from just being incorrect, this was causing this one flake to be an anomaly among the `xsd:DateTime` flakes, breaking the assumptions needed for serialization/deserialization.
…d:dateTime` flakes Now that the `commit:time` flake is no longer erroneously of type `xsd:dateTime`, any time-typed flakes we encounter should be standardized to time objects. Therefore, we don't need to check the type, and can just use the formatters directly.
In the course of working on this, I discovered we have the same issue with commit files. I made a separate issue for it here: #584 Once this PR is accepted/merged, then the fix for that one can build off of it. |
|
||
(if-let [offset (peek matches)] | ||
(OffsetDateTime/parse s) | ||
(LocalDateTime/parse s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is certainly a lot simpler! 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📇 I don't recall why I did the LocalDateTime/of approach instead of parse, but if they handle your test cases I think that's an improvement.
Fixes https://github.com/fluree/core/issues/31
Fixes #580
When creating flakes with time-typed values (
xsd:dateTime
,xsd:date
,xsd:time
), we coerce the values intojava.time
/js/Date
objects. This index-reading bug was caused by not serializing those values correctly, resulting in unreadable index files. This PR introduces special serialization/deserialization logic for these types .We were writing out flakes with unreadable tagged literals, eg:
Which was causing parsing errors:
Custom formatters are used during serialization to ensure that our existing
datatype/coerce
code can be reused to deserialize these values upon load. This is because the defaultstr
/.toString()
behavior will trim off trailing zeros, which produces values that are not valid for the given datatype. This is a simple initial implementation, we can potentially optimize in the future with more clever formatters and/or a separate coercion path for deserialization.For #580, I switched coercion for
xsd:time
andxsd:dateTime
in clj to use the java parsing functions directly. Our hard-coded multiplication of nanoseconds was causing errors/incorrect values, whereas the built-in parsing fns will do the correct thing with those values. I'm not sure if there was a reason to not use these parsing fns in the first place, if there's a significant downside I wasn't aware of then we should discuss other alternatives.