Correctly coerce Parquet Int96 timestamps into requested TimeUnits #1532
Also happy to move the |
I believe that all arrow datatype related conversions are done here, so that is fine. |
Codecov Report
Patch coverage:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1532      +/-   ##
==========================================
- Coverage   83.73%   83.07%   -0.66%
==========================================
  Files         389      389
  Lines       41811    42632     +821
==========================================
+ Hits        35009    35417     +408
- Misses       6802     7215     +413
☔ View full report in Codecov by Sentry. |
Second PR is up! #1533 |
@ritchie46 any chance you could retrigger CI? It seems to be failing on some flaky token issues. |
Thanks for the stamp @sundy-li! I had to make a new PR to fix lints since they were re-enabled last week. Would appreciate some help with launching CI :) |
Merged. I will create a new PR to fix the clippy lints. |
Addresses the first part of issue #1527.
Instead of always naively parsing Parquet Int96 timestamps into `TimeUnit::Nanosecond`, we match on the requested time unit and perform unit-specific parsing.
This makes parsing safer when reading Int96 timestamps that fall outside the range of `timestamp[ns]` (e.g. timestamps with dates in the years 1000 or 3000). The current behavior is to always parse such timestamps with the Nanosecond time unit, which overflows.
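
To illustrate why unit-specific parsing avoids the overflow, here is a minimal sketch and not the code from this PR. It assumes the conventional Int96 layout (nanoseconds within the day in the first 8 bytes, Julian day number in the last 4 bytes), represented as `[u32; 3]`. The names `Unit` and `int96_to_i64` are hypothetical, and returning `None` on overflow is a choice made for this example only.

```rust
/// Julian day number of the Unix epoch (1970-01-01).
const JULIAN_DAY_OF_EPOCH: i64 = 2_440_588;
const SECONDS_PER_DAY: i64 = 86_400;
const NANOS_PER_SECOND: i64 = 1_000_000_000;

/// Hypothetical stand-in for the requested time unit.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

impl Unit {
    /// Number of ticks of this unit in one second.
    fn ticks_per_second(self) -> i64 {
        match self {
            Unit::Second => 1,
            Unit::Millisecond => 1_000,
            Unit::Microsecond => 1_000_000,
            Unit::Nanosecond => NANOS_PER_SECOND,
        }
    }
}

/// Converts an Int96 timestamp directly into the requested unit.
/// Assumed layout: value[0] and value[1] are the low and high words of
/// the nanoseconds within the day, value[2] is the Julian day number.
/// Returns None if the result does not fit in an i64.
fn int96_to_i64(value: [u32; 3], unit: Unit) -> Option<i64> {
    let nanos_of_day = (((value[1] as u64) << 32) | value[0] as u64) as i64;
    let days_since_epoch = value[2] as i64 - JULIAN_DAY_OF_EPOCH;

    let ticks_per_second = unit.ticks_per_second();
    let nanos_per_tick = NANOS_PER_SECOND / ticks_per_second;

    // Scale whole days straight into the target unit, so no intermediate
    // nanosecond value is ever formed for coarser units.
    days_since_epoch
        .checked_mul(SECONDS_PER_DAY)?
        .checked_mul(ticks_per_second)?
        .checked_add(nanos_of_day / nanos_per_tick)
}
```

With this sketch, midnight on 3000-01-01 (Julian day 2,816,788) converts cleanly to seconds or milliseconds, while the same value requested as nanoseconds exceeds the `i64` range and is reported as `None` rather than silently wrapping.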