Opt out of floor division for float dtype time encoding #9497

spencerkclark · 2024-09-14T19:45:44Z

This PR provides a quick fix for #9488.

Previously, for datetime64[ns] and timedelta64[ns] values, we always used floor division to encode times whenever all non-"NaT" values could be represented as integers in the given units, even if the values would ultimately be cast to floats based on the specified dtype. This causes a problem when encoding "NaT" values: the integer placeholder for "NaT" is converted to an ordinary float during that cast, which prevents it from being replaced with the fill value later in the encoding process.
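The failure mode described above can be reproduced with plain NumPy. This is only an illustrative sketch, not xarray's encoding code: the sample dates, the reference date, and the names `deltas_ns`, `floored`, and `floored_as_float` are made up for the example.

```python
import numpy as np

# Hypothetical sample data: one missing value between two valid dates.
times = np.array(["2000-01-01", "NaT", "2000-01-03"], dtype="datetime64[ns]")
reference = np.datetime64("2000-01-01", "ns")
ns_per_day = 86_400 * 10**9  # encode in units of days

# Integer view of the offsets; NaT appears as the minimum int64 value.
deltas_ns = (times - reference).view("int64")

# Floor division stays in integers, so the NaT placeholder becomes an
# ordinary negative integer instead of a missing-value marker.
floored = deltas_ns // ns_per_day

# Casting to float afterwards gives a finite number, not NaN, so a later
# "replace NaN with the fill value" step has nothing to find.
floored_as_float = floored.astype("float64")

print(floored_as_float)
print(np.isnan(floored_as_float))  # no entry is NaN
```

The middle entry ends up as a finite float, which is why a fill-value step that looks for NaN would skip it.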

With this PR, if we know the values will eventually be converted to floats, we use floating point division from the start, which preserves "NaT" values as floating point np.nan values. I update and test the behavior for both datetime64[ns] values and timedelta64[ns] values.
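A minimal sketch of the idea, assuming a simplified setting: `encode_offsets` is a hypothetical helper written for this illustration (it is not a function in xarray), and the fill value of -9999.0 is an arbitrary choice. It picks the kind of division based on the target dtype, so that "NaT" survives as np.nan when the target is a float.

```python
import numpy as np


def encode_offsets(deltas, unit, dtype):
    """Hypothetical helper: express timedelta64 offsets in multiples of `unit`."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        # True division of timedelta64 values returns float64 and maps NaT to nan.
        return (deltas / unit).astype(dtype)
    # Integer target: floor division on the nanosecond integer view.
    # This branch assumes the caller has already dealt with NaT values.
    unit_ns = int(unit / np.timedelta64(1, "ns"))
    deltas_ns = deltas.astype("timedelta64[ns]").view("int64")
    return (deltas_ns // unit_ns).astype(dtype)


times = np.array(["2000-01-01", "NaT", "2000-01-03"], dtype="datetime64[ns]")
deltas = times - np.datetime64("2000-01-01", "ns")

encoded = encode_offsets(deltas, np.timedelta64(1, "D"), "float64")

# Because NaT is now nan, a later fill step can find and replace it.
fill_value = -9999.0  # arbitrary fill value for the example
filled = np.where(np.isnan(encoded), fill_value, encoded)
print(encoded, filled)
```

The same helper applies to timedelta64[ns] inputs directly, since the datetime case is reduced to offsets from a reference date before dividing.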

cc: @kmuehlbauer

Closes #9488 (writing datetime64 in netCDF may produce badly formatted or unreadable files)
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst

kmuehlbauer · 2024-09-15T10:42:57Z

Awesome, really just right around the corner. Thanks so much @spencerkclark!

Opt out of floor division for float dtype time encoding

9485861

spencerkclark mentioned this pull request Sep 14, 2024

writing datetime64 in netCDF may produce badly formatted or unreadable files #9488

Closed

5 tasks

kmuehlbauer approved these changes Sep 15, 2024

spencerkclark mentioned this pull request Sep 15, 2024

Refactor datetime and timedelta encoding for increased robustness #9498

Draft

3 tasks

dcherian merged commit ef42335 into pydata:main Sep 16, 2024
28 checks passed

spencerkclark deleted the fix-9488 branch September 17, 2024 00:25

dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024

Merge branch 'main' into flox-preserve-dtype

d2648bc

* main: Opt out of floor division for float dtype time encoding (pydata#9497) fixed formatting for whats-new (pydata#9493)

hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024

Opt out of floor division for float dtype time encoding (pydata#9497)

e4945c0
