Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort and finalize and regular write/append handle NaT index values differently #1797

Closed
Tracked by #1679
vasil-pashov opened this issue Aug 28, 2024 · 0 comments · Fixed by #1799
Closed
Tracked by #1679

Sort and finalize and regular write/append handle NaT index values differently #1797

vasil-pashov opened this issue Aug 28, 2024 · 0 comments · Fixed by #1799
Assignees
Labels
bug Something isn't working

Comments

@vasil-pashov
Copy link
Collaborator

vasil-pashov commented Aug 28, 2024

Describe the bug

When the index contains NaT regular append/write throw sorting exception while sort and finalize allows NaT values in the index.

Steps/Code to Reproduce

import arcticdb
import pandas as pd
import numpy as np

ac = arcticdb.Arctic("lmdb://test")
lib = ac.get_library("test", create_if_missing=True)
df1 = pd.DataFrame({"a": [1]}, index=pd.DatetimeIndex([pd.NaT]))
lib.write("sym", df1, staged=True)
lib.sort_and_finalize_staged_data("sym") # Does not throw

lib.write("sym_2", df) # Throws

Expected Results

The two should behave the same way and not allow NaT values in the index

OS, Python Version and ArcticDB Version

Python: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
OS: Windows-10-10.0.22631-SP0
ArcticDB: dev

Backend storage used

No response

Additional Context

No response

@vasil-pashov vasil-pashov added the bug Something isn't working label Aug 28, 2024
vasil-pashov added a commit that referenced this issue Sep 30, 2024
<!--Example: Fixes #1234. See also #3456.-->
Fixes #1738
Fixes #1781
Fixes #1466
Fixes #1795
Fixes #1797
Fixes #1807
Fixes #1828

A notable change is that staged writes no longer validate the index is
sorted. The validation is done at the moment
compact_incompletes/finalize_staged_data/sort_and_finalize_staged_data
is called. This is because sort_and_finalize_staged_data does not
require the segments to be sorted, but the call for adding a staged
segment is the same. We should add a separate call for that.

Note also that all incomplete keys for a symbol are deleted if any of
the finalize calls fail. The other option is to leave the segments. In
that case the user will have the responsibility of calling
`delete_staged_data`.

<details>
  <summary>
   Checklist for code changes...
  </summary>

- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

---------

Co-authored-by: Vasil Pashov <[email protected]>
vasil-pashov added a commit that referenced this issue Sep 30, 2024
<!--Example: Fixes #1234. See also #3456.-->
Fixes #1738
Fixes #1781
Fixes #1466
Fixes #1795
Fixes #1797
Fixes #1807
Fixes #1828

A notable change is that staged writes no longer validate the index is
sorted. The validation is done at the moment
compact_incompletes/finalize_staged_data/sort_and_finalize_staged_data
is called. This is because sort_and_finalize_staged_data does not
require the segments to be sorted, but the call for adding a staged
segment is the same. We should add a separate call for that.

Note also that all incomplete keys for a symbol are deleted if any of
the finalize calls fail. The other option is to leave the segments. In
that case the user will have the responsibility of calling
`delete_staged_data`.

<details>
  <summary>
   Checklist for code changes...
  </summary>

- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

---------

Co-authored-by: Vasil Pashov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant