fix: clarify and normalize behavior of Table.rowid
#4991
Merged
Codecov Report
@@            Coverage Diff             @@
##           master    #4991      +/-   ##
==========================================
- Coverage   92.12%   87.67%   -4.45%
==========================================
  Files         210      210
  Lines       23239    23244       +5
  Branches     3241     3242       +1
==========================================
- Hits        21409    20380    -1029
- Misses       1394     2441    +1047
+ Partials      436      423      -13
I think it makes sense to revert the change for the Snowflake backend.
mik-laj
approved these changes
Dec 11, 2022
cpcloud
reviewed
Dec 12, 2022
cpcloud
reviewed
Dec 12, 2022
I agree with the idea, just a question on the implementation.
cpcloud
approved these changes
Dec 12, 2022
In #2345/#2251 a `Table.rowid()` method was added for accessing the `rowid` pseudocolumn in HeavyAI (née Omnisci, née MapD). Somewhere along the way the meaning and implications of this operation were lost, and it came to mean something closer to a monotonically increasing row number (at least in the docs).

None of the existing implementations guarantee a monotonically increasing row number:

- `rowid` in sqlite and heavyai may be monotonically increasing, but deletes may lead to holes. It also only works on physical tables. The value here maps to physical locations in the database storage, supporting fast lookups (provided no changes to the storage have been made in between queries).
- `rowid` in snowflake was added in "feat(snowflake): implement `rowid` scalar" (#4828), but implemented using `seq8()`. This implementation doesn't map to physical storage, and maybe should be removed? Either way, `seq8()` doesn't guarantee a monotonic sequence.

Since the original implementation was asking for something like duckdb/sqlite/heavyai's `rowid`, I'm moving this operation back to the semantics there. Users that want a guaranteed monotonic row number can use `ibis.row_number()` instead.

As such, in this PR we:

`duckdb` implementation

We also might want to delete the Snowflake implementation, since it doesn't provide the same fast indexing/physical storage mapping as `rowid` in duckdb/sqlite/heavyai. cc @mik-laj for thoughts here.
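
To make the distinction between `rowid` and a row number concrete, here is a minimal sketch against SQLite directly, using Python's standard-library `sqlite3` module rather than ibis itself. The table `t` and its `name` column are made up for illustration, and the window function assumes SQLite 3.25 or newer. It shows a delete leaving a hole in `rowid`, while `row_number()` stays gapless and `rowid` remains usable as a lookup key.

```python
import sqlite3

# In-memory database with a small physical table (hypothetical example data).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (name TEXT)")
con.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",), ("c",), ("d",)])

# Deleting a row leaves a hole in the rowid sequence.
con.execute("DELETE FROM t WHERE name = 'b'")

# rowid reflects the storage key of each surviving row.
rowids = [r[0] for r in con.execute("SELECT rowid FROM t ORDER BY rowid")]

# row_number() is computed per query, so it is always gapless.
row_numbers = [
    r[0]
    for r in con.execute(
        "SELECT row_number() OVER (ORDER BY rowid) AS rn FROM t ORDER BY rn"
    )
]

# rowid still works as a direct lookup key for an existing row.
looked_up = con.execute("SELECT name FROM t WHERE rowid = ?", (3,)).fetchone()[0]

print(rowids)       # [1, 3, 4]
print(row_numbers)  # [1, 2, 3]
print(looked_up)    # c
```

In ibis terms, the first query corresponds to what `Table.rowid()` is meant to expose and the second to `ibis.row_number()`, per the description above; the exact SQL each backend generates is not shown here.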