Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: clarify and normalize behavior of Table.rowid #4991

Merged
merged 2 commits into from
Dec 12, 2022

Conversation

jcrist
Copy link
Member

@jcrist jcrist commented Dec 9, 2022

In #2345/#2251 a Table.rowid() method was added, for accessing the rowid pseudocolumn in HeavyAI (née Omnisci, née MapD). Somewhere along the way the meaning and implications behind this operation were lost and altered to mean something closer to mean a monotonically increasing row number (at least in the docs).

All existing implementations of this don't guarantee a monotonically increasing row number:

  • rowid in sqlite and heavyai may be monotonically increasing, but deletes may lead to holes. It also only works on physical tables. The value here maps to physical locations in the database storage, supporting fast lookups (provided no changes to the storage have been made in between queries).
  • rowid in snowflake was added in feat(snowflake): implement rowid scalar #4828, but implemented using seq8(). This implementation doesn't map to physical storage, and maybe should be removed? Either way, seq8() doesn't guarantee a monotonic sequence.

Since the original implementation was asking for something like duckdb/sqlite/heavyai's rowid, I'm moving this operation back to the semantics there. Users that want a guaranteed monotonic row number can use ibis.row_number() instead.

As such, in this PR we:

  • Update the docstring to be more correct
  • Add a duckdb implementation
  • Add an error at expression build time if the backing table isn't a physical table.

We also might want to delete the Snowflake implementation, since it doesn't provide the same fast indexing/physical storage mapping as rowid in duckdb/sqlite/heavyai. cc @mik-laj for thoughts here.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 9, 2022

Test Results

         6 files           6 suites   6m 11s ⏱️
  3 376 tests   3 335 ✔️   41 💤 0
20 206 runs  19 960 ✔️ 246 💤 0

Results for commit 47b01d3.

♻️ This comment has been updated with latest results.

@codecov
Copy link

codecov bot commented Dec 9, 2022

Codecov Report

Merging #4991 (47b01d3) into master (09517a5) will decrease coverage by 4.44%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4991      +/-   ##
==========================================
- Coverage   92.12%   87.67%   -4.45%     
==========================================
  Files         210      210              
  Lines       23239    23244       +5     
  Branches     3241     3242       +1     
==========================================
- Hits        21409    20380    -1029     
- Misses       1394     2441    +1047     
+ Partials      436      423      -13     
Impacted Files Coverage Δ
ibis/backends/duckdb/registry.py 92.92% <ø> (ø)
ibis/backends/snowflake/registry.py 0.00% <ø> (ø)
ibis/expr/operations/generic.py 100.00% <100.00%> (ø)
ibis/expr/types/relations.py 96.50% <100.00%> (+0.02%) ⬆️
ibis/backends/bigquery/version.py 0.00% <0.00%> (-100.00%) ⬇️
ibis/backends/bigquery/rewrites.py 0.00% <0.00%> (-100.00%) ⬇️
ibis/backends/bigquery/operations.py 0.00% <0.00%> (-100.00%) ⬇️
ibis/backends/bigquery/udf/rewrite.py 0.00% <0.00%> (-100.00%) ⬇️
ibis/backends/bigquery/registry.py 0.00% <0.00%> (-97.32%) ⬇️
ibis/backends/bigquery/compiler.py 0.00% <0.00%> (-96.97%) ⬇️
... and 14 more

@mik-laj
Copy link
Contributor

mik-laj commented Dec 11, 2022

I think it makes sense to revert the change for the Snowflake backend.

Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the idea, just a question on the implementation.

@cpcloud cpcloud added this to the 4.0.0 milestone Dec 12, 2022
@cpcloud cpcloud enabled auto-merge (rebase) December 12, 2022 13:21
@cpcloud cpcloud merged commit dd378f1 into ibis-project:master Dec 12, 2022
@jcrist jcrist deleted the cleanup-rowid branch December 13, 2022 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants