fix: fix bug in index remapping when plan contained multiple rewrite groups #1415

Merged

Conversation

westonpace
Contributor

A bug would occur when some of the rewrite groups were covered by the index and other rewrite groups were not.
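
The scenario in the description can be illustrated with a toy model. This is a minimal sketch and not Lance's actual implementation: `remap_index_coverage`, the fragment ids, and the idea of tracking coverage as a plain set of fragment ids are all assumptions made for illustration. It shows a plan with two rewrite groups where only one is covered by the index, so only that group should be remapped.

```python
def remap_index_coverage(index_fragments, rewrite_groups):
    """Return an index's fragment coverage after compaction (toy model).

    index_fragments: set of fragment ids the index covers before compaction.
    rewrite_groups: list of (old_fragment_ids, new_fragment_id) pairs,
        one pair per rewrite group in the plan.
    """
    remapped = set(index_fragments)
    for old_ids, new_id in rewrite_groups:
        old = set(old_ids)
        covered = old & index_fragments
        if not covered:
            # This group is not indexed at all: the index is left untouched.
            continue
        if covered != old:
            # Assumed invariant: a group never mixes indexed and
            # unindexed fragments.
            raise ValueError(f"group {sorted(old)} is only partly indexed")
        # Fully covered group: swap the old fragments for the new one.
        remapped -= old
        remapped.add(new_id)
    return remapped


# Hypothetical plan: fragments 0 and 1 are indexed, fragments 2 and 3 are not.
new_coverage = remap_index_coverage({0, 1}, [([0, 1], 4), ([2, 3], 5)])
print(sorted(new_coverage))  # [4]
```

Handling each rewrite group independently is the point of the sketch: the unindexed group is skipped rather than causing the whole remap to fail or to be skipped.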

@github-actions

ACTION NEEDED

Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

# be compacted with themselves since they have many deleted rows.
#
# However, they should not be compacted together because one of
# them is indexed and the other is not.
Contributor

Curious what that would look like if we have many indices on different columns, created from many versions.

Contributor Author

I believe the planner will not combine two fragments unless they are covered by the same indices. So if you have:

Index A: [0, 1, 2, 3]
Index B: [1, 2, 3]
Index C: [3]

Then you will get: [0], [1, 2], [3]
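
The grouping rule above can be sketched in a few lines. This is a hedged illustration of the behaviour described in the comment, not the planner's real code: `group_compatible_fragments` and `coverage_by_fragment` are hypothetical helpers, and the sketch assumes only adjacent fragments are candidates for merging.

```python
def coverage_by_fragment(indices, fragment_ids):
    """Map each fragment id to the frozenset of index names covering it."""
    coverage = {frag: set() for frag in fragment_ids}
    for name, frags in indices.items():
        for frag in frags:
            coverage[frag].add(name)
    return {frag: frozenset(names) for frag, names in coverage.items()}


def group_compatible_fragments(indices, fragment_ids):
    """Group adjacent fragments that are covered by the same set of indices."""
    coverage = coverage_by_fragment(indices, fragment_ids)
    groups = []
    for frag in sorted(fragment_ids):
        # Extend the current group only if the coverage matches exactly.
        if groups and coverage[groups[-1][-1]] == coverage[frag]:
            groups[-1].append(frag)
        else:
            groups.append([frag])
    return groups


# The example from the comment above.
indices = {"A": [0, 1, 2, 3], "B": [1, 2, 3], "C": [3]}
groups = group_compatible_fragments(indices, [0, 1, 2, 3])
print(groups)  # [[0], [1, 2], [3]]
```

Fragment 0 is covered only by A, fragments 1 and 2 by A and B, and fragment 3 by all three, which gives the three groups listed.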

@westonpace westonpace changed the title from "bug: fix bug in index remapping when plan contained multiple rewrite groups" to "fix: fix bug in index remapping when plan contained multiple rewrite groups" on Oct 13, 2023
assert ds.has_index

tbl = create_table(min=0, max=1, nvec=200)
ds = lance.write_dataset(tbl, base_dir, mode="append")
Contributor

Does it compact the non-indexed files into one?

Shall we write new data twice to create two fragments that are not indexed, and verify that they are combined later?

Contributor Author

We can. There are other tests ensuring that two fragments get combined into one (without any indexing being involved). But it's simple enough to make this test more sophisticated. I'll put two fragments on both sides.

Contributor Author

I've updated the test.

Contributor

@eddyxu eddyxu left a comment

LGTM overall. Pending add test coverage.

@westonpace
Contributor Author

LGTM overall. Pending add test coverage.

Will merge when CI passes.

@westonpace westonpace merged commit 92e30e5 into lancedb:main Oct 13, 2023
15 checks passed