feat: remap indices on compaction #1403

Merged
merged 3 commits into from
Oct 12, 2023

Conversation

westonpace
Contributor

Currently, after compaction runs, any indexed fragments that were compacted are no longer covered by their index.

This PR fixes that. During compaction we calculate the input and output row ids and then rewrite existing indices so that they cover the newly compacted fragments.
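
As a rough illustration of the "input and output row ids" step, here is a minimal sketch in Python. It is not the code in this PR. It assumes a row id packs the fragment id into the upper 32 bits and the row offset into the lower 32 bits, and that the compacted rows land in one new fragment in their original order; `row_address` and `build_row_id_map` are hypothetical names.

```python
from typing import Dict, Optional, Set


def row_address(fragment_id: int, offset: int) -> int:
    """Pack a fragment id and a row offset into one integer (assumed layout)."""
    return (fragment_id << 32) | offset


def build_row_id_map(
    old_fragments: Dict[int, int],   # fragment id -> physical row count
    deleted: Dict[int, Set[int]],    # fragment id -> deleted row offsets
    new_fragment_id: int,
) -> Dict[int, Optional[int]]:
    """Map every old row id to its new row id, or to None if the row was deleted."""
    mapping: Dict[int, Optional[int]] = {}
    next_offset = 0
    for frag_id in sorted(old_fragments):
        gone = deleted.get(frag_id, set())
        for offset in range(old_fragments[frag_id]):
            old_id = row_address(frag_id, offset)
            if offset in gone:
                # Deleted rows are not written to the compacted fragment.
                mapping[old_id] = None
            else:
                mapping[old_id] = row_address(new_fragment_id, next_offset)
                next_offset += 1
    return mapping
```

The resulting map is the only thing an index needs in order to follow its rows into the new fragment.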

Closes #1378

westonpace and others added 3 commits October 12, 2023 14:37
Added the ability to remap IVF/PQ indices.  This can be used during compaction to notify
the index that row ids have changed.
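
To make the remap idea concrete, here is a hedged sketch of applying an old-to-new row id map to one IVF partition. The `Partition` type and `remap_partition` function are hypothetical and do not reflect the actual index layout. The sketch assumes each partition stores row ids alongside one PQ code per row, and that a map value of `None` means the row was deleted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Partition:
    row_ids: List[int]
    codes: List[bytes]  # one PQ code per row, parallel to row_ids


def remap_partition(part: Partition, mapping: Dict[int, Optional[int]]) -> Partition:
    """Return a copy of the partition with row ids rewritten through the mapping."""
    new_ids: List[int] = []
    new_codes: List[bytes] = []
    for row_id, code in zip(part.row_ids, part.codes):
        if row_id not in mapping:
            # Row lives in a fragment that was not compacted: keep it unchanged.
            new_ids.append(row_id)
            new_codes.append(code)
            continue
        new_id = mapping[row_id]
        if new_id is None:
            # Row was deleted: drop it and its code so the partition shrinks.
            continue
        new_ids.append(new_id)
        new_codes.append(code)
    return Partition(new_ids, new_codes)
```

Only the row ids change; the PQ codes are carried over as-is, so no retraining or re-encoding is involved in this sketch.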

feat: remove tombstones from the IVF remapping process by shrinking the index (#1397)

I also simplify the remap tasks a bit by removing some unnecessary
traits.

During the commit_compaction step (the last part of compaction, after
the fragments have been rewritten) we run an index remapper to remap the
indices. This is needed because the row addresses will have changed as
part of the compaction. We should only remap indices that had rows in
affected fragments.
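
The "only remap indices that had rows in affected fragments" check could look roughly like the sketch below. It assumes each index records the set of fragment ids it covers; `IndexMeta` and `indices_to_remap` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class IndexMeta:
    name: str
    fragment_ids: FrozenSet[int]  # fragments this index covers (assumed metadata)


def indices_to_remap(
    indices: Iterable[IndexMeta],
    compacted_fragment_ids: Iterable[int],
) -> List[IndexMeta]:
    """Keep only the indices that cover at least one compacted fragment."""
    compacted = set(compacted_fragment_ids)
    return [idx for idx in indices if idx.fragment_ids & compacted]
```

An index whose fragments were all left alone is skipped entirely, so it is never rewritten.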

I also removed the "rewrite operation" from the Python bindings. This
was already a fairly complicated operation, and adding rewritten indices
to it would make it even more complicated. I don't think users are going
to be orchestrating compaction themselves anytime soon (and we can add
it back in if needed). Furthermore, the plan compaction capability and
serialization of compaction tasks are a much better solution for anyone
who wants to orchestrate the compaction process.

---------

Co-authored-by: Will Jones <[email protected]>
@westonpace westonpace merged commit 7a645b2 into main Oct 12, 2023
15 checks passed
@westonpace westonpace deleted the feat/remap-indices-on-compaction branch October 12, 2023 23:07
Development

Successfully merging this pull request may close these issues.

Remap row ids during compaction