This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

Introduce ZOrderCoveringIndex #518

Merged: 4 commits, Dec 23, 2021


@sezruby (Collaborator) commented Dec 15, 2021

What is the context for this pull request?

What changes were proposed in this pull request?

Introduce a new covering index type whose data is stored in Z-order.
The current covering index is bucketed by the indexed columns and sorted within each bucket (a file). For filter queries, a globally sorted dataset can be more efficient than a bucketed, partially sorted one.

However, the benefit of sorting is mostly limited to the first sort column; a query with no condition on that column still has to read all files.
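To make that limitation concrete, here is a minimal sketch in plain Scala (no Spark or Hyperspace involved). It sorts 64 rows by `(a, b)`, splits them into 4 "files", and counts how many files a filter on `b` must read when only per-file min/max statistics are available. The names `Row`, `FileStats`, and `filesToRead`, the 8x8 value grid, and the file size of 16 rows are all assumptions made for illustration.

```scala
object SortedPruningSketch {
  // One row with two indexed columns (names are illustrative only).
  final case class Row(a: Int, b: Int)

  // Per-file min/max statistics for column b.
  final case class FileStats(minB: Int, maxB: Int)

  // 64 rows covering every (a, b) pair with a and b in 0..7.
  val rows: Seq[Row] = for (a <- 0 until 8; b <- 0 until 8) yield Row(a, b)

  // Sort by (a, b), then split into 4 files of 16 rows each.
  val files: Seq[Seq[Row]] = rows.sortBy(r => (r.a, r.b)).grouped(16).toVector

  val stats: Seq[FileStats] =
    files.map(f => FileStats(f.map(_.b).min, f.map(_.b).max))

  // A file can be skipped for `b == value` only if value lies outside [minB, maxB].
  def filesToRead(value: Int): Int =
    stats.count(s => s.minB <= value && value <= s.maxB)
}
```

In this toy layout each file spans the full range of `b`, so `SortedPruningSketch.filesToRead(5)` returns 4: no file can be skipped for a filter on the second sort column.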

A Z-order covering index is essentially a dataset sorted by Z-address, a value derived for each row from the values of its indexed columns. As a result, rows with similar values in any of the indexed columns tend to be colocated in the same file. With a Z-ordered dataset, min/max pruning can skip some of the unnecessary files even when the filter is not on the first indexed column.
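As a rough illustration of the idea (not the implementation in this PR), a Z-address for two non-negative integers can be built by interleaving their bits. The sketch below sorts a toy dataset by that address and counts the files that survive min/max pruning. The helper `zAddress`, the types `Row` and `FileStats`, the 8x8 value grid, and the file size of 16 rows are assumptions for the example; a real index also has to handle other data types, nulls, and value distributions.

```scala
object ZOrderPruningSketch {
  final case class Row(a: Int, b: Int)
  final case class FileStats(minA: Int, maxA: Int, minB: Int, maxB: Int)

  // Hypothetical helper: interleave the low `bits` bits of x and y.
  // Bit i of x goes to position 2*i + 1, bit i of y to position 2*i.
  def zAddress(x: Int, y: Int, bits: Int = 16): Long = {
    require(x >= 0 && y >= 0, "this sketch handles non-negative values only")
    require(bits >= 1 && bits <= 31, "bits must be in 1..31")
    var z = 0L
    var i = 0
    while (i < bits) {
      z |= ((x >> i) & 1).toLong << (2 * i + 1)
      z |= ((y >> i) & 1).toLong << (2 * i)
      i += 1
    }
    z
  }

  // 64 rows covering every (a, b) pair with a and b in 0..7.
  val rows: Seq[Row] = for (a <- 0 until 8; b <- 0 until 8) yield Row(a, b)

  // Sort by Z-address, then split into 4 files of 16 rows each.
  val files: Seq[Seq[Row]] =
    rows.sortBy(r => zAddress(r.a, r.b)).grouped(16).toVector

  val stats: Seq[FileStats] = files.map { f =>
    FileStats(f.map(_.a).min, f.map(_.a).max, f.map(_.b).min, f.map(_.b).max)
  }

  // Files whose [min, max] range for the column contains the filter value.
  def filesToReadForA(value: Int): Int =
    stats.count(s => s.minA <= value && value <= s.maxA)

  def filesToReadForB(value: Int): Int =
    stats.count(s => s.minB <= value && value <= s.maxB)
}
```

In this toy layout each file covers a 4x4 block of the value grid, so both `filesToReadForA(5)` and `filesToReadForB(5)` return 2: a filter on either column alone skips half of the files.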

Usage

hs.createIndex(ZOrderCoveringIndex("indexName", Seq("zOrderCol1", "zOrderCol2"), Seq("includedCol1", "includedCol2")))

Does this PR introduce any user-facing change?

Yes, the PR introduces a new index type.

How was this patch tested?

Unit tests.

@sezruby sezruby requested a review from clee704 December 16, 2021 00:11
@sezruby sezruby self-assigned this Dec 16, 2021
@sezruby sezruby added the enhancement New feature or request label Dec 16, 2021
@sezruby (Collaborator, Author) commented Dec 21, 2021

@clee704 Could you review the PR? I'll update the user documentation, a notebook, and the Python binding in follow-up PRs.

@sezruby sezruby mentioned this pull request Dec 21, 2021
4 tasks

@clee704 clee704 left a comment

LGTM, thanks!

@sezruby sezruby merged commit 1adddf6 into microsoft:master Dec 23, 2021