Introduce ZOrderCoveringIndex #518

sezruby · 2021-12-15T23:34:00Z

What is the context for this pull request?

Tracking Issue: n/a
Parent Issue: [PROPOSAL]: ZOrderCoveringIndex #515
Dependencies: n/a

What changes were proposed in this pull request?

Introduce a new covering index type whose data is Z-ordered.
The current covering index is bucketed by the indexed columns and sorted within each bucket (a file). For filter queries, globally sorted data can be more efficient than a bucketed, partially sorted dataset.

However, sorting usually helps only for the first sort column; if a query has no condition on the first sort column, all files must be read.

A Z-order covering index is essentially a dataset sorted by Z-address, which is derived from the values of the indexed columns in each row. As a result, rows with similar values are colocated within a file. With a Z-ordered dataset, some unnecessary data can be skipped through min/max pruning.
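
To make the pruning argument concrete, here is a minimal, self-contained sketch of Z-ordering on two small integer columns. It is not the code in this PR: `ZOrderSketch`, `zAddress`, and `filesToRead` are hypothetical names, the bit-interleaving order is an assumption, and only non-negative integers are handled. It sorts an 8x8 grid of (x, y) rows two ways, splits each ordering into four "files" of 16 rows, and counts how many files a point filter has to read based on per-file min/max values.

```scala
object ZOrderSketch {
  // Hypothetical Z-address: interleave the low `bits` bits of x and y.
  // Assumption: bit i of x goes to position 2i + 1, bit i of y to position 2i.
  // Only non-negative values are handled in this sketch.
  def zAddress(x: Int, y: Int, bits: Int = 16): Long = {
    var z = 0L
    var i = 0
    while (i < bits) {
      z |= ((x >> i) & 1L) << (2 * i + 1)
      z |= ((y >> i) & 1L) << (2 * i)
      i += 1
    }
    z
  }

  // Count the files whose [min, max] range for the given column contains
  // `value`; these are the files that min/max pruning cannot skip.
  def filesToRead(
      files: Seq[Seq[(Int, Int)]],
      col: ((Int, Int)) => Int,
      value: Int): Int =
    files.count { file =>
      val values = file.map(col)
      values.min <= value && value <= values.max
    }

  // 64 rows covering every (x, y) pair with 0 <= x, y < 8.
  val rows: Seq[(Int, Int)] = for (x <- 0 until 8; y <- 0 until 8) yield (x, y)

  // Four files of 16 rows each, sorted by (x, y) as a plain multi-column sort.
  val lexFiles: Seq[Seq[(Int, Int)]] = rows.sorted.grouped(16).toSeq

  // Four files of 16 rows each, sorted by Z-address.
  val zFiles: Seq[Seq[(Int, Int)]] =
    rows.sortBy { case (x, y) => zAddress(x, y) }.grouped(16).toSeq

  def main(args: Array[String]): Unit = {
    println(s"x = 5, sorted by (x, y): ${filesToRead(lexFiles, _._1, 5)} files")
    println(s"y = 5, sorted by (x, y): ${filesToRead(lexFiles, _._2, 5)} files")
    println(s"x = 5, Z-ordered:        ${filesToRead(zFiles, _._1, 5)} files")
    println(s"y = 5, Z-ordered:        ${filesToRead(zFiles, _._2, 5)} files")
  }
}
```

In this toy layout the plain sort reads 1 of 4 files for a filter on x but all 4 for a filter on y, while the Z-ordered layout reads 2 of 4 files for either column. That is the trade-off described above: slightly less pruning on the first column in exchange for useful pruning on the others.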

Usage

hs.createIndex(df, ZOrderCoveringIndexConfig("indexName", Seq("zOrderCol1", "zOrderCol2"), Seq("includedCol1", "includedCol2")))

Does this PR introduce any user-facing change?

Yes, the PR introduces a new index type.

How was this patch tested?

Unit tests.

sezruby · 2021-12-21T03:17:00Z

@clee704 Could you review the PR? I'll update the user documentation, a notebook, and the Python bindings in follow-up PRs.

src/main/scala/com/microsoft/hyperspace/index/IndexConstants.scala

src/main/scala/com/microsoft/hyperspace/index/zordercovering/ZOrderCoveringIndex.scala

src/main/scala/com/microsoft/hyperspace/index/zordercovering/ZOrderCoveringIndexConfig.scala

src/main/scala/com/microsoft/hyperspace/index/zordercovering/ZOrderFilterIndexRule.scala

clee704

LGTM, thanks!

sezruby added 2 commits December 15, 2021 14:48

Introduce ZOrderCoveringIndex

cd8f23f

review commit

4e6c03c

sezruby mentioned this pull request Dec 15, 2021

Introduce ZOrderCoveringIndex #495

Closed

sezruby requested a review from clee704 December 16, 2021 00:11

sezruby self-assigned this Dec 16, 2021

sezruby added the enhancement New feature or request label Dec 16, 2021

add comment

7fb7bed

sezruby mentioned this pull request Dec 21, 2021

[PROPOSAL]: ZOrderCoveringIndex #515

Open

4 tasks

clee704 reviewed Dec 22, 2021

review commit

a7fd592

clee704 approved these changes Dec 23, 2021

sezruby merged commit 1adddf6 into microsoft:master Dec 23, 2021
