feat: push all possible filters down to parquet exec #1839

Merged: 4 commits into GreptimeTeam:develop, Jun 28, 2023

Conversation


@v0y4g3r commented Jun 27, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This PR pushes all possible filters down to the Parquet exec to improve scan efficiency. It also coerces the data types of time range predicates to the timestamp type used in the storage schema, which addresses #992.
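
To illustrate why the coercion matters for pruning, here is a minimal, standard-library-only sketch. It is not the code from this PR and uses no GreptimeDB, DataFusion, or Parquet APIs: `TimeUnit`, `Rounding`, `coerce`, `RowGroupStats`, `TimeRange`, and `prune_row_groups` are hypothetical names invented for the example. The assumption is that each row group carries min/max statistics for the time index column in the column's own unit, and that a query's time range may be written in a different unit. The bounds are converted to the column unit first (rounding outward so nothing is pruned wrongly) and only then compared with the statistics.

```rust
// Hypothetical sketch; none of these types are GreptimeDB or DataFusion APIs.

/// Time unit of a timestamp value (assumed to mirror the storage schema).
#[derive(Clone, Copy, Debug, PartialEq)]
enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

impl TimeUnit {
    /// Number of ticks of this unit in one second.
    fn per_second(self) -> i64 {
        match self {
            TimeUnit::Second => 1,
            TimeUnit::Millisecond => 1_000,
            TimeUnit::Microsecond => 1_000_000,
            TimeUnit::Nanosecond => 1_000_000_000,
        }
    }
}

/// Rounding direction used when converting to a coarser unit.
#[derive(Clone, Copy)]
enum Rounding {
    Floor,
    Ceil,
}

/// Convert `value` from unit `from` to unit `to`.
/// Returns `None` if the conversion overflows `i64`.
fn coerce(value: i64, from: TimeUnit, to: TimeUnit, rounding: Rounding) -> Option<i64> {
    let (f, t) = (from.per_second(), to.per_second());
    if t >= f {
        // Finer (or equal) target unit: exact multiplication.
        value.checked_mul(t / f)
    } else {
        // Coarser target unit: divide and round in the requested direction.
        let d = f / t;
        let q = value.div_euclid(d);
        Some(match rounding {
            Rounding::Ceil if value.rem_euclid(d) != 0 => q + 1,
            _ => q,
        })
    }
}

/// Min/max statistics of the time index column for one row group.
struct RowGroupStats {
    min_ts: i64,
    max_ts: i64,
}

/// Half-open predicate `start <= ts < end`, expressed in `unit`.
struct TimeRange {
    start: i64,
    end: i64,
    unit: TimeUnit,
}

/// Return the indices of row groups that may contain matching rows.
fn prune_row_groups(
    groups: &[RowGroupStats],
    column_unit: TimeUnit,
    range: &TimeRange,
) -> Vec<usize> {
    // Round the lower bound down and the upper bound up so that the coerced
    // range never excludes a row the original predicate would accept.
    let (Some(start), Some(end)) = (
        coerce(range.start, range.unit, column_unit, Rounding::Floor),
        coerce(range.end, range.unit, column_unit, Rounding::Ceil),
    ) else {
        // Coercion overflowed: nothing can be proven, so keep every group.
        return (0..groups.len()).collect();
    };
    groups
        .iter()
        .enumerate()
        .filter(|(_, g)| g.max_ts >= start && g.min_ts < end)
        .map(|(i, _)| i)
        .collect()
}
```

For example, with millisecond row groups covering `[0, 999]`, `[1000, 1999]`, and `[2000, 2999]`, a predicate of `1 <= ts < 2` in seconds is coerced to `1000 <= ts < 2000` and only the second group survives. Comparing the raw literals `1` and `2` against millisecond statistics would instead keep only the first group, which is the kind of mismatch that coercion avoids.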

Future work

  • PagePruningPredicate

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

Fixes #992

@v0y4g3r marked this pull request as ready for review on June 27, 2023 06:53

codecov bot commented Jun 27, 2023

Codecov Report

Merging #1839 (78bc5e1) into develop (f287d31) will decrease coverage by 0.24%.
The diff coverage is 94.51%.

❗ Current head 78bc5e1 differs from pull request most recent head 9c4ec61. Consider uploading reports for the commit 9c4ec61 to get more accurate results

@@             Coverage Diff             @@
##           develop    #1839      +/-   ##
===========================================
- Coverage    86.51%   86.28%   -0.24%     
===========================================
  Files          588      589       +1     
  Lines        95768    95988     +220     
===========================================
- Hits         82857    82822      -35     
- Misses       12911    13166     +255     


@killme2008 left a comment

Great work!

Review threads:

  • src/table/src/predicate.rs (resolved)
  • src/storage/src/sst/pruning.rs (resolved)
  • src/servers/src/mysql/handler.rs (outdated, resolved)
  • src/servers/src/mysql/handler.rs (outdated, resolved)
  • src/storage/src/chunk.rs (resolved)
  • src/storage/src/sst/parquet.rs (resolved)
  • src/storage/src/sst/pruning.rs (outdated, resolved)
  • src/storage/src/compaction/writer.rs (resolved)
  • src/storage/src/sst/pruning.rs (resolved)

@evenyag left a comment

Mostly LGTM.

Review threads:

  • src/table/src/predicate.rs (outdated, resolved)
  • src/table/src/predicate.rs (resolved)

@killme2008 left a comment

LGTM

@killme2008 merged commit 559d1f7 into GreptimeTeam:develop on Jun 28, 2023
paomian pushed a commit to paomian/greptimedb that referenced this pull request on Oct 19, 2023:
* feat: push all possible filters down to parquet exec

* fix: project

* test: add ut for DatafusionArrowPredicate

* fix: according to CR comments

Successfully merging this pull request may close these issues.

Row group pruning predicate does not coerce data types.
3 participants