Minor: distinguish parquet row group pruning type in unit test #8921

Merged 1 commit on Jan 20, 2024

Ted-Jiang
Member

Which issue does this PR close?

Related to #8880. While implementing #8880, many unit tests failed because the tests do not distinguish between parquet row group pruning by statistics and pruning by bloom filter.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Jan 20, 2024
@Ted-Jiang Ted-Jiang requested a review from alamb January 20, 2024 08:21
@Ted-Jiang Ted-Jiang changed the title Minor: distinguish between parquet row group pruning in ut Minor: distinguish parquet row group pruning type in ut Jan 20, 2024
Contributor

@alamb alamb left a comment


Makes sense to me -- thank you @Ted-Jiang

Comment on lines +94 to 102
    test_row_group_prune(
        Scenario::Timestamps,
        "SELECT * FROM t where nanos < to_timestamp('2020-01-02 01:01:11Z')",
        Some(0),
        Some(1),
        Some(0),
        10,
    )
    .await;
Contributor


There are so many parameters here that the tests are getting hard to read, I think (there are three different constants whose meaning has to be remembered).

Maybe we can (as a follow-on PR) make this more self-documenting, something like:

Suggested change
-    test_row_group_prune(
-        Scenario::Timestamps,
-        "SELECT * FROM t where nanos < to_timestamp('2020-01-02 01:01:11Z')",
-        Some(0),
-        Some(1),
-        Some(0),
-        10,
-    )
-    .await;
+    RowGroupPruningTest::new()
+        .with_scenario(Scenario::Timestamps)
+        .with_query("SELECT * FROM t where nanos < to_timestamp('2020-01-02 01:01:11Z')")
+        .with_expected_errors(Some(0))
+        .with_pruned_by_stats(Some(1))
+        .with_pruned_by_bloom_filter(Some(0))
+        .with_expected_rows(10)
+        .await;
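
A minimal sketch of how such a builder might be shaped, to make the idea concrete. Everything here is an assumption for illustration: the field layout, the defaults in `new()`, the one-variant `Scenario` stand-in, and the `check` helper are hypothetical and are not taken from the DataFusion code base. The sketch leaves out query execution and only shows how named setters could replace the positional constants, keeping the statistics count and the bloom filter count separate.

```rust
// Hypothetical sketch: these types only mirror the names used in the
// suggestion above; they are not the actual DataFusion test harness.

/// Stand-in for the harness's scenario enum (only one variant shown).
#[derive(Debug, Clone, PartialEq)]
enum Scenario {
    Timestamps,
}

/// Builder that names each expectation instead of relying on argument order.
#[derive(Debug, Clone, PartialEq)]
struct RowGroupPruningTest {
    scenario: Scenario,
    query: String,
    expected_errors: Option<usize>,
    pruned_by_stats: Option<usize>,
    pruned_by_bloom_filter: Option<usize>,
    expected_rows: usize,
}

impl RowGroupPruningTest {
    /// Start with empty expectations (the defaults are an assumption).
    fn new() -> Self {
        Self {
            scenario: Scenario::Timestamps,
            query: String::new(),
            expected_errors: None,
            pruned_by_stats: None,
            pruned_by_bloom_filter: None,
            expected_rows: 0,
        }
    }

    fn with_scenario(mut self, scenario: Scenario) -> Self {
        self.scenario = scenario;
        self
    }

    fn with_query(mut self, query: &str) -> Self {
        self.query = query.to_string();
        self
    }

    fn with_expected_errors(mut self, errors: Option<usize>) -> Self {
        self.expected_errors = errors;
        self
    }

    fn with_pruned_by_stats(mut self, pruned: Option<usize>) -> Self {
        self.pruned_by_stats = pruned;
        self
    }

    fn with_pruned_by_bloom_filter(mut self, pruned: Option<usize>) -> Self {
        self.pruned_by_bloom_filter = pruned;
        self
    }

    fn with_expected_rows(mut self, rows: usize) -> Self {
        self.expected_rows = rows;
        self
    }

    /// Hypothetical helper: compare the expectations with observed values
    /// and report the first mismatch by name.
    fn check(
        &self,
        errors: Option<usize>,
        by_stats: Option<usize>,
        by_bloom_filter: Option<usize>,
        rows: usize,
    ) -> Result<(), String> {
        let checks = [
            ("errors", self.expected_errors, errors),
            ("pruned_by_stats", self.pruned_by_stats, by_stats),
            ("pruned_by_bloom_filter", self.pruned_by_bloom_filter, by_bloom_filter),
            ("rows", Some(self.expected_rows), Some(rows)),
        ];
        for (name, expected, actual) in checks {
            if expected != actual {
                return Err(format!("{name}: expected {expected:?}, got {actual:?}"));
            }
        }
        Ok(())
    }
}
```

With a shape like this, a mismatch is reported with the name of the metric that differs, so a row group pruned by the bloom filter instead of by statistics is flagged as such rather than as an unexplained positional constant.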

Member Author


Filed #8922, thanks @alamb

@alamb alamb changed the title Minor: distinguish parquet row group pruning type in ut Minor: distinguish parquet row group pruning type in unit test Jan 20, 2024
@Ted-Jiang Ted-Jiang merged commit 95e739c into apache:main Jan 20, 2024
24 checks passed
Labels
core Core DataFusion crate
2 participants