Support RowFilter in ParquetExec #3360

Closed
Tracked by #3462
thinkharderdev opened this issue Sep 5, 2022 · 6 comments · Fixed by #3380
Labels
enhancement New feature or request

Comments

@thinkharderdev
Contributor


Describe the solution you'd like

arrow-rs has recently added the ability to do row-level filtering while decoding Parquet files. This can dramatically reduce decoding and IO overhead when appropriately selective pruning predicates are pushed down to the table scan. We should support this in ParquetExec.
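For concreteness, here is roughly what the arrow-rs API looks like from the caller's side. This is a minimal, self-contained sketch, not the eventual ParquetExec integration: it assumes the parquet crate with the async feature (plus tokio and futures), a local file `data.parquet` whose first leaf column is Int64, and it keeps rows where that column is greater than 10 while decoding only that column to evaluate the predicate.

```rust
use arrow::array::{BooleanArray, Int64Array};
use futures::TryStreamExt;
use parquet::arrow::arrow_reader::{ArrowPredicateFn, RowFilter};
use parquet::arrow::{ParquetRecordBatchStreamBuilder, ProjectionMask};
use tokio::fs::File;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical input file; anything AsyncRead + AsyncSeek works.
    let file = File::open("data.parquet").await?;
    let builder = ParquetRecordBatchStreamBuilder::new(file).await?;

    // Decode only the first leaf column while evaluating the predicate.
    let schema_descr = builder.metadata().file_metadata().schema_descr();
    let mask = ProjectionMask::leaves(schema_descr, [0]);

    // Keep rows where column 0 (assumed Int64) is > 10.
    let predicate = ArrowPredicateFn::new(mask, |batch| {
        let col = batch
            .column(0)
            .as_any()
            .downcast_ref::<Int64Array>()
            .expect("column 0 should be Int64");
        Ok(col.iter().map(|v| v.map(|v| v > 10)).collect::<BooleanArray>())
    });

    // Rows failing the predicate are skipped before the remaining
    // columns are decoded.
    let stream = builder
        .with_row_filter(RowFilter::new(vec![Box::new(predicate)]))
        .build()?;

    let batches: Vec<_> = stream.try_collect().await?;
    println!("read {} filtered batches", batches.len());
    Ok(())
}
```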

Describe alternatives you've considered

If a ParquetExec has a PruningPredicate, it should be "compiled" to a vector of ArrowPredicates and supplied as a RowFilter to the ParquetRecordBatchStream. We can implement this in a couple of different ways:

  1. Take the pruning Expr, create a PhysicalExpr, and then implement a single ArrowPredicateFn which evaluates it (see the sketch after this list).
  2. Break apart conjunctions in the pruning Expr and compile each to a separate ArrowPredicateFn that will be applied sequentially. We can either take the ordering as given or apply some heuristic to determine the ordering.
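A rough sketch of option 1, assuming a recent DataFusion. `physical_expr_to_predicate` is a hypothetical helper, not an existing API, and it glosses over one important detail: the expression has to be rewritten so its column references resolve against the projected (masked) schema rather than the full file schema.

```rust
use std::sync::Arc;

use arrow::array::BooleanArray;
use arrow::error::ArrowError;
use datafusion::physical_plan::{ColumnarValue, PhysicalExpr};
use datafusion::scalar::ScalarValue;
use parquet::arrow::arrow_reader::{ArrowPredicate, ArrowPredicateFn};
use parquet::arrow::ProjectionMask;

/// Hypothetical helper: wrap a DataFusion PhysicalExpr as an ArrowPredicate.
/// `mask` must select exactly the columns the expression references, and the
/// expression's column indices must match the projected schema.
fn physical_expr_to_predicate(
    expr: Arc<dyn PhysicalExpr>,
    mask: ProjectionMask,
) -> Box<dyn ArrowPredicate> {
    Box::new(ArrowPredicateFn::new(mask, move |batch| {
        match expr.evaluate(&batch) {
            Ok(ColumnarValue::Array(array)) => array
                .as_any()
                .downcast_ref::<BooleanArray>()
                .cloned()
                .ok_or_else(|| {
                    ArrowError::ComputeError("predicate did not evaluate to boolean".into())
                }),
            // A constant predicate: broadcast it across the batch.
            // NULL is treated as false, matching SQL filter semantics.
            Ok(ColumnarValue::Scalar(s)) => {
                let keep = matches!(s, ScalarValue::Boolean(Some(true)));
                Ok(BooleanArray::from(vec![keep; batch.num_rows()]))
            }
            Err(e) => Err(ArrowError::ComputeError(e.to_string())),
        }
    }))
}
```

Option 2 would produce one such predicate per conjunct; RowFilter applies them in order, and each predicate only sees rows that survived the earlier ones, which is what makes the ordering question below matter.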

Some considerations:

  1. We need to be careful to avoid redundant computation. As filter predicates are pushed down in the optimizer, you may end up with a predicate on the scan involving the output of a scalar function and also a projection in a later stage involving the same computation. If you naively execute all filter predicates during the scan, you may end up doing the same work twice.
  2. The filtering is not free. At a minimum you may end up decoding the same data multiple times.
  3. Currently, if users have a custom TableProvider they can control what gets pushed down. Is that enough configurability?
  4. Breaking the predicate into multiple filters can help in some situations if you get the ordering right (cheap filters first, expensive filters after), but if you get it wrong it can make things worse. This may be amplified as we add the ability to parallelize the column decoding (i.e. as the morsel-driven scheduler progresses).


@thinkharderdev thinkharderdev added the enhancement New feature or request label Sep 5, 2022
@thinkharderdev
Contributor Author

@tustvold @Ted-Jiang I've been tinkering with this and can submit an RFC today or tomorrow for feedback. If you already have thoughts, though, we can discuss here :)

@tustvold
Contributor

tustvold commented Sep 5, 2022

100% we should get this integrated 👍, awesome that you're working on this. Some miscellaneous thoughts:

  • I think it must be possible to control which predicates get pushed down to the scan; an expensive predicate may still make sense as a row-group filter but not as a row filter
  • We could restrict the pushed-down predicates to simple binary predicates on dictionary or primitive columns by default
  • We should make visible in the explain plan what is being pushed down to what level
  • We could use the sort order, if any, to inform the push-down order
  • We need benchmarks, lots of benchmarks 😆

@Ted-Jiang
Member

@thinkharderdev Wow! So looking forward to this! 💪

> • I think it must be possible to control which predicates get pushed down to the scan; an expensive predicate may still make sense as a row-group filter but not as a row filter
> • We could restrict the pushed-down predicates to simple binary predicates on dictionary or primitive columns by default
> • We should make visible in the explain plan what is being pushed down to what level
> • We could use the sort order, if any, to inform the push-down order
> • We need benchmarks, lots of benchmarks 😆

Nice write-up! Thanks 👍

One thing I think we should talk about is how to define non-selective (expensive) predicates.
For now, if we want to check whether a predicate is selective on a non-sorted column, we need to know the resulting number of pages, so we need to read the column index. If we filter out zero pages, it will run slower than before. 🤔

@Ted-Jiang
Member

Maybe I will check how Impala did this 😂

@thinkharderdev
Contributor Author

> If we filter out zero pages, it will run slower than before.

This isn't necessarily the case. Even if we don't prune any pages, skipping decoding can still be a pretty significant performance win.

The general problem with selectivity is that we really don't have much to go on at the time we need to build the filters. We have parquet metadata but that isn't much :). I think the approach I'll go with for the draft PR is something like:

  1. Break apart all conjunctions.
  2. Consider only "simple predicates" (binary expressions, is/is not null, is true/false, etc.).
  3. Apply filters on sorted columns first to potentially leverage the page index.
  4. After that, just use total size as the ordering (e.g. expressions which need to read less data go first).

From there we can tweak it to include fancier heuristics (null counts, etc.).
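A rough sketch of steps 1, 3, and 4 over the logical Expr, assuming a recent DataFusion. The real implementation would work on physical expressions and use an actual size estimate; `split_conjunction`, `order_conjuncts`, and the "simple predicate on the sort column first" key are all illustrative.

```rust
use datafusion::logical_expr::{col, lit, BinaryExpr, Expr, Operator};

/// Step 1: split a predicate into its top-level AND conjuncts.
fn split_conjunction(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::BinaryExpr(BinaryExpr { left, op: Operator::And, right }) => {
            split_conjunction(left, out);
            split_conjunction(right, out);
        }
        other => out.push(other.clone()),
    }
}

/// Steps 3-4 (illustrative): simple `col <op> ...` predicates on the sorted
/// column move to the front; the stable sort keeps the given order otherwise.
fn order_conjuncts(mut conjuncts: Vec<Expr>, sort_col: Option<&str>) -> Vec<Expr> {
    if let Some(name) = sort_col {
        conjuncts.sort_by_key(|e| {
            // key is false (sorts first) for binary predicates whose left
            // side is a direct reference to the sort column
            !matches!(
                e,
                Expr::BinaryExpr(BinaryExpr { left, .. })
                    if matches!(left.as_ref(), Expr::Column(c) if c.name == name)
            )
        });
    }
    conjuncts
}

fn main() {
    // b = 'x' AND a > 10, with the file sorted on `a`
    let pred = col("b").eq(lit("x")).and(col("a").gt(lit(10)));
    let mut parts = Vec::new();
    split_conjunction(&pred, &mut parts);
    // `a > 10` now comes first, `b = 'x'` second
    println!("{:?}", order_conjuncts(parts, Some("a")));
}
```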

@alamb
Contributor

alamb commented Sep 5, 2022

Thank you @thinkharderdev @tustvold and @Ted-Jiang for driving this forward
