feat(optimizer): rewrite predicate and accelerate tpch19 #6301

Merged: 5 commits, Jun 29, 2022

xudong963
Member

@xudong963 xudong963 commented Jun 28, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Rewrite OR predicates so that TPC-H Q19 can be optimized to an inner join. With the rewrite, the plan for Q19 uses an inner hash join on the part key:

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project: [revenue]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|     EvalScalar: [sum(l_extendedprice * 1 - l_discount)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|         Aggregate: group items: [], aggregate functions: [sum(l_extendedprice * 1 - l_discount)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|             EvalScalar: [*(lineitem.l_extendedprice, -(1, lineitem.l_discount))]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                 Filter: [((((((part.p_brand = Brand#52) AND (in(part.p_container, SM CASE, SM BOX, SM PACK, SM PKG))) AND (lineitem.l_quantity >= 4)) AND (lineitem.l_quantity <= 14)) AND (<=(part.p_size, 5))) OR (((((part.p_brand = Brand#11) AND (in(part.p_container, MED BAG, MED BOX, MED PKG, MED PACK))) AND (lineitem.l_quantity >= 18)) AND (lineitem.l_quantity <= 28)) AND (<=(part.p_size, 10)))) OR (((((part.p_brand = Brand#51) AND (in(part.p_container, LG CASE, LG BOX, LG PACK, LG PKG))) AND (lineitem.l_quantity >= 29)) AND (lineitem.l_quantity <= 39)) AND (<=(part.p_size, 15)))] |
|                     HashJoin: INNER, build keys: [part.p_partkey], probe keys: [lineitem.l_partkey], join filters: []                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|                         Filter: [in(lineitem.l_shipmode, AIR, AIR REG), lineitem.l_shipinstruct = DELIVER IN PERSON]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                             Scan: default.default.lineitem                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                         Filter: [>=(part.p_size, 1)]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                             Scan: default.default.part                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.05 sec)
Read 0 rows, 0.00 B in 0.002 sec., 0 rows/sec., 0.00 B/sec.
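
The plan above is consistent with factoring out the conjuncts that every OR branch shares, i.e. rewriting (A AND B) OR (A AND C) into A AND (B OR C), so that the join key and the single-table filters become visible to the optimizer. Below is a minimal Python sketch of that idea, not Databend's implementation: predicates are modelled as plain strings, the helper name `extract_common_conjuncts` is hypothetical, and the Q19 branches are abbreviated.

```python
from typing import List, Optional, Tuple

Conjuncts = List[str]  # one OR branch, as a list of AND-ed predicate strings


def extract_common_conjuncts(
    branches: List[Conjuncts],
) -> Tuple[Conjuncts, Optional[List[Conjuncts]]]:
    """Rewrite (A AND B) OR (A AND C) into A AND (B OR C).

    Returns (common, residual). `common` holds the predicates present in
    every branch. `residual` holds what is left of each branch, or None when
    some branch has nothing left, because A OR (A AND C) simplifies to A.
    """
    if not branches:
        return [], None
    # Keep the order of the first branch so the result is deterministic.
    common = [p for p in branches[0] if all(p in b for b in branches[1:])]
    residual = [[p for p in b if p not in common] for b in branches]
    if any(len(b) == 0 for b in residual):
        return common, None
    return common, residual


# Abbreviated shape of TPC-H Q19 (illustrative only): every OR branch repeats
# the join condition and the lower bound on p_size.
q19_branches = [
    ["p_partkey = l_partkey", "p_brand = 'Brand#52'", "p_size >= 1", "p_size <= 5"],
    ["p_partkey = l_partkey", "p_brand = 'Brand#11'", "p_size >= 1", "p_size <= 10"],
    ["p_partkey = l_partkey", "p_brand = 'Brand#51'", "p_size >= 1", "p_size <= 15"],
]

common, residual = extract_common_conjuncts(q19_branches)
print("common:", common)
print("residual:", residual)
```

In this sketch the shared `p_partkey = l_partkey` can serve as an equi-join key and `p_size >= 1` can be pushed down to the `part` scan, while the per-branch residual stays as an OR filter above the join, which matches the shape of the plan shown above.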

Changelog

  • New Feature

Related Issues

Fixes #6096

@xudong963 xudong963 requested a review from leiysky June 29, 2022 02:01
@xudong963
Member Author

xudong963 commented Jun 29, 2022

TPC-H Q19 now runs without OOM on 1 GB of data @BohuTANG

mysql> source tpch-q19.sql
+--------------------+
| revenue            |
+--------------------+
| 3083843.0577999996 |
+--------------------+
1 row in set (4.34 sec)
Read 6201215 rows, 950.74 MiB in 4.266 sec., 1.45 million rows/sec., 222.84 MiB/sec.

@leiysky
Collaborator

leiysky commented Jun 29, 2022

I think you can add this rule as a RuleNormalizeDisjunctiveFilter.

@xudong963
Member Author

> I think you can add this rule as a RuleNormalizeDisjunctiveFilter.

Done!

@xudong963 xudong963 merged commit b826d14 into databendlabs:main Jun 29, 2022
@xudong963 xudong963 deleted the predicate branch June 29, 2022 13:06
Labels: need-review, pr-feature
Linked issue: Optimize disjunctive predicates