Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: make some SQL builders pure #1526

Merged
merged 11 commits into from
May 4, 2023
Merged

Fix: make some SQL builders pure #1526

merged 11 commits into from
May 4, 2023

Conversation

georgesittas
Copy link
Collaborator

No description provided.

@georgesittas georgesittas requested a review from tobymao May 3, 2023 10:37
sqlglot/expressions.py Outdated Show resolved Hide resolved
sqlglot/expressions.py Outdated Show resolved Hide resolved
sqlglot/expressions.py Outdated Show resolved Hide resolved
@tobymao
Copy link
Owner

tobymao commented May 3, 2023

can you double check and_, not_, or_ and, and paren?

@georgesittas
Copy link
Collaborator Author

Yeah will take a look, thanks for reviewing

@georgesittas
Copy link
Collaborator Author

georgesittas commented May 3, 2023

can you double check and_, not_, or_ and, and paren?

>>> import sqlglot
>>> c = sqlglot.exp.Condition(this="x")
>>> a = sqlglot.exp.and_(c, "true")
>>> id(a.this)
4368906528
>>> id(c)
4368906528

Should we add a copy=True parameter and pass it everywhere to ensure copying by default?

@tobymao
Copy link
Owner

tobymao commented May 3, 2023

@georgesittas yes, and i'd just double check where it's used and see if we need to put false

@georgesittas
Copy link
Collaborator Author

Got it, will look into this soon

@georgesittas
Copy link
Collaborator Author

georgesittas commented May 4, 2023

@tobymao can you take another look? Fixed a couple of TPC-DS tests that broke due to these changes.

For example, check this part in TPC-DS 53. From what I can understand, the optimizer converts this disjunctive normal form to conjunctive, using the distributive property of the OR / AND operators. The current diff can be seen here.

We previously had 7 clauses combined through conjunction in the optimized query, but I think we need 9 of them (the other 2 shown in the diff). A small proof of this claim:

let x = i_category IN ( 'Books', 'Children', 'Electronics' )
let y = i_class IN ( 'personal', 'portable', 'reference', 'self-help' )
let b = i_brand IN ( 'scholaramalgamalg #14', 'scholaramalgamalg #7', 'exportiunivamalg #9', 'scholaramalgamalg #9' )

let z = i_category IN ( 'Women', 'Music', 'Men' )
let w = i_class IN ( 'accessories', 'classical', 'fragrances', 'pants' )
let d = i_brand IN ( 'amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1', 'importoamalg #1' )

let a = x AND y
let c = z AND w

(a AND b) OR (c AND d) ->
(a OR c) AND (a OR d) AND (b OR c) AND (b OR d) ->
((x AND y) OR (w AND z)) AND ((x AND y) OR d) AND (b OR (w AND z)) AND (b OR d) ->
(x OR w) AND (x OR z) AND (y OR w) AND (y OR z) AND (x OR d) AND (y OR d) AND (b OR w) AND (b OR z) AND (b OR d) 

If we substitute x, y, b, z, w and d here we'll get the current optimized version of TPC-DS 53 (the terms are in different order but they're all there).

The other TPC-DS test had a similar issue.

@tobymao tobymao merged commit 2a6a3e7 into main May 4, 2023
@tobymao tobymao deleted the jo/fix_mutations branch May 4, 2023 17:10
adrianisk pushed a commit to adrianisk/sqlglot that referenced this pull request Jun 21, 2023
* Fix: make some SQL builders pure

* Fixup

* Fixup

* Fixup

* Comment

* PR feedback, replace slice syntax with .between(low, high, ..)

* Add copy arg to some methods, fix tests

* Add copy flag to paren too

* Use convert instead of maybe_parse for between helper

* Fix test

* Change isin so that it receives t.Any for expressions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants