Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit SetOperationType #16709

Closed
ajcvickers opened this issue Jul 23, 2019 · 3 comments
Closed

Revisit SetOperationType #16709

ajcvickers opened this issue Jul 23, 2019 · 3 comments
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@ajcvickers
Copy link
Member

As discussed in API review - "reduce to three values + distinct, or @smitpatel to suggest something else which only he understands."

@smitpatel
Copy link
Contributor

@roji - Can you post summary in terms of what are supported SQL but unique to set operations (like when is push down required, when it can work without etc.)

@smitpatel smitpatel self-assigned this Jul 23, 2019
@roji
Copy link
Member

roji commented Jul 24, 2019

I'll try to do a quick rundown for the design meeting.

Set operation types

The "standard" supported set operations seem to be UNION, UNION ALL, INTERSECT and EXCEPT (although MySQL only supports UNION and UNION ALL). In addition to these, PostgreSQL also supports INTERSECT ALL and EXCEPT ALL (#16273 tracks any core support for that).

Our current design is to have a SelectExpression represent set operations, by having a SetOperationType enum on it. The value None represents a regular SelectExpression whereas the other values represent the set operations. Currently this enum only represents the standard four operations.

@divega's nice idea was to instead repurpose SelectExpression's IsDistinct to express the distinct dimension of the set operation. This would reduce the enum to have only 3 values (plus none), assuming we want to keep the current enum-based design. I'm not sure any further extensibility is required beyond these three set operations (times two for distinctness) - I've never heard of a fourth type of set operation.

Clauses inside set operands

Both MySQL and PostgreSQL support OrderBy/Take/Skip inside set operands. Consider the following:

(SELECT 1 ORDER BY 1 LIMIT 1)
UNION
(SELECT 2 ORDER BY 1 LIMIT 1)
ORDER BY 1;

This is what leads me to consider set operations as a full expression over two other expressions, as opposed to a clause on a select expression.

Clauses on set operation

PostgreSQL, MySQL and Sqlite support Order, Take and Skip on the set operation itself:

SELECT 1 UNION SELECT 2 ORDER BY 1 LIMIT 1 OFFSET 1;

No other clauses are supported AFAIK, so anything like a predicate causes pushdown. We don't currently render the above, but push down instead (see #16244). SQL Server is a bit peculiar: Take without Offset is expressed with TOP(), but that's not supported on set operations. So that situation needs to be pushed down (on SqlServer only, see #16238).

Precedence

PostgreSQL, Sqlite and SqlServer all assign higher precedence to INTERSECT than to UNION (MySQL only supports UNION/UNION ALL), so:

SELECT 1 UNION SELECT 2 INTERSECT SELECT 3;

is evaluated as

SELECT 1 UNION (SELECT 2 INTERSECT SELECT 3);

and yields 1 (rather than empty resultset). This makes set operations similar to binary expressions, and why for now each set operation has exactly two children, and not more. The current SQL generation logic is to always produce parentheses around operands which are themselves set operations, and whose set operation is different from the parent's.

Hopefully this summarizes the essential information..

@ajcvickers ajcvickers added this to the 3.0.0 milestone Jul 24, 2019
@roji
Copy link
Member

roji commented Aug 5, 2019

@smitpatel, if you're busy and the main point is to get rid of the SetOperationType enum, I could implement the following simple design:

  • Replace SelectExpression.SetOperationType with a SetOperation property that references a new SetOperationExpression.
  • SetOperationExpression would be abstract, have left and right for the operands, and three expression subclasses for Union, Intersect and Except. Distinctness could still be taken from SelectExpression as @divega suggested before.

If this design is sufficient for you, I can poach this and implement.

@ajcvickers ajcvickers modified the milestones: 3.0.0, Backlog Aug 14, 2019
@smitpatel smitpatel modified the milestones: Backlog, 3.0.0 Aug 21, 2019
smitpatel added a commit that referenced this issue Aug 21, 2019
Remove SetOperationType enum
Add support for Intersect/Except Distinct

Resolves #16709
@smitpatel smitpatel added closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug and removed punted-for-3.0 type-enhancement labels Aug 21, 2019
smitpatel added a commit that referenced this issue Aug 21, 2019
Remove SetOperationType enum
Add support for Intersect/Except Distinct

Resolves #16709
smitpatel added a commit that referenced this issue Aug 21, 2019
Remove SetOperationType enum
Add support for Intersect/Except Distinct

Resolves #16709
smitpatel added a commit that referenced this issue Aug 21, 2019
Remove SetOperationType enum
Add support for Intersect/Except Distinct

Resolves #16709
@ajcvickers ajcvickers modified the milestones: 3.0.0, 3.0.0-preview9 Aug 21, 2019
@ajcvickers ajcvickers modified the milestones: 3.0.0-preview9, 3.0.0 Nov 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

3 participants