[v1] Refactor representation of set ops #1538

alancai98 · 2024-08-05T23:18:30Z

Relevant Issues

Resolves [main/v1] partiql-ast: modeling of s_f_w.set_op does not allow for left associative set operations #1507

Description

Refactors modeling of SQL set ops in parser, ast, plan, and evaluator
In the AST, new modeling supports
- ORDER BY, LIMIT, OFFSET on the SQL set op and the SFW arguments
- left-associative SQL set ops
In the parser,
- fixes parsing when ORDER BY, LIMIT, OFFSET specified on SQL set op
- updates parse tests outdated tests (updates taken from Add set op typing; fixes various behaviors in set op parsing and modeling #1506)
In the plan and planner, uses refactored plan modeling from Add set op typing; fixes various behaviors in set op parsing and modeling #1506
In the evaluator, includes the implementations of PartiQL bag operators (expressions)

Other Information

Updated Unreleased Section in CHANGELOG: [NO]
Any backward-incompatible changes? [YES]
- Changes to AST and plan but on v1 branch
Any new external dependencies? [NO]
Do your changes comply with the Contributing Guidelines
and Code Style Guidelines? [YES]

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions · 2024-08-05T23:27:56Z

CROSS-ENGINE Conformance Report ❌

	BASE (LEGACY-FAB43BF)	TARGET (EVAL-FAB43BF)	+/-
% Passing	90.45%	96.62%	6.17% ✅
Passing	5333	5697	364 ✅
Failing	563	199	-364 ✅
Ignored	0	0	0 ✅
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: fab43bf
Base Engine: LEGACY
Target Commit: fab43bf
Target Engine: EVAL

Result Details

❌ REGRESSION DETECTED. See Now Failing Tests. ❌
Passing in both: 5205
Failing in both: 71
PASSING in BASE but now FAILING in TARGET: 128
FAILING in BASE but now PASSING in TARGET: 492

Now Failing Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

492 test(s) were previously failing in BASE (LEGACY-FAB43BF) but now pass in TARGET (EVAL-FAB43BF). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-LEGACY Conformance Report ❌

	BASE (LEGACY-7AEB1BE)	TARGET (LEGACY-FAB43BF)	+/-
% Passing	90.72%	90.45%	-0.27% ⭕
Passing	5329	5333	4 ✅
Failing	545	563	18 ⭕
Ignored	0	0	0 ✅
Total Tests	5874	5896	22 ✅

Testing Details

Base Commit: 7aeb1be
Base Engine: LEGACY
Target Commit: fab43bf
Target Engine: LEGACY

Result Details

❌ REGRESSION DETECTED. See Now Failing Tests. ❌
Passing in both: 5324
Failing in both: 539
PASSING in BASE but now FAILING in TARGET: 5
FAILING in BASE but now PASSING in TARGET: 0

Now Failing Tests ❌

The following 5 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see

outerUnionCoerceScalar, compileOption: LEGACY
outerUnionCoerceStruct, compileOption: LEGACY
outerUnionCoerceNullMissing, compileOption: PERMISSIVE
outerUnionCoerceNullMissing, compileOption: LEGACY
Example 6 — Value Coercion; Coercion of single value, compileOption: LEGACY

CROSS-COMMIT-EVAL Conformance Report ✅

	BASE (EVAL-7AEB1BE)	TARGET (EVAL-FAB43BF)	+/-
% Passing	96.41%	96.62%	0.22% ✅
Passing	5664	5697	33 ✅
Failing	211	199	-12 ✅
Ignored	0	0	0 ✅
Total Tests	5875	5896	21 ✅

Testing Details

Base Commit: 7aeb1be
Base Engine: EVAL
Target Commit: fab43bf
Target Engine: EVAL

Result Details

Passing in both: 5664
Failing in both: 199
PASSING in BASE but now FAILING in TARGET: 0
FAILING in BASE but now PASSING in TARGET: 6

Now Passing Tests

The following 6 test(s) were previously FAILING in BASE but are now PASSING in TARGET. Before merging, confirm they are intended to pass:

Click here to see

Example 6 — Value Coercion, compileOption: LEGACY
outerUnionCoerceScalar, compileOption: LEGACY
outerUnionCoerceStruct, compileOption: LEGACY
outerUnionCoerceNullMissing, compileOption: PERMISSIVE
outerUnionCoerceNullMissing, compileOption: LEGACY
Example 6 — Value Coercion; Coercion of single value, compileOption: LEGACY

partiql-eval/src/main/kotlin/org/partiql/eval/internal/helpers/ToBag.kt

alancai98 · 2024-08-09T22:19:54Z

partiql-ast/src/main/resources/partiql_ast.ion

-      type:    '.set_op',
-      operand: '.expr.s_f_w',
-    },
+  query_set::{


(self-review) primary addition of this PR is adding query_set and query_expr to the AST. This modeling allows us to define ORDER BY, LIMIT, and OFFSET on SFW queries as well as SQL set ops.

Modeling is partially based off

partiql-lang-rust -- https://github.com/partiql/partiql-lang-rust/blob/e5b8ce7497f515bc998a31cd8b45fb2f3e9287a2/partiql-ast/src/ast.rs#L238-L244

similar modeling in the sqlparser Rust crate -- https://docs.rs/sqlparser/0.48.0/sqlparser/ast/struct.Query.html

Nice work. Would we want to add the naming followed by PLR and SQLParser? AKA Query is the top-level product type holding the expression, order-by, etc. And the union types QuerySet (PLR) or SetExpr (SQLParser) holding the different variants (expr, select, wrapped Query, values, etc).

Especially since the Java name will now be: Expr.QuerySet and QueryExpr. I found it a bit confusing in the rest of the code. Could just be: Expr.Query and Expr.Query.Body.

I added some more context on PLR's AST definition in the partiql-tests PR partiql/partiql-tests#121 (comment). The AST definition I used in this PR allows for us to be more strict in what expressions we want to support with LIMIT, OFFSET, and ORDER BY. For now, I'm thinking we only permit those clauses on

SFW

SQL set ops

PartiQL bag ops

PLR also allows those clauses on general expressions. Since it's not in the PartiQL spec, I haven't chosen to allow those clauses on general expressions.

I agree that the previous naming was confusing, so updated to use query_body rather than query_expr.

alancai98 · 2024-08-09T22:22:42Z

partiql-lang/src/main/kotlin/org/partiql/lang/prettyprint/QueryPrettyPrinter.kt

@@ -320,7 +320,14 @@ class QueryPrettyPrinter {
            is PartiqlAst.Expr.Or -> writeNAryOperator("OR", node.operands, sb, level)
            is PartiqlAst.Expr.InCollection -> writeNAryOperator("IN", node.operands, sb, level)
            is PartiqlAst.Expr.BagOp -> {
-                var name = node.op.javaClass.simpleName.toUpperCase().replace("_", " ")
+                var name = when (node.op) {


(self-review) changes to partiql-lang tests follow what I did for a similar PR targeting v0.14.6 to fix some legacy tests -- #1506.

alancai98 · 2024-08-09T22:24:54Z

partiql-plan/src/main/resources/partiql_plan.ion

@@ -322,7 +342,7 @@ rel::{
      _: [
        call::{
          agg:  ref,
-          set_quantifier: [ ALL, DISTINCT ],
+          setq: set_quantifier,


(self-review) moved set_quantifier out of this nested aggregation plan node. Also renamed to setq for consistency w/ the set ops.

alancai98 · 2024-08-09T22:27:18Z

test/partiql-tests

(self-review) currently points to the source branch of partiql/partiql-tests#121. Once that partiql-tests PR is merged in, I will update the submodule commit.

alancai98 · 2024-08-09T22:28:45Z

partiql-eval/src/main/kotlin/org/partiql/eval/internal/helpers/ToBag.kt

-@OptIn(PartiQLValueExperimental::class)
-internal fun PartiQLValue.toBag(): BagValue<*> {
-    TODO("For OUTER set operators")
+internal fun Datum.asIterator(coerce: Boolean): Iterator<Datum> {


(self-review) coercion behavior is slightly different than other type check exceptions in permissive mode. For the bag operators, we coerce to a bag rather than return missing.

johnedquinn

Still need to go through the evaluation implementations for the bag ops -- but overall looks good. I've left some comments on the modelling.

johnedquinn · 2024-08-13T19:08:53Z

partiql-parser/src/main/antlr/PartiQLParser.g4

+        order=orderByClause??
+        limit=limitClause??
+        offset=offsetByClause?? # SfwQuery


This makes me think. Can we remove these completely? Since the AST nodes don't allow for this, this may result in ambiguous results. I believe SQL would only allow this by using parenthesis -- which I believe would be handled even if removing this.

Since the AST nodes don't allow for this, this may result in ambiguous results.

Could you clarify on this? Are you referring to if a set op query is defined without parens and it includes a LIMIT, OFFSET, or ORDER BY? For example,

SELECT * FROM a UNION SELECT * FROM b LIMIT 5

In this case, I believe the parsing behavior we want is to apply the LIMIT to the UNION set operator. This behavior follows what PostgreSQL's parsing does - https://www.postgresql.org/docs/current/queries-union.html.

johnedquinn · 2024-08-13T20:17:36Z

partiql-ast/src/main/resources/partiql_ast.ion

-      type:    '.set_op',
-      operand: '.expr.s_f_w',
-    },
+  query_set::{


Nice work. Would we want to add the naming followed by PLR and SQLParser? AKA Query is the top-level product type holding the expression, order-by, etc. And the union types QuerySet (PLR) or SetExpr (SQLParser) holding the different variants (expr, select, wrapped Query, values, etc).

Especially since the Java name will now be: Expr.QuerySet and QueryExpr. I found it a bit confusing in the rest of the code. Could just be: Expr.Query and Expr.Query.Body.

partiql-eval/src/main/kotlin/org/partiql/eval/internal/operator/rex/ExprUnionAll.kt

partiql-eval/src/main/kotlin/org/partiql/eval/internal/operator/rex/ExprUnionDistinct.kt

alancai98 · 2024-08-14T00:55:14Z

Thanks for the review @johnedquinn. Based on the offline discussion, we decided on the following changes

allow LIMIT, OFFSET, and ORDER BY on the outer bag ops
coercion function for NULL and MISSING will the same as the spec's FROM source coercion (basically wrap the absent type in a bag rather than empty bag).

Will mark this PR as a draft till I address those concerns and the other PR comments.

alancai98 · 2024-08-16T21:47:09Z

partiql-planner/src/main/kotlin/org/partiql/planner/internal/transforms/RelConverter.kt

-            val quantifier = when (setOp.type.setq) {
-                SetQuantifier.ALL -> Rel.Op.Set.Quantifier.ALL
-                null, SetQuantifier.DISTINCT -> Rel.Op.Set.Quantifier.DISTINCT
+        private fun convertSetOp(setExpr: QueryBody.SetOp): Rel {


(self-review) one benefit of reusing the SQL set op plan nodes for the PartiQL bag ops is that we can reuse the Rel scan behavior for non-collections (i.e. coercing to a bag in permissive and erroring in strict mode).

codecov-commenter · 2024-08-16T21:50:46Z

Codecov Report

Attention: Patch coverage is 76.00000% with 6 lines in your changes missing coverage. Please review.

Please upload report for BASE (v1@db9b6bd). Learn more about missing BASE report.

Files	Patch %	Lines
...org/partiql/lang/prettyprint/QueryPrettyPrinter.kt	42.85%	3 Missing and 1 partial ⚠️
.../org/partiql/lang/syntax/impl/PartiQLPigVisitor.kt	88.88%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@          Coverage Diff          @@
##             v1    #1538   +/-   ##
=====================================
  Coverage      ?   77.15%           
  Complexity    ?     2514           
=====================================
  Files         ?      254           
  Lines         ?    18583           
  Branches      ?     3523           
=====================================
  Hits          ?    14337           
  Misses        ?     3212           
  Partials      ?     1034

Flag	Coverage Δ
EXAMPLES	`80.07% <ø> (?)`
LANG	`77.06% <76.00%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

johnedquinn

Looks good to me! Once the tests PR is merged and the submodule commit is updated, I can approve.

alancai98 · 2024-08-21T19:30:19Z

Since the last review,

rebased the PR off of v1 to include the latest datum comparator commit
53e96c2 (#1538) uses the latest tests from Fix some more SQL set op + PartiQL bag op tests partiql-tests#121 and updates the missing to null coercion for the set op evaluation

…lazy

johnedquinn · 2024-08-21T20:56:43Z

partiql-eval/src/main/kotlin/org/partiql/eval/internal/helpers/RecordUtility.kt

+    fun Record.toDatumArrayCoerceMissing(): Array<Datum> = Array(this.values.size) {
+        val d = [email protected][it]
+        when (d.isMissing) {
+            true -> Datum.nullValue()
+            else -> d
+        }
    }


Just clarifying why you're doing this: even though the comparator treats null/missing as equal, it isn't deterministic which value will be returned in the TreeSet -- so you need to coerce them ahead of time.

Also, I'm leaving a suggestion here because I don't think you need to create a new Array and move all previous pointers to the new one.

Suggested change

fun Record.toDatumArrayCoerceMissing(): Array<Datum> = Array(this.values.size) {

val d = this@toDatumArrayCoerceMissing.values[it]

when (d.isMissing) {

true -> Datum.nullValue()

else -> d

}

}

fun Array<Datum>.coerceMissing() {

for (i in 0 until this.size) {

if (this[i].isMissing) {

this[i] = Datum.nullValue()

}

}

}

Just clarifying why you're doing this: even though the comparator treats null/missing as equal, it isn't deterministic which value will be returned in the TreeSet -- so you need to coerce them ahead of time.

Yes, that's correct. I added a brief comment in the latest commit to give some more context on the behavior.

Also, I'm leaving a suggestion here because I don't think you need to create a new Array and move all previous pointers to the new one.

Thanks for the suggestion. You're right -- it can be simplified quite a bit. I applied your suggestion in the latest commit.

alancai98 changed the title ~~Refactor representation of set ops~~ [draft] [v1] Refactor representation of set ops Aug 5, 2024

alancai98 commented Aug 6, 2024

View reviewed changes

partiql-eval/src/main/kotlin/org/partiql/eval/internal/helpers/ToBag.kt Outdated Show resolved Hide resolved

alancai98 mentioned this pull request Aug 7, 2024

Fix some more SQL set op + PartiQL bag op tests partiql/partiql-tests#121

Merged

alancai98 force-pushed the v1-fix-set-ops-ast branch from fb73da5 to ca557f6 Compare August 9, 2024 21:21

alancai98 commented Aug 9, 2024

View reviewed changes

alancai98 changed the title ~~[draft] [v1] Refactor representation of set ops~~ [v1] Refactor representation of set ops Aug 9, 2024

alancai98 marked this pull request as ready for review August 9, 2024 22:31

alancai98 self-assigned this Aug 9, 2024

alancai98 requested a review from johnedquinn August 9, 2024 22:32

johnedquinn requested changes Aug 13, 2024

View reviewed changes

alancai98 marked this pull request as draft August 14, 2024 00:55

alancai98 force-pushed the v1-fix-set-ops-ast branch 2 times, most recently from 359f674 to 22ca247 Compare August 16, 2024 21:38

alancai98 commented Aug 16, 2024

View reviewed changes

alancai98 marked this pull request as ready for review August 16, 2024 21:47

alancai98 requested a review from johnedquinn August 16, 2024 21:47

johnedquinn reviewed Aug 20, 2024

View reviewed changes

alancai98 force-pushed the v1-fix-set-ops-ast branch from 22ca247 to 53e96c2 Compare August 21, 2024 19:04

alancai98 added 8 commits August 21, 2024 13:55

Refactor ast and plan rep of set ops

889d0e8

Add impls of Rex bag ops

cd9e7f5

Prohibit ORDER BY, LIMIT, OFFSET on bag ops; change eval impl to use …

af7ce9b

…lazy

Update eval implementations to create iterables

16c4d4e

Rename set_quantifier -> setq

2b4b72f

Change outer bag op behavior following discussions

8cd382e

Fix null/missing coerce behavior

a4717a2

Update conformance test submodule

5d5920b

johnedquinn reviewed Aug 21, 2024

View reviewed changes

Remove redundant array construction calls

75f3f35

alancai98 force-pushed the v1-fix-set-ops-ast branch from dc6e8f1 to 75f3f35 Compare August 21, 2024 21:30

alancai98 requested a review from johnedquinn August 21, 2024 21:31

johnedquinn approved these changes Aug 21, 2024

View reviewed changes

alancai98 merged commit 4f783c7 into v1 Aug 21, 2024
14 checks passed

alancai98 deleted the v1-fix-set-ops-ast branch August 21, 2024 21:51

alancai98 mentioned this pull request Aug 21, 2024

Revise bag op RFC-0007 based on latest discussions partiql/partiql-lang#89

Open

alancai98 mentioned this pull request Nov 5, 2024

[main/v1] partiql-ast: modeling of s_f_w.set_op does not allow for left associative set operations #1507

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v1] Refactor representation of set ops #1538

[v1] Refactor representation of set ops #1538

alancai98 commented Aug 5, 2024 •

edited

Loading

github-actions bot commented Aug 5, 2024 •

edited

Loading

alancai98 Aug 9, 2024

johnedquinn Aug 13, 2024

alancai98 Aug 16, 2024

alancai98 Aug 9, 2024

alancai98 Aug 9, 2024

alancai98 Aug 9, 2024

alancai98 Aug 9, 2024

johnedquinn left a comment

johnedquinn Aug 13, 2024

alancai98 Aug 16, 2024

johnedquinn Aug 13, 2024

alancai98 commented Aug 14, 2024

alancai98 Aug 16, 2024

codecov-commenter commented Aug 16, 2024 •

edited

Loading

johnedquinn left a comment

alancai98 commented Aug 21, 2024

johnedquinn Aug 21, 2024 •

edited

Loading

alancai98 Aug 21, 2024

-    fun Record.toDatumArrayCoerceMissing(): Array<Datum> = Array(this.values.size) {
-        val d = this@toDatumArrayCoerceMissing.values[it]
-        when (d.isMissing) {
-            true -> Datum.nullValue()
-            else -> d
-        }
-    }
+    fun Array<Datum>.coerceMissing() {
+        for (i in 0 until this.size) {
+            if (this[i].isMissing) {
+                this[i] = Datum.nullValue()
+            }
+        }
+    }

[v1] Refactor representation of set ops #1538

[v1] Refactor representation of set ops #1538

Conversation

alancai98 commented Aug 5, 2024 • edited Loading

Relevant Issues

Description

Other Information

License Information

github-actions bot commented Aug 5, 2024 • edited Loading

CROSS-ENGINE Conformance Report ❌

Testing Details

Result Details

Now Failing Tests ❌

Now Passing Tests

CROSS-COMMIT-LEGACY Conformance Report ❌

Testing Details

Result Details

Now Failing Tests ❌

CROSS-COMMIT-EVAL Conformance Report ✅

Testing Details

Result Details

Now Passing Tests

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnedquinn left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alancai98 commented Aug 14, 2024

Choose a reason for hiding this comment

codecov-commenter commented Aug 16, 2024 • edited Loading

Codecov Report

johnedquinn left a comment

Choose a reason for hiding this comment

alancai98 commented Aug 21, 2024

johnedquinn Aug 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alancai98 commented Aug 5, 2024 •

edited

Loading

github-actions bot commented Aug 5, 2024 •

edited

Loading

codecov-commenter commented Aug 16, 2024 •

edited

Loading

johnedquinn Aug 21, 2024 •

edited

Loading