chore: support complex key pull queries #6628

agavra · 2020-11-17T01:28:26Z

Description

Supports key lookups for keys that are not just literals. This is necessary for the generic key work in progress, because we will expose keys that are not just primitives but can also be arrays and structs.

Note that maps are not tested because of #6621; we may consider prohibiting maps from being keys altogether. I will port this over after #6375 is merged.

Testing done

new unit tests
new RQTT tests

Reviewer checklist

Ensure docs are updated if necessary (e.g., if a user-visible feature is being added or changed).
Ensure relevant issues are linked (description should include text like "Fixes #")

vcrfxia · 2020-11-17T05:03:16Z

...test/resources/rest-query-validation-tests/pull-queries-against-materialized-aggregates.json

@@ -1927,17 +1914,90 @@
      }
    },
    {
-      "name": "IN: fail on non-literal key",
+      "name": "non-windowed - function in where clause",


Do we want to support generic expressions in pull query WHERE clauses at this time? Last I spoke to @AlanConfluent it sounded like we wanted to limit support to literal expressions (i.e., those that create arrays and structs with literal values) for now.

It would be more work to not support this than to support it. Is there any reason why we don't want to?

vpapavas · 2020-11-17T22:05:48Z

...test/resources/rest-query-validation-tests/pull-queries-against-materialized-aggregates.json

      ],
      "expectedError": {
        "type": "io.confluent.ksql.rest.entity.KsqlStatementErrorMessage",
-        "message": "Only comparison to literals is currently supported: (ID IN (CAST(1 AS INTEGER)))",
+        "message": "Unsupported column reference in pull query: (COUNT + 1)",


So what should be supported is functions on literals, plus maps and arrays of literals, but not functions on columns?

Yes, essentially anything static (column references can't be used anywhere).

Discussed offline. In this PR we only want to support non-literal keys (maps and structs) and expressions that use only literals.
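The rule agreed on above (anything static is allowed, column references are not) can be illustrated with a minimal sketch. The expression classes below are hypothetical stand-ins written for this example, not the ksqlDB AST, and the error wording only loosely mirrors the message in the test diff. The check simply walks the tree and rejects it as soon as a column reference appears anywhere.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified expression tree; these are NOT the ksqlDB classes.
final class StaticExpressionCheck {

  interface Expr {}

  static final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
  }

  static final class ColumnRef implements Expr {
    final String name;
    ColumnRef(String name) { this.name = name; }
  }

  // A function call or operator applied to zero or more arguments.
  static final class Call implements Expr {
    final String function;
    final List<Expr> args;
    Call(String function, Expr... args) {
      this.function = function;
      this.args = Arrays.asList(args);
    }
  }

  /** Returns true only if no column reference appears anywhere in the tree. */
  static boolean isStatic(Expr e) {
    if (e instanceof ColumnRef) {
      return false;
    }
    if (e instanceof Call) {
      for (Expr arg : ((Call) e).args) {
        if (!isStatic(arg)) {
          return false;
        }
      }
    }
    return true;
  }

  /** Throws if the expression depends on a column; the message is illustrative. */
  static void ensureNoColumnReferences(Expr e) {
    if (!isStatic(e)) {
      throw new IllegalArgumentException("Unsupported column reference in pull query");
    }
  }

  public static void main(String[] args) {
    Expr castOfLiteral = new Call("CAST", new Literal(10));
    Expr columnPlusOne = new Call("+", new ColumnRef("COUNT"), new Literal(1));
    System.out.println(isStatic(castOfLiteral));
    System.out.println(isStatic(columnPlusOne));
  }
}
```

In this sketch a function over literals such as a cast passes the check, while an expression like COUNT + 1 is rejected because one of its operands is a column.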

vpapavas · 2020-11-17T22:14:12Z

...test/resources/rest-query-validation-tests/pull-queries-against-materialized-aggregates.json

+      "statements": [
+        "CREATE STREAM INPUT (ID INT KEY, IGNORED INT) WITH (kafka_topic='test_topic', value_format='JSON');",
+        "CREATE TABLE AGGREGATE AS SELECT ID, COUNT(1) AS COUNT FROM INPUT GROUP BY ID;",
+        "SELECT * FROM AGGREGATE WHERE ID IN (CAST(10 AS INTEGER));"


Could we maybe make the IN take more than one key since that is its primary use case? Something like:

SELECT * FROM AGGREGATE WHERE ID IN (CAST(10 AS INTEGER), CAST('11' AS INTEGER));

vpapavas

LGTM with minor comment about a test case

AlanConfluent · 2020-11-17T22:44:25Z

ksqldb-rest-app/src/main/java/io/confluent/ksql/rest/server/execution/PullQueryExecutor.java

+    if (exp instanceof NullLiteral) {
+      obj = null;
+    } else if (exp instanceof Literal) {
+      // skip the GenericExpressionResolver because this is


This is done only because GenericExpressionResolver has higher overhead, right?

yup, that was exactly the idea
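As a rough illustration of the fast path discussed here, the sketch below returns literal values directly and only falls back to a slower, general-purpose resolver for everything else. All types and the `slowResolver` parameter are hypothetical placeholders for this example; the actual PullQueryExecutor and GenericExpressionResolver code is only partially visible in the diff above.

```java
import java.util.function.Function;

// Hypothetical sketch of a literal fast path; not the actual ksqlDB code.
final class KeyResolutionSketch {

  interface Expr {}

  static final class NullLiteral implements Expr {}

  static final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
  }

  /** Stands in for any non-literal expression, such as a cast over literals. */
  static final class Computed implements Expr {
    final Object result;
    Computed(Object result) { this.result = result; }
  }

  /**
   * Resolves a key expression to a value. Literals are returned directly;
   * anything else is delegated to the (assumed more expensive) slowResolver.
   */
  static Object resolveKey(Expr exp, Function<Expr, Object> slowResolver) {
    if (exp instanceof NullLiteral) {
      return null;
    } else if (exp instanceof Literal) {
      // Fast path: no need to invoke the general resolver for a plain literal.
      return ((Literal) exp).value;
    }
    return slowResolver.apply(exp);
  }

  public static void main(String[] args) {
    Function<Expr, Object> slow = e -> ((Computed) e).result;
    System.out.println(resolveKey(new Literal(10), slow));
    System.out.println(resolveKey(new Computed(11), slow));
  }
}
```

The point of the design is that the common case, a lookup by a plain literal key, never pays for the general resolver.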

AlanConfluent · 2020-11-17T22:52:08Z

ksqldb-engine/src/main/java/io/confluent/ksql/engine/generic/GenericExpressionResolver.java

@@ -76,11 +83,12 @@ public Object resolve(final Expression expression) {

    @Override
    protected Object visitExpression(final Expression expression, final Void context) {
+      new EnsureNoColReferences(expression).process(expression, context);
      final ExpressionMetadata metadata =
          CodeGenRunner.compileExpression(


Just curious: how much overhead is there in compiling the Java into bytecode, loading the bytecode, and then running it? That approach makes a lot of sense when you compile once and run many times, but in this case the relationship is 1:1. In general, it seems like interpreting the expression would be lower overhead and possibly faster. Do we do it this way because this is the main method that currently exists for "evaluating a SQL expression"?

I'd be curious to run a benchmark doing just expression lookups (rather than literals).

Talked offline about this. We'll look into running specialized benchmarks to see how much overhead there is. Given that this is "new" functionality, we won't block the PR on it; we'll run those benchmarks going forward.
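For context on the interpretation alternative raised in this thread, here is a minimal sketch of a tree-walking evaluator for static expressions. It is purely illustrative: the node types are hypothetical, it covers only literals, integer addition, and a cast to integer, and it makes no claim about how it would compare with the code-generation path, which is what the proposed benchmarks would measure.

```java
// Hypothetical tree-walking interpreter for static expressions; illustration only.
final class ExpressionInterpreterSketch {

  interface Expr {
    Object eval();
  }

  static final class Literal implements Expr {
    private final Object value;
    Literal(Object value) { this.value = value; }
    @Override public Object eval() { return value; }
  }

  /** Integer addition of two sub-expressions that evaluate to numbers. */
  static final class Add implements Expr {
    private final Expr left;
    private final Expr right;
    Add(Expr left, Expr right) {
      this.left = left;
      this.right = right;
    }
    @Override public Object eval() {
      return ((Number) left.eval()).intValue() + ((Number) right.eval()).intValue();
    }
  }

  /** Casts a number or a numeric string to an int. */
  static final class CastToInt implements Expr {
    private final Expr inner;
    CastToInt(Expr inner) { this.inner = inner; }
    @Override public Object eval() {
      Object v = inner.eval();
      if (v instanceof Number) {
        return ((Number) v).intValue();
      }
      return Integer.parseInt(String.valueOf(v).trim());
    }
  }

  public static void main(String[] args) {
    // Evaluates 10 + CAST('11' AS INTEGER) by walking the tree directly,
    // with no compile or class-loading step.
    Expr expr = new Add(new Literal(10), new CastToInt(new Literal("11")));
    System.out.println(expr.eval());
  }
}
```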

agavra requested a review from vpapavas November 17, 2020 01:28

agavra requested a review from a team as a code owner November 17, 2020 01:28

vcrfxia reviewed Nov 17, 2020

agavra requested a review from AlanConfluent November 17, 2020 20:33

vpapavas reviewed Nov 17, 2020

vpapavas approved these changes Nov 17, 2020

AlanConfluent approved these changes Nov 17, 2020

agavra added 4 commits November 18, 2020 16:29

chore: support complex key pull queries

5dd7912

chore: update to better error message

a4b4c4c

test: update test case for multiple IN statements

5cb93e5

chore: rebase with master

2756cfb

agavra force-pushed the pull_complex branch from 7d9922d to 2756cfb Compare November 19, 2020 00:58

chore: fix checkstyle

28c43d4

agavra merged commit bec50c3 into confluentinc:master Nov 19, 2020

agavra deleted the pull_complex branch November 19, 2020 03:28
