chore: support complex key pull queries #6628

Merged: 5 commits merged into confluentinc:master on Nov 19, 2020

Conversation

agavra
Contributor

@agavra agavra commented Nov 17, 2020

Description

fixes #6602

Supports key-lookups for keys that are not just literals. This is necessary for the generic key work being done because we will expose keys that are not just primitives, but can also be arrays and structs.

Note that maps are not tested because of #6621; we may consider just prohibiting maps from being keys altogether. I will port this over after #6375 is merged.

Testing done

  • new unit tests
  • new RQTT tests

Reviewer checklist

  • Ensure docs are updated if necessary (e.g., if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested a review from vpapavas November 17, 2020 01:28
@agavra agavra requested a review from a team as a code owner November 17, 2020 01:28
@@ -1927,17 +1914,90 @@
}
},
{
- "name": "IN: fail on non-literal key",
+ "name": "non-windowed - function in where clause",
Contributor

Do we want to support generic expressions in pull query WHERE clauses at this time? Last I spoke to @AlanConfluent it sounded like we wanted to limit support to literal expressions (i.e., those that create arrays and structs with literal values) for now.

Contributor Author

It would be more work to not support this than to support it. Is there any reason we don't want to?

],
"expectedError": {
"type": "io.confluent.ksql.rest.entity.KsqlStatementErrorMessage",
- "message": "Only comparison to literals is currently supported: (ID IN (CAST(1 AS INTEGER)))",
+ "message": "Unsupported column reference in pull query: (COUNT + 1)",
Member

So what should be supported is functions on literals, and maps and arrays of literals, but not functions on columns?

Contributor Author

yes, essentially anything static (column references can't be used anywhere)

Member

Discussed offline. In this PR we only want to support non-literal keys (maps and structs) and expressions that use only literals.
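The "anything static" rule discussed in this thread can be illustrated with a small sketch. This is not the PR's `EnsureNoColReferences` visitor; the mini-AST below (`Expr`, `Lit`, `ColumnRef`, `Call`) and the `isStatic` helper are hypothetical names invented for illustration, assuming only that an expression is acceptable when no column reference appears anywhere in its tree.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical mini-AST for illustration; these are NOT ksqlDB's expression classes.
interface Expr {}

final class Lit implements Expr {
  final Object value;
  Lit(Object value) { this.value = value; }
}

final class ColumnRef implements Expr {
  final String name;
  ColumnRef(String name) { this.name = name; }
}

final class Call implements Expr {
  final String function;
  final List<Expr> args;
  Call(String function, Expr... args) {
    this.function = function;
    this.args = Arrays.asList(args);
  }
}

final class StaticExpressionCheck {
  private StaticExpressionCheck() {}

  /** Returns true if no column reference appears anywhere in the tree. */
  static boolean isStatic(Expr e) {
    if (e instanceof ColumnRef) {
      return false;
    }
    if (e instanceof Call) {
      for (Expr arg : ((Call) e).args) {
        if (!isStatic(arg)) {
          return false;
        }
      }
    }
    // Literals, and calls whose arguments are all static, are accepted.
    return true;
  }

  public static void main(String[] args) {
    Expr castOfLiteral = new Call("CAST", new Lit(10));
    Expr countPlusOne = new Call("ADD", new ColumnRef("COUNT"), new Lit(1));
    System.out.println(isStatic(castOfLiteral)); // true
    System.out.println(isStatic(countPlusOne));  // false
  }
}
```

Under this assumption, `CAST(10 AS INTEGER)` would pass the check, while `COUNT + 1` would be rejected because it touches a column.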

"statements": [
"CREATE STREAM INPUT (ID INT KEY, IGNORED INT) WITH (kafka_topic='test_topic', value_format='JSON');",
"CREATE TABLE AGGREGATE AS SELECT ID, COUNT(1) AS COUNT FROM INPUT GROUP BY ID;",
"SELECT * FROM AGGREGATE WHERE ID IN (CAST(10 AS INTEGER));"
Member

Could we maybe make the IN take more than one key since that is its primary use case? Something like:

SELECT * FROM AGGREGATE WHERE ID IN (CAST(10 AS INTEGER), CAST('11' AS INTEGER));
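As a rough illustration of what a multi-key IN lookup amounts to once each expression has been resolved to a key, here is a minimal sketch. It assumes a plain `Map` standing in for the table's state, and it deduplicates keys; both are assumptions for illustration, and `MultiKeyLookup` and `lookupAll` are hypothetical names rather than ksqlDB APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MultiKeyLookup {
  private MultiKeyLookup() {}

  /**
   * Looks up each distinct key once, in first-seen order.
   * Keys that are absent from the table contribute no row.
   */
  static <K, V> List<V> lookupAll(Map<K, V> table, List<K> keys) {
    Set<K> distinct = new LinkedHashSet<>(keys);
    List<V> rows = new ArrayList<>();
    for (K key : distinct) {
      V row = table.get(key);
      if (row != null) {
        rows.add(row);
      }
    }
    return rows;
  }
}
```

With a table holding keys 10 and 11, a lookup for keys 10, 11, and 12 would return the two rows that exist and silently skip 12.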

Member

@vpapavas vpapavas left a comment

LGTM with minor comment about a test case

if (exp instanceof NullLiteral) {
obj = null;
} else if (exp instanceof Literal) {
// skip the GenericExpressionResolver because this is
Member

This is done just because GenericExpressionResolver isn't as low-overhead, right?

Contributor Author

yup, that was exactly the idea
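The fast path confirmed in this exchange can be sketched as follows: literals are read directly, and only non-literal expressions pay for the general resolver. The types here (`KeyExpr`, `NullKeyLiteral`, `KeyLiteral`, `ComputedKey`, `FastPathResolver`) are hypothetical stand-ins, and the slow path is passed in as a plain function rather than being the real `GenericExpressionResolver`.

```java
import java.util.function.Function;

// Hypothetical stand-ins for the expression types in the snippet above.
interface KeyExpr {}

final class NullKeyLiteral implements KeyExpr {}

final class KeyLiteral implements KeyExpr {
  final Object value;
  KeyLiteral(Object value) { this.value = value; }
}

final class ComputedKey implements KeyExpr {
  final String sql;
  ComputedKey(String sql) { this.sql = sql; }
}

final class FastPathResolver {
  // Assumed to be the expensive general-purpose evaluator.
  private final Function<KeyExpr, Object> slowPath;

  FastPathResolver(Function<KeyExpr, Object> slowPath) {
    this.slowPath = slowPath;
  }

  Object resolve(KeyExpr exp) {
    if (exp instanceof NullKeyLiteral) {
      return null;
    }
    if (exp instanceof KeyLiteral) {
      // Literal values need no evaluation, so the slow path is skipped.
      return ((KeyLiteral) exp).value;
    }
    return slowPath.apply(exp);
  }
}
```

The design choice is simply to keep the common case (a plain literal key) as cheap as it was before the PR, while still accepting richer expressions.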

@@ -76,11 +83,12 @@ public Object resolve(final Expression expression) {

@Override
protected Object visitExpression(final Expression expression, final Void context) {
new EnsureNoColReferences(expression).process(expression, context);
final ExpressionMetadata metadata =
CodeGenRunner.compileExpression(
Member

Just curious: how much overhead is there in compiling the Java into bytecode, loading the bytecode, and then running it? Compiling makes a lot of sense when you compile once and then run many times, but in this case there's a 1:1 relationship. In general, it seems like interpreting the expression would be lower overhead and possibly faster. Do we do it this way because this is the main method that currently exists for "evaluating a SQL expression"?

I'd be curious to run a benchmark doing just expression lookups (rather than literals).

Contributor Author

We talked offline about this; we'll look into running specialized benchmarks to see how much overhead there is. Given that this is just "new" functionality, we won't block the PR on this, and we'll run those benchmarks going forward.
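For context on the alternative raised in this thread, "interpreting the expression" could look roughly like the tree-walking evaluator below. It is a minimal sketch over a hypothetical integer-arithmetic AST (`Node`, `Num`, `BinOp`), not ksqlDB code, and it makes no claim about how its cost compares with `CodeGenRunner.compileExpression`; that comparison is what the proposed benchmarks would measure.

```java
// Hypothetical AST for illustration; not ksqlDB's expression classes.
interface Node {}

final class Num implements Node {
  final long value;
  Num(long value) { this.value = value; }
}

final class BinOp implements Node {
  final char op;
  final Node left;
  final Node right;
  BinOp(char op, Node left, Node right) {
    this.op = op;
    this.left = left;
    this.right = right;
  }
}

final class TreeInterpreter {
  private TreeInterpreter() {}

  /** Evaluates the tree directly by recursion, with no code generation step. */
  static long eval(Node n) {
    if (n instanceof Num) {
      return ((Num) n).value;
    }
    if (n instanceof BinOp) {
      BinOp b = (BinOp) n;
      long l = eval(b.left);
      long r = eval(b.right);
      switch (b.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default: throw new IllegalArgumentException("Unsupported operator: " + b.op);
      }
    }
    throw new IllegalArgumentException("Unsupported node: " + n);
  }
}
```

For example, evaluating the tree for `10 + 2 * 3` walks three nodes and returns 16 without compiling or loading any bytecode.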

@agavra agavra merged commit bec50c3 into confluentinc:master Nov 19, 2020
@agavra agavra deleted the pull_complex branch November 19, 2020 03:28
Development

Successfully merging this pull request may close these issues.

Support "literal expressions" for pull queries on advanced keys
4 participants