-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for using primitive types in joins. #4132
Adds support for using primitive types in joins. #4132
Conversation
Adds support for using primitive types in joins. BREAKING CHANGE: Some existing joins may now fail and the type of `ROWKEY` in the result schema of joins may have changed. When `ROWKEY` was always a `STRING` it was possible to join an `INTEGER` column with a `BIGINT` column. This is no longer the case. A `JOIN` requires the join columns to be of the same type. (See confluentinc#4130 which tracks adding support for being able to `CAST` join criteria). Where joining on two `INT` columns would previously have resulted in a schema containing `ROWKEY STRING KEY`, it would not result in `ROWKEY INT KEY`.
final Optional<Column> keyColumn = schemaKStream | ||
.getKeyField() | ||
.resolve(schemaKStream.getSchema()); | ||
|
||
final ColumnRef rowKey = ColumnRef.of( | ||
tableName, | ||
SchemaUtil.ROWKEY_NAME | ||
); | ||
|
||
final boolean namesMatch = keyColumn | ||
.map(field -> field.ref().equals(joinFieldName)) | ||
.orElse(false); | ||
|
||
if (namesMatch || joinFieldName.equals(rowKey)) { | ||
return (SchemaKTable) schemaKStream; | ||
} | ||
|
||
if (!keyColumn.isPresent()) { | ||
throw new KsqlException( | ||
"Source table (" + tableName.name() + ") has no key column defined. " | ||
+ "Only 'ROWKEY' is supported in the join criteria." | ||
); | ||
} | ||
|
||
throw new KsqlException( | ||
"Source table (" + tableName.toString(FormatOptions.noEscape()) + ") key column (" | ||
+ keyColumn.get().ref().toString(FormatOptions.noEscape()) + ") " | ||
+ "is not the column used in the join criteria (" | ||
+ joinFieldName.toString(FormatOptions.noEscape()) + "). " | ||
+ "Only the table's key column or 'ROWKEY' is supported in the join criteria." | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These checks are now done from the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @big-andy-coates ! I think it LGTM.
A couple of questions:
- Do we want to test with other data types as well besides
BIGINT
? - What is the decision regarding casting? Currently, it fails if the columns are not of the same type.
I will leave it up to you if we need another +1 from someone more experienced.
@@ -421,33 +413,6 @@ public void testSelectWithExpression() { | |||
|
|||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why did you remove these tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there are QTT tests that cover the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
ksql-engine/src/main/java/io/confluent/ksql/planner/plan/JoinNode.java
Outdated
Show resolved
Hide resolved
@vpapavas - I think this is OK for now, I think it makes sense to require explicit casting for non-equivalent types of fields. |
I'll ensure that's done.
There's an issue tracking that work: #4130 Thanks for the review! |
Description
Fixes: #4094
BREAKING CHANGE: Some existing joins may now fail and the type of
ROWKEY
in the result schema of joins may have changed.When
ROWKEY
was always aSTRING
it was possible to join anINTEGER
column with aBIGINT
column. This is no longer the case. AJOIN
requires the join columns to be of the same type. (See #4130 which tracks adding support for being able toCAST
join criteria).Where joining on two
INT
columns would previously have resulted in a schema containingROWKEY STRING KEY
, it would not result inROWKEY INT KEY
.NOTE: a fair few tests needed to be fixed as they were now doing invalid joins or expecting a STRING key. I also removed some tests that were failing as they duplicated existing QTT tests.
Testing done
usual
Reviewer checklist