Support implicit repartitioning of table sources during joins #4666

fish-face · 2020-02-28T11:43:34Z

We have database changelog messages coming through Kafka via Connect, and we create KSQL tables over them. Since the source is (somewhat) normalised, we need joins to do basically anything.

The only sensible way to set up message keys in this scenario is to use the primary key from the table. Hence, when joining, one side of the join generally has to be re-keyed. This is quite annoying at the moment; see #2356. It is even worse if there is a type mismatch (Connect is creating string keys but the primary keys are still ints, and there is no longer any implicit conversion to string when re-keying in KSQL), because you need an entire extra stream to CAST the column, as this can no longer be done in the same step as a PARTITION BY.
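To illustrate why both the re-keying and the key type matter, here is a minimal Python sketch of hash partitioning. It is not KSQL and not Kafka's actual partitioner: the CRC32 hash, the partition count, the two serializers and the `customers`/`orders` rows are all assumptions made up for the example. It only shows that rows meet in the same partition when both sides are keyed by the join column with the same serialization, and that the same logical key written as a string or as an int is a different byte string.

```python
import zlib

NUM_PARTITIONS = 4  # assumption: both topics have the same partition count


def partition_for(key: bytes) -> int:
    # Stand-in hash; Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key) % NUM_PARTITIONS


def string_key(value) -> bytes:
    # Key serialized as text, e.g. "7".
    return str(value).encode("utf-8")


def int_key(value: int) -> bytes:
    # Key serialized as a 4-byte big-endian integer.
    return value.to_bytes(4, "big", signed=True)


def rekey(rows, column, serialize):
    # Re-key rows by `column` and report the partition each row would land in.
    return [(partition_for(serialize(row[column])), row) for row in rows]


# Hypothetical tables: customers keyed by their primary key "id",
# orders keyed by their own "id" but joined on "customer_id".
customers = [{"id": 7, "name": "a"}, {"id": 9, "name": "b"}]
orders = [{"id": 1, "customer_id": 7}, {"id": 2, "customer_id": 9}]

customers_by_id = rekey(customers, "id", string_key)
orders_by_customer = rekey(orders, "customer_id", string_key)

# After re-keying with the same serializer, each order shares a partition
# with the customer it references.
for order_partition, order in orders_by_customer:
    match = next(p for p, c in customers_by_id if c["id"] == order["customer_id"])
    assert order_partition == match

# The same logical key serialized two ways is a different byte string, so
# nothing guarantees the two sides meet in one partition.
assert string_key(7) != int_key(7)
```

In KSQL terms, `rekey` plays the role of the PARTITION BY step, and switching from `int_key` to `string_key` plays the role of the extra CAST step.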

Since #4278 it is now possible to perform some implicit repartitioning of joins on the stream side, which makes a lot of sense and greatly reduces the abstraction leak in this scenario. However, if you try to make use of this in a table-table join, you get:

Cannot repartition a TABLE source. If this is a join, make sure that the criteria uses the TABLE key ROWKEY instead of <column/function>

Supporting an implicit repartition here would help immensely.

The main alternative is just to make repartitioning of tables less annoying to begin with, i.e. fixing #2365. Note, however, that if a CAST is also required, it cannot currently be done in the same step as a PARTITION BY, so there would still be an additional annoyance compared with the stream situation.


fish-face added the enhancement label Feb 28, 2020
