chore: enforce WITH KEY column type matches ROWKEY type #4147

Merged

big-andy-coates
Contributor

Description

Fixes: #4097
Fixes: #4146

Note: stacked on top of #4132, review that first!

BREAKING CHANGE: Any KEY column identified in the WITH clause must be of the same SQL type as ROWKEY.

Users can provide the name of a value column that matches the key column, e.g.

CREATE STREAM S (ID INT, NAME STRING) WITH (KEY='ID', ...);

Before primitive keys were introduced, all keys were treated as STRING. With primitive keys, ROWKEY can be a type other than STRING, e.g. BIGINT.
It therefore follows that any KEY column identified in the WITH clause must have the same SQL type as the actual key, i.e. ROWKEY.

With this change the above example statement will fail with the error:

The KEY field (ID) identified in the WITH clause is of a different type to the actual key column.
Either change the type of the KEY field to match ROWKEY, or explicitly set ROWKEY to the type of the KEY field by adding 'ROWKEY INTEGER KEY' in the schema.
KEY field type: INTEGER
ROWKEY type: STRING

As the error message says, the error can be resolved by changing the statement to:

CREATE STREAM S (ROWKEY INT KEY, ID INT, NAME STRING) WITH (KEY='ID', ...);
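
The rule can be sketched in a few lines of Python. This is only an illustration of the check as described above, not the ksqlDB implementation: `check_key_field`, `DEFAULT_ROWKEY_TYPE` and `TYPE_ALIASES` are hypothetical names, and the sketch assumes that `ROWKEY` defaults to `STRING` when the schema does not declare it and that `INT` is reported as `INTEGER`, as the error output suggests.

```python
from typing import Dict, Optional

# Assumption: ROWKEY is STRING unless the schema declares it explicitly.
DEFAULT_ROWKEY_TYPE = "STRING"
# Assumption: INT is reported as INTEGER, as in the error output above.
TYPE_ALIASES = {"INT": "INTEGER"}


def normalize(sql_type: str) -> str:
    """Upper-case a type name and resolve known aliases."""
    name = sql_type.strip().upper()
    return TYPE_ALIASES.get(name, name)


def check_key_field(
    columns: Dict[str, str],
    key_field: str,
    rowkey_type: Optional[str] = None,
) -> Optional[str]:
    """Return an error message if the WITH(KEY=...) column type differs
    from the ROWKEY type, or None if the two types match."""
    actual = normalize(rowkey_type or DEFAULT_ROWKEY_TYPE)
    declared = normalize(columns[key_field])
    if declared == actual:
        return None
    return (
        f"The KEY field ({key_field}) identified in the WITH clause is of a "
        "different type to the actual key column.\n"
        f"KEY field type: {declared}\n"
        f"ROWKEY type: {actual}"
    )


# Original statement: ID is INT, ROWKEY is implicitly STRING, so it is rejected.
print(check_key_field({"ID": "INT", "NAME": "STRING"}, "ID"))
# Fixed statement: ROWKEY declared as INT, so the check passes.
print(check_key_field({"ID": "INT", "NAME": "STRING"}, "ID", rowkey_type="INT"))
```

The other resolution named in the error message, changing the KEY field's type to match ROWKEY, corresponds to calling the sketch with `{"ID": "STRING", ...}` and no explicit `rowkey_type`.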

Testing done

Tests added and updated.

Reviewer checklist

  • Ensure docs are updated if necessary (e.g. if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

Adds support for using primitive types in joins.

BREAKING CHANGE: Some existing joins may now fail and the type of `ROWKEY` in the result schema of joins may have changed.

When `ROWKEY` was always a `STRING`, it was possible to join an `INTEGER` column with a `BIGINT` column. This is no longer the case: a `JOIN` requires the join columns to be of the same type. (See confluentinc#4130, which tracks adding support for `CAST` in join criteria.)

Where joining on two `INT` columns would previously have resulted in a schema containing `ROWKEY STRING KEY`, it will now result in `ROWKEY INT KEY`.
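
A rough Python sketch of the join rule follows. It is not ksqlDB code: `join_rowkey_type` is a hypothetical helper, and it assumes, as the description appears to intend, that the result `ROWKEY` takes the type shared by the two join columns.

```python
# Assumption: INT and INTEGER name the same SQL type.
TYPE_ALIASES = {"INT": "INTEGER"}


def normalize(sql_type: str) -> str:
    """Upper-case a type name and resolve known aliases."""
    name = sql_type.strip().upper()
    return TYPE_ALIASES.get(name, name)


def join_rowkey_type(left_type: str, right_type: str) -> str:
    """Return the ROWKEY type of a join result, or raise ValueError when
    the two join columns do not share the same SQL type."""
    left, right = normalize(left_type), normalize(right_type)
    if left != right:
        raise ValueError(
            f"Join columns must be of the same type: {left} vs {right}"
        )
    return left


# Two INT join columns: the result key is INTEGER rather than STRING.
print(join_rowkey_type("INT", "INT"))

# INTEGER joined with BIGINT is rejected under the new rule.
try:
    join_rowkey_type("INTEGER", "BIGINT")
except ValueError as exc:
    print(exc)
```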
@big-andy-coates big-andy-coates requested a review from a team as a code owner December 16, 2019 20:15
Contributor

@agavra agavra left a comment

This LGTM! Thanks @big-andy-coates

@agavra agavra requested a review from a team December 16, 2019 23:35
@agavra agavra self-assigned this Dec 16, 2019
@big-andy-coates big-andy-coates merged commit 6c6695c into confluentinc:master Dec 19, 2019
@big-andy-coates big-andy-coates deleted the prim_keys_with_schema branch December 19, 2019 12:59
2 participants