Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSQL 5.4 pull query should be smart enough to determine if <field> is the same as ROWKEY #3527

Closed
rmoff opened this issue Oct 10, 2019 · 7 comments

Comments

@rmoff
Copy link
Contributor

rmoff commented Oct 10, 2019

Table:

ksql> DESCRIBE EXTENDED RATINGS_BY_STATUS;

Name                 : RATINGS_BY_STATUS
Type                 : TABLE
Key field            : CLUB_STATUS
Key format           : STRING
Timestamp field      : Not set - using <ROWTIME>
Value format         : AVRO
Kafka topic          : RATINGS_BY_STATUS (partitions: 1, replication: 1)

❌ Key as shown is CLUB_STATUS. Query using this doesn't work:

ksql> SELECT CLUB_STATUS, NUM_RATINGS FROM RATINGS_BY_STATUS WHERE CLUB_STATUS='platinum';
WHERE clause on unsupported field: CLUB_STATUS
Static queries currently require a WHERE clause that:
 - limits the query to a single ROWKEY, e.g. `SELECT * FROM X WHERE ROWKEY=Y;`.

✔️ Have to literally use ROWKEY.

ksql> SELECT CLUB_STATUS, NUM_RATINGS FROM RATINGS_BY_STATUS WHERE ROWKEY='platinum';

 ROWKEY STRING KEY | CLUB_STATUS STRING | NUM_RATINGS BIGINT
-------------------------------------------------------------
 platinum          | platinum           | 7650
-------------------------------------------------------------
ksql>
  1. KSQL knows what field ROWKEY is so the first query run above should be accepted as valid and CLUB_STATUS substituted for ROWKEY automagically if necessary
  2. The error message should be clearer that you have to literally use ROWKEY, because it's not clear - as a user I think I understand keys, that my key is CLUB_STATUS, so why's it not working.
@derekjn
Copy link
Contributor

derekjn commented Oct 10, 2019

This is a known issue that will get resolved. I don’t think there’s an issue for it yet though so let’s track it with this one.

@big-andy-coates
Copy link
Contributor

Hey @rmoff - I'm aware of this shortcoming. However, I'm looking to deprecate `WITH(KEY='x') syntax soon and so this kind of becomes less important.

#3536 will see users able to name their keys something other than ROWKEY, which totally removes the need for WITH(KEY='x'), (#3537).

However, even with the WITH(KEY=?) syntax deprecated KSQL may still have the concept of 'this field in the value is the same as the key'. If so, then we could potentially still look to support what you're suggesting, but, IMHO, I think ROI is low and complexity is very high.

My vote is to close this issue as a 'will not do'

@big-andy-coates big-andy-coates mentioned this issue Oct 11, 2019
27 tasks
@rmoff
Copy link
Contributor Author

rmoff commented Oct 11, 2019

This was a table created from an aggregate, and the key came from the GROUP BY, so this will continue to be an issue even after WITH (KEY='x') is removed.

We should keep this as a task to do, IMO.

@big-andy-coates
Copy link
Contributor

I don't think it will. Consider:

-- create stream:
CREATE STREAM S (user STRING KEY,  INT ID, NAME STRING) WITH (topic='foo', format='json');

-- create an aggregate table:
CREATE TABLE T AS SELECT COUNT(*) AS COUNT FROM S GROUP BY ID;
-- schema of about will be 'ID INT KEY, COUNT LONG'.

-- issue pull query:
SELECT * FROM T WHERE ID = 10;

As you can see, not a 'key field' in sight... (remember key fields are when there is a copy of the key in the value). This just uses the key column name. Totally different code and much cleaner / clearer functionality!

@big-andy-coates
Copy link
Contributor

See #3537

@rmoff
Copy link
Contributor Author

rmoff commented Dec 2, 2019

@big-andy-coates I follow your login on the STREAM example now. Is this slated to happen 'soon'? I'm planning to blog about it and wondering how to couch an explanation of the current slightly-ugly syntax.

@big-andy-coates
Copy link
Contributor

WITH(KEY) syntax is no more... so closing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants