KSQL should support object names with the - character (hyphen, dash) #1466

Closed
rmoff opened this issue Jun 20, 2018 · 12 comments

Comments

@rmoff
Contributor

rmoff commented Jun 20, 2018

- is rejected but _ is valid. At minimum, we should catch and report this limitation better to the user, as the current message is very unclear:

Doesn't work:

ksql> CREATE STREAM ratings-with-customer-data WITH (PARTITIONS=1) AS \
> SELECT R.RATING_ID, R.CHANNEL, R.STARS, R.MESSAGE, \
>        C.ID, C.CLUB_STATUS, C.EMAIL, \
>        C.FIRST_NAME,  C.LAST_NAME \
> FROM RATINGS R \
>      LEFT JOIN CUSTOMERS C \
>        ON R.USER_ID = C.ID \
> WHERE C.FIRST_NAME IS NOT NULL ;
line 1:22: mismatched input '-' expecting ';'
Caused by: org.antlr.v4.runtime.InputMismatchException

Works:

ksql> CREATE STREAM ratings_with_customer_data WITH (PARTITIONS=1) AS \
> SELECT R.RATING_ID, R.CHANNEL, R.STARS, R.MESSAGE, \
>        C.ID, C.CLUB_STATUS, C.EMAIL, \
>        C.FIRST_NAME, C.LAST_NAME \
> FROM RATINGS R \
>      LEFT JOIN CUSTOMERS C \
>        ON R.USER_ID = C.ID \
> WHERE C.FIRST_NAME IS NOT NULL ;

 Message
----------------------------
 Stream created and running
----------------------------
@rmoff rmoff changed the title KSQL should support topic names with the - character KSQL should support topic names with the - character (hyphen, dash) Jun 21, 2018
@rmoff
Contributor Author

rmoff commented Jun 22, 2018

Similar issue if trying to register a stream over an existing topic:

ksql> print 'test-topic2' from beginning;
Format:JSON
{"ROWTIME":1529657665501,"ROWKEY":"null","foo":"bar"}
^CTopic printing ceased
ksql>
ksql> create stream test-topic2 (foo varchar) with (kafka_topic='test-topic2', value_format='json');
line 1:19: mismatched input '-' expecting ';'
Caused by: org.antlr.v4.runtime.InputMismatchException
ksql> create stream 'test-topic2' (foo varchar) with (kafka_topic='test-topic2', value_format='json');
line 1:15: no viable alternative at input 'create stream 'test-topic2''
Caused by: org.antlr.v4.runtime.NoViableAltException
ksql> create stream "test-topic2" (foo varchar) with (kafka_topic='test-topic2', value_format='json');

 Message
----------------
 Stream created
----------------
ksql> select * from test-topic2;
line 1:19: mismatched input '-' expecting ';'
Caused by: org.antlr.v4.runtime.InputMismatchException
ksql> select * from "test-topic2";
Invalid Expression test-topic2.ROWTIME.
ksql>

In this instance, the workaround is simply to create the stream with a hyphen-free name, e.g.:

ksql> create stream test_no_hyphens_thankyou_topic2 (foo varchar) with (kafka_topic='test-topic2', value_format='json');

 Message
----------------
 Stream created
----------------
ksql> select * from test_no_hyphens_thankyou_topic2;
1529657665501 | null | bar
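
The workaround above can also be scripted when many hyphenated topics need registering. A minimal Python sketch, assuming hypothetical helpers `stream_name_for_topic` and `create_stream_statement` (neither is part of KSQL) that derive a hyphen-free stream name and build the same kind of CREATE STREAM statement as shown above:

```python
import re


def stream_name_for_topic(topic: str) -> str:
    """Hypothetical helper: derive a hyphen-free stream name from a topic name."""
    if not topic:
        raise ValueError("topic name must not be empty")
    # Replace anything that is not a letter, digit, or underscore.
    name = re.sub(r"[^A-Za-z0-9_]", "_", topic)
    # Assumption: SQL-style identifiers usually cannot start with a digit,
    # so prefix an underscore in that case.
    if name[0].isdigit():
        name = "_" + name
    return name


def create_stream_statement(topic: str, columns: str, value_format: str = "json") -> str:
    """Build a CREATE STREAM statement whose stream name differs from the topic name."""
    return (
        f"create stream {stream_name_for_topic(topic)} ({columns}) "
        f"with (kafka_topic='{topic}', value_format='{value_format}');"
    )


print(create_stream_statement("test-topic2", "foo varchar"))
```

The topic name stays untouched inside the `kafka_topic` property; only the stream name is rewritten.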

@rmoff rmoff changed the title KSQL should support topic names with the - character (hyphen, dash) KSQL should support object names with the - character (hyphen, dash) Aug 3, 2018
@rmoff
Contributor Author

rmoff commented Aug 3, 2018

+1 to this again, hit it today. At the very least, we should catch the exception and show a friendlier message that tells the user not to use hyphens in the object name.
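
To illustrate the kind of message meant here, a minimal Python sketch of a client-side pre-check. `check_object_name` is a hypothetical helper, not KSQL's parser or its actual error handling, and the allowed-character rule is an assumption based on the behaviour reported in this issue:

```python
import re

# Assumption: unquoted object names are limited to letters, digits, and underscores.
_UNQUOTED_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_object_name(name: str) -> None:
    """Raise ValueError with a readable message for an unsupported object name."""
    if _UNQUOTED_NAME.fullmatch(name):
        return
    if "-" in name:
        # Point at the actual problem and suggest a name that would be accepted.
        raise ValueError(
            f"Invalid object name '{name}': hyphens are not supported. "
            f"Try '{name.replace('-', '_')}' instead."
        )
    raise ValueError(f"Invalid object name '{name}'.")
```

Calling `check_object_name("ratings-with-customer-data")` raises an error that names the hyphen directly, rather than reporting a mismatched input at a column offset.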

@joefromct

It's not only the stream/table object, but also the column (in the JSON) as well, right?

This is difficult for me because although I have control over my KSQL table/stream names, I have a client controlling the JSON that's actually populating it... and yup, they have - in the name. :/

I'm even getting a more ambiguous error such as "name is null".

@fmjrey

fmjrey commented Oct 1, 2018

It seems the KSQL grammar is hardcoded to follow the Avro spec.
I think it should follow the spec of whatever underlying value format is being used.
JSON is bound to be a successful format, and since in JSON any string can be a key, I advise KSQL to allow any string to be a field when JSON is the value format.
In my case I have JSON derived from datomic/datascript attributes, so supporting - . / would be needed.

@wlisac

wlisac commented Nov 1, 2018

I'm also running into this issue with topic names with - in them.

@apurvam
Contributor

apurvam commented Nov 1, 2018

Thanks for the feedback folks. I think this is an important issue to fix. We have audited the code base for multiple ways in which KSQL restricts the identifiers you can use (column names, topic names, etc), and how casing affects those. There is a long list of things to become consistent on.

In that list, I think the category where input data cannot be used (because of the topic name or because of the columns in the input schema) is at the top. We hope to have some solutions around this by the March release.

@apurvam
Contributor

apurvam commented Nov 6, 2018

related #1888

@Incara

Incara commented Apr 16, 2019

Using ksql server 5.2.1 and running into this. Are there plans to bring this to a 2019 release?

@mahipalrampally

I have the same issue with the use of hyphens. Is there any update on the fix?

@agavra agavra self-assigned this Oct 16, 2019
@agavra
Contributor

agavra commented Oct 28, 2019

Hello everyone who has given a +1 on this (cc @Incara @mahipalrampally @fmjrey). There are two types of issues:

  • hyphens in field names (e.g. CREATE STREAM foo ("my-col" VARCHAR))
  • hyphens in collection names (e.g. CREATE STREAM "my-stream" (...))

The fix for the first is on master and will be available in the next KSQL release.

I am trying to understand if the second is a problem. KSQL does not require that the name of a stream/table match the name of its topic, so is it acceptable to reject statements that create streams/tables with hyphens in them (i.e. not allowing quoted stream/table names)? It will still be possible to do something like CREATE STREAM my_stream WITH (kafka_topic='my-stream'...).

EDIT: the fix for the second is on the way!

@agavra
Contributor

agavra commented Nov 4, 2019

The fix is checked in to master! To use something with a hyphen you will need to use quotes around the object name.
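
For anyone generating statements from topic names, a minimal Python sketch of that quoting rule. `quote_identifier` is a hypothetical helper, not a KSQL API; it assumes names made only of letters, digits, and underscores need no quotes, and it refuses names containing a double quote rather than guessing at an escape syntax:

```python
import re

# Assumption: names matching this pattern are accepted without quotes.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Wrap an object name in double quotes when it is not a plain identifier."""
    if '"' in name:
        # The escape rule for embedded quotes is not covered in this thread.
        raise ValueError(f"cannot quote a name that contains a double quote: {name!r}")
    if _PLAIN_NAME.fullmatch(name):
        return name
    return f'"{name}"'


# Builds: select * from "test-topic2";
print(f"select * from {quote_identifier('test-topic2')};")
```

Note the double quotes: the transcripts earlier in this issue show single-quoted names being rejected by the parser.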

@pkgonan
Contributor

pkgonan commented Jul 22, 2021

Hi. Is there any solution for the DESCRIBE command?

[Request]

DESCRIBE 'abc-data.kk_bb.ee_dd.mm_zz' EXTENDED;

[Response]

ksql> DESCRIBE 'abc-data.kk_bb.ee_dd.mm_zz' EXTENDED;
line 1:10: no viable alternative at input 'DESCRIBE 'abc-data.kk_bb.ee_dd.mm_zz''
Statement: DESCRIBE 'abc-data.kk_bb.ee_dd.mm_zz' EXTENDED;
Caused by: line 1:10: no viable alternative at input 'DESCRIBE
	'abc-data.kk_bb.ee_dd.mm_zz''
Caused by: org.antlr.v4.runtime.NoViableAltException
