Unexpected table creation issue #936

Closed
vahidhashemian opened this issue Mar 14, 2018 · 6 comments

@vahidhashemian

  1. Create a text file customers.json with the following two lines:
{"id":"1","first_name":"Mitchell","last_name":"Brotheridge","email":"[email protected]","gender":"Male","address":"0358 Lyons Park"}
{"id":"2","first_name":"Rodolphe","last_name":"Lichfield","email":"[email protected]","gender":"Male","address":"06 Barnett Hill"}
  2. Create a Kafka topic customer-raw:
$ bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic customer-raw --partitions 1 --replication-factor 1
  3. Produce the JSON entries above to this topic:
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic customer-raw < customers.json
  4. Create a KSQL table based on the topic customer-raw:
ksql> create table customer_table (id varchar, first_name varchar, last_name varchar, email varchar, gender varchar, address varchar) with (kafka_topic='customer-raw', value_format='json', key='id');
  5. Set the offset reset policy to earliest:
ksql> set 'auto.offset.reset'='earliest';
  6. Query the table:
ksql> select * from customer_table;
The query hangs without printing any rows!

No relevant info was found in the logs to indicate the reason for the unexpected outcome.


Note that

ksql> create stream customer_stream (id varchar, first_name varchar, last_name varchar, email varchar, gender varchar, address varchar) with (kafka_topic='customer-raw', value_format='json');
ksql> select * from customer_stream;

works as expected, and prints out the two entries:

1521004125737 | null | 1 | Mitchell | Brotheridge | [email protected] | Male | 0358 Lyons Park
1521004125758 | null | 2 | Rodolphe | Lichfield | [email protected] | Male | 06 Barnett Hill
dguy self-assigned this Mar 14, 2018
@rmoff
Contributor

rmoff commented Mar 14, 2018

It works fine when the data is produced with kafkacat and the message keys are set (which the reproduction steps in this issue don't do).

Robin@asgard02 ~> kafkacat -b localhost:9092 -t customer-raw -P -K:
1:{"id":"1","first_name":"Mitchell","last_name":"Brotheridge","email":"[email protected]","gender":"Male","address":"0358 Lyons Park"}
2:{"id":"2","first_name":"Rodolphe","last_name":"Lichfield","email":"[email protected]","gender":"Male","address":"06 Barnett Hill"}
Robin@asgard02 ~>
Robin@asgard02 ~> kafkacat -b localhost:9092 -t customer-raw -C -f 'Key: %k\nValue: %s\n'
Key: 1
Value: {"id":"1","first_name":"Mitchell","last_name":"Brotheridge","email":"[email protected]","gender":"Male","address":"0358 Lyons Park"}
Key: 2
Value: {"id":"2","first_name":"Rodolphe","last_name":"Lichfield","email":"[email protected]","gender":"Male","address":"06 Barnett Hill"}
% Reached end of topic customer-raw [0] at offset 2
^C⏎
Robin@asgard02 ~>
CLI v0.5, Server v0.5 located at http://localhost:8080

Having trouble? Type 'help' (case-insensitive) for a rundown of how things work!

ksql> print 'customer-raw' from beginning;
Format:JSON
{"ROWTIME":1521022237049,"ROWKEY":"1","id":"1","first_name":"Mitchell","last_name":"Brotheridge","email":"[email protected]","gender":"Male","address":"0358 Lyons Park"}
{"ROWTIME":1521022245933,"ROWKEY":"2","id":"2","first_name":"Rodolphe","last_name":"Lichfield","email":"[email protected]","gender":"Male","address":"06 Barnett Hill"}

ksql> create table customer_table (id varchar, first_name varchar, last_name varchar, email varchar, gender varchar, address varchar) with (kafka_topic='customer-raw', value_format='json', key='id');

 Message
---------------
 Table created
---------------
ksql> set 'auto.offset.reset'='earliest';
Successfully changed local property 'auto.offset.reset' from 'null' to 'earliest'
ksql> select * from customer_table;
1521022237049 | 1 | 1 | Mitchell | Brotheridge | [email protected] | Male | 0358 Lyons Park
1521022245933 | 2 | 2 | Rodolphe | Lichfield | [email protected] | Male | 06 Barnett Hill

Could be related to the message key not being set?

@bluemonk3y

bluemonk3y commented Mar 14, 2018

Agreed with @rmoff. Perhaps try:

$ bin/kafka-console-producer --broker-list localhost:9092 --topic logon --property "parse.key=true" --property "key.separator=:"
42:{"logon_id":42,"CUSTOMER_ID":999,"LOGON_DATE":"2017-09-15 09:08:38"}
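For the customers.json file from this issue, the keyed input lines can be generated rather than typed by hand. The following is a minimal sketch, assuming each line of the file is a standalone JSON object with an `id` field; the function name `key_lines` is hypothetical and not part of any Kafka or KSQL tooling. It only builds the `key:value` lines that the console producer expects when `parse.key=true` and `key.separator=:` are set.

```python
import json


def key_lines(json_lines, key_field="id", separator=":"):
    """Prefix each JSON record with the value of key_field and a separator.

    Hypothetical helper: it produces 'key<separator>value' lines suitable for
    kafka-console-producer with parse.key=true and a matching key.separator.
    """
    keyed = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the input file
        record = json.loads(line)  # fails loudly on malformed JSON
        key = str(record[key_field])
        if separator in key:
            # The key is split off at the separator, so a key that contains
            # it would be truncated.
            raise ValueError(f"key {key!r} contains separator {separator!r}")
        keyed.append(f"{key}{separator}{line}")
    return keyed


# Shortened sample records (assumption: real records carry more fields).
sample = [
    '{"id":"1","first_name":"Mitchell"}',
    '{"id":"2","first_name":"Rodolphe"}',
]
for keyed_line in key_lines(sample):
    print(keyed_line)
```

The printed lines can then be piped into the console producer command above, with the topic changed to customer-raw.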

@dguy
Contributor

dguy commented Mar 14, 2018

I guess the question here is: if the message key is null and you want to create a table, should it use the key as described in the message? That is, you supply WITH(key='id') and then it isn't used, which is not intuitive.
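To make the behavior under discussion concrete, here is a toy model in plain Python. It is a sketch under stated assumptions, not KSQL's implementation: it assumes a table keeps the latest value per message key and skips records whose key is null, which is how Kafka Streams tables treat null-keyed records. The names `materialize_table` and `rekey` are hypothetical.

```python
def materialize_table(records):
    """Toy table: latest value per message key; null-keyed records are skipped."""
    table = {}
    for key, value in records:
        if key is None:
            continue  # nothing to upsert against, so the record is dropped
        table[key] = value
    return table


def rekey(records, field):
    """Derive the message key from a field of the value (hypothetical step)."""
    return [(value[field], value) for _, value in records]


# Records as written by the console producer without parse.key: the key is null.
unkeyed = [
    (None, {"id": "1", "first_name": "Mitchell"}),
    (None, {"id": "2", "first_name": "Rodolphe"}),
]

print(materialize_table(unkeyed))               # empty table
print(materialize_table(rekey(unkeyed, "id")))  # one row per id
```

In this model the table over the unkeyed records stays empty, which matches the query that prints no rows, while deriving the key from the `id` field yields both rows.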

@rmoff
Contributor

rmoff commented Mar 14, 2018

Related: #749

@vahidhashemian
Author

Thank you all for looking at this issue. I confirm that providing a key in the message works. However, as @dguy mentioned, this is not intuitive, and it would be nice if KSQL could handle null keys too in cases like this (perhaps KSQL could enforce the key internally).

@rmoff
Contributor

rmoff commented Jun 21, 2018

Closing; #749 is the issue to track, as I agree the current behavior is not ideal.

rmoff closed this as completed Jun 21, 2018