JSON stream/table with non-uppercase field names returns nulls #2551
Comments
I ran into this just now. I used AS to create a lower_case representation of the key, like below:

which gives null results like below:
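A statement of roughly this shape (the stream and column names here are hypothetical stand-ins, not the originals from the comment) reproduces the behavior:

```sql
-- Hypothetical repro: the quoted alias forces a lower-case field name in the sink.
CREATE STREAM sink_lower AS
  SELECT ROWKEY AS "lower_case_key", some_field
  FROM source_stream;

-- Reading the sink back then shows null for "lower_case_key".
SELECT * FROM sink_lower EMIT CHANGES;
```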
Thanks for reporting this @vcrfxia! I believe it should be fixed in master; I tested all the examples that you provided. Here are the more complicated of them:

ksql> PRINT test2 FROM BEGINNING;
Format:JSON
{"ROWTIME":1572282171712,"ROWKEY":"null","myStruct":{"f1":123,"f2":"hello"}}
{"ROWTIME":1572282334159,"ROWKEY":"null","MYSTRUCT":{"f1":123,"f2":"hello"}}
^CTopic printing ceased
ksql> CREATE STREAM source (myStruct STRUCT<"f1" INT, "f2" VARCHAR>) with (KAFKA_TOPIC='test2',VALUE_FORMAT='JSON');
Message
----------------
Stream created
----------------
ksql> SELECT * FROM SOURCE EMIT CHANGES;
+------------------------------+------------------------------+------------------------------+
|ROWTIME |ROWKEY |MYSTRUCT |
+------------------------------+------------------------------+------------------------------+
|1572282171712 |null |{f1=123, f2=hello} |
|1572282334159 |null |{f1=123, f2=hello} |
^CQuery terminated
ksql> SELECT myStruct->"f1", myStruct->"f2" FROM source EMIT CHANGES;
+----------------------------------------------+----------------------------------------------+
|MYSTRUCT__f1 |MYSTRUCT__f2 |
+----------------------------------------------+----------------------------------------------+
|123 |hello |
|123 |hello |
^CQuery terminated
ksql> CREATE STREAM sink1 AS SELECT myStruct->"f1", myStruct->"f2" FROM source;
Message
--------------------------------------------------------------------------------
Stream SINK1 created and running. Created by query with query ID: CSAS_SINK1_5
--------------------------------------------------------------------------------
ksql> SELECT * FROM sink1 EMIT CHANGES;
+----------------------+----------------------+----------------------+----------------------+
|ROWTIME |ROWKEY |MYSTRUCT__f1 |MYSTRUCT__f2 |
+----------------------+----------------------+----------------------+----------------------+
|1572282171712 |null |123 |hello |
|1572282334159 |null |123 |hello |
^CQuery terminated

Feel free to reopen if this still happens in production. There are still some issues in JSON if you specify multiple case-sensitive fields that map to the same case-insensitive value (e.g. "f1" and "F1").
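A sketch of that remaining ambiguity (hypothetical stream and topic names): two quoted fields that differ only in case map to the same case-insensitive name, so the JSON deserializer cannot tell them apart:

```sql
-- Hypothetical: "f1" and "F1" collide once field names are compared case-insensitively.
CREATE STREAM collide ("f1" INT, "F1" INT)
  WITH (KAFKA_TOPIC='collide_topic', VALUE_FORMAT='JSON');
```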
Because KSQL's JSON deserializer currently only works with schemas whose field names are all uppercase, if a user defines a JSON stream/table with non-uppercase field names in its schema, KSQL will be unable to read data in those fields and will treat them as null.
This leads to a variety of weird behaviors. For example, given this JSON topic with data:
this works as expected (with auto.offset.reset=earliest):

but this doesn't:
In particular, those last two values shouldn't be null.
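The failing pattern can be sketched like this (mirroring the struct example elsewhere in this thread): quoted, case-sensitive lower-case field names in the schema, which the deserializer then fails to match:

```sql
-- Quoted (case-sensitive) lower-case field names inside the struct:
CREATE STREAM source (myStruct STRUCT<"f1" INT, "f2" VARCHAR>)
  WITH (KAFKA_TOPIC='test2', VALUE_FORMAT='JSON');

-- On affected versions these columns print as null instead of the actual values:
SELECT myStruct->"f1", myStruct->"f2" FROM source EMIT CHANGES;
```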
As another example, this limitation of the KSQL JSON deserializer can also lead to situations where a SELECT <query> prints data fine, but a CREATE STREAM ... AS SELECT <query> followed by SELECT * will print nulls.

Suppose I've created a JSON topic and put some data in it:

I can define a stream and verify the data is there (with auto.offset.reset=earliest):

Then, the following SELECT query prints results as expected:

but not as CREATE STREAM ... AS SELECT followed by SELECT *:

The reason for this is that although the CREATE STREAM sink1 AS SELECT ... correctly creates and populates the new stream sink1, KSQL is unable to read data back out of sink1, since sink1 has non-uppercase field names:

Note that there is a simple workaround (until these JSON deserializer issues are resolved): simply use uppercase field names, which happens by default.
For the second example the workaround looks like:
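A sketch of that workaround (topic name assumed from the struct example in this thread): leave the field names unquoted so KSQL upper-cases them by default, which matches what the JSON deserializer expects:

```sql
-- Unquoted names are upper-cased automatically, so the deserializer can find them
-- (assuming the case-insensitive match against the JSON data described above).
CREATE STREAM source_upper (myStruct STRUCT<F1 INT, F2 VARCHAR>)
  WITH (KAFKA_TOPIC='test2', VALUE_FORMAT='JSON');

SELECT myStruct->F1, myStruct->F2 FROM source_upper EMIT CHANGES;
```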