0.28.2 (2022-08-02)
- Adding ksqlDB Query Status metric. (#9283) (5ad9cf8)
- Support pausing/resuming persistent queries (#9203) (9f0f74a)
- enable new emit-final implementation (#9141) (2af201f)
- Added numerous trigonometric UDFs (#9213) (cd85a43)
- donot reset version for release branch (227563b)
- make api client recognize ddl warnings better (#9341) (c565ac5)
- Map invalid casts to null. (#9336) (36608cf)
- allow YATT to insert into and check contents of DDL sources (#9321) (76e75b8)
- classify KsqlSerializationException as USER error based on topic(KSE-1045) (56dddbb)
- Create a KsqlSerializationException class (e986f66)
- DESCRIBE FUNCTION failing for annotated UDAFs with initial args (#9277) (cc69c20)
- remove regex from migration tool (#9254) (5089065)
- Return proper status code for QPS ratelimit. (dcfe794)
- use JsonSchemaConverter to support JSON anyOf types (#9130) (798c12d)
- add getAuthToken method to AuthenticationPlugin interface (#9239) (b6fc8d9)
- change auth token provider to accept token strings instead of principals (#9255) (b45841a)
- Excludes Guava from Guava-retrying in order to manage Guava dependencies more reliably. (#9260) (c901ac4)
- Removing reverted configuration org.apache.kafka.streams.Streams… (#9248) (bde8f40)
- allow trig fxn differences between Java versions in tests (#9226) (0cae003)
- change consumer_group_member_id tag to just member to match Druid name (#9225) (904d995)
- respect authentication.skip.paths properly (#9224) (4c33edd)
- remove topic tag conversion hack and update tests (#9219) (adb7a67)
0.27.1 (2022-07-21)
- add assert methods to java client (#9099) (e7109f6)
- add ASSERT SCHEMA statement (#9091) (4b450ec)
- add ASSERT TOPIC command (#9066) (cd5254f)
- add metric for query restarts (#9045) (b016139)
- add metric that's emitted when processing log emits an error (#9035) (3f0a0ad)
- add ProtoBuf as a content type for pull queries over /query-stream endpoint (#9103) (e64e284)
- add support for assert statements to migration tool (#9107) (81a0b47)
- assert not exists topic (#9086) (4b57b55)
- automatically build confluent cloud image on every master merge (#9096) (07161e1)
- clean up processing log metric (#9105) (926e440)
- close websocket connections with expired tokens (#9147) (eeb8324)
- cull the list of API consumable/editable properties (#9134) (8b808ca)
- enable max/min udaf for string & bytes data types (030f214)
- introduce ATTR aggregation function (#9168) (2e1c457)
- migrate java client to use application/vnd.ksql.v1+json format (1bb24c3)
- Support all wildcard (*) on struct reference syntax (#9120) (ff9b28b)
- Support BYTES and time types for TopKDistinct, Greatest, and Least (#9202) (6824f23), closes #9125 #9125
- support checking preconditions before starting core app (#9026) (33a6a04)
- allow aggregations without group bys (#8986) (3721a1e)
- allow STREAMS with no key (#8949) (ffb1b38)
- Add PROTOBUF_NOSR format and fix multi-format rtests (#9078) (53afd8a)
- convert topic tag name and add consumer group member id tag to ThroughputTotalMetrics (#9215) (67b4e17)
- add BYTES support for KAFKA format (#9180) (844d275)
- Allows results from CAST to compared. (#9186) (3defb6b)
- ambiguous reference to close issue (#9167) (2e7d0db)
- change group name and extend CumulativeSum in ThroughputMetricsReporter (#9211) (6bcd47f)
- classify KsqlFunctionException as USER error (f2877e8)
- classify SR missing subject and access rights query errors as USER errors (#9072) (90a609d)
- fix the interval check in RocksDBMetricCollectorTest (#9210) (6fee500)
- Fixes a few null handling bugs (#9127) (bd90347)
- move misplaced query-level configs to the correct list (#9144) (86d4fbb)
- re fetch streams for each materializationProviderBuilder (#9036) (4340dfa)
- reset collector before reconfiguring (#9205) (a641a0b)
- revert default
/query-stream
Content-Type toapplication/vnd.ksqlapi.delimited.v1
fromapplication/vnd.ksql.v1+protobuf
(#9145) (3e3322e) - throw KsqlFunctionException while aggregating in sum udaf #9052 (9e9d10e)
- use the engine's KsqlConfig to build queries (#9040) (503e4cd)
- fail validation on CREATE CONNECTOR if connector already exists (#9014) (94a74fa)
- Improved/fixed aggregate function error messages. (#8977) (57f3596), closes #8976
- include header columns when injecting schemas (#9023) (4b5feb0)
- move create connector validation to validate phase (#8999) (73e5463)
- register state listener after restarting runtime (#9032) (1f7a09a)
- remove ErrorEntity and throw on connector error instead (#8998) (6e55521)
- Repartition RHS of a FK join if it uses SR schema (#8926) (09c7bbe)
- Revert "chore: upgrade vertx to 4.2 (#8975)" (#9003) (d4214de)
- shared runtimes calculate cache size for validation properly (#8923) (979d4a5)
- wait longer while waiting for expected spq (#8973) (3ce54f1)
0.26.0 (2022-04-28)
- Add support for Stream-Stream and Table-Table right join (#8845) (c8975d3)
- Support MIN/MAX udafs for Time/TS/Date types (#8924) (6399d30)
- INSERT fails when serializing Proto/Avro nested Structs (#9038) (cd6cbf2)
- INSERT/VALUES on a stream with SCHEMA_ID/SCHEMA_FULL_NAME fails (#9047) (bfd910b)
- update QTT tests for Proto multiple schema definitions (8e077cb)
- Create stream fails when multiple Protobuf schema definitions exist (#8933) (87c0ce1)
- Guard null struct dereferncing inside function calls (#8918) (51753aa)
- INSERT VALUES fail when SR schema has a non-default name (#8984) (75da9fd)
- remove double quotes from json_records function (#9028) (86544ca)
- only make shared runtime that its chosen (#8932) (eb215c8)
- docs: functional tests min version for lower bound (#8922) (c33ee7b)
- back out post-3.1 changes to fix 7.1.x build (#8599) (f42225d)
0.25.1 (2022-04-07)
- add java client support for push query v2 ALOS through continue method (#8785) (11ede7c)
- add cleanup service metrics (#8779) (1cb7846)
- add rate-limiting to ksql command topic (#8809) (b2f1540)
- allow users to specify custom migrations dir location (#8844) (cbe447e)
- cleanup transient query resources (#8694) (24b2a7a)
- create command topic with command topic configs (#8742) (ea5d6d7)
- Extend Udaf interface to allow for polymorphic UDAFs. (#8871) (2dae2a1)
- Generalize the UDAFs collect_list and collect_set (#8877) (1416ecd)
- Generalize the UDAFs earliest_by_offset and latest_by_offset (#8878) (adac458)
- include checking the the config during validation of SET (#8718) (d8ff588)
- support custom request headers from java client and migrations tool (#8787) (ffe57f5)
- preserve old schema behavior for protobuf wrapped primitives (#8934) (36485e2)
- Add null handling to functions (#8726) (6117604)
- Adds error handling for nested functions (#8850) (2a60269)
- Apply the ExtensionSecurityManager to UDAFs (#8776) (a37688c)
- bug preventing decimals in (-1, 1) from being inserted / queried (#8720) (de8284a)
- CLI should return non-zero error code on failure (#8892) (238f4fd)
- coerce property values to correct type (#8765) (7f7a076)
- do not include schema id in session config (#8869) (5ddb852)
- Ensures response end handler is invoked just once (#8849) (a2efcc5)
- Fix bugs in sample standard deviation UDAF (#8728) (b2f993b)
- Gives a query completed message for stream pull queries (#8612) (a44a16b)
- ksql: add ifExists/ifNotExist parameters to java client connector functions (#8851) (eaf2b1f)
- ksql: allow migrations tool to run connector commands with IF [NOT] EXISTS clauses (#8855) (a7c8689)
- make writes to the backup file atomic (#8566) (b113e9e)
- pass DCN_NULLPOINTER_EXCEPTION spotbugs error (#8775) (afd7fb1)
- prevent hanging stream pull queries on truncated topics (#8740) (d66107c)
- re-order cache usage error (#8909) (336c690)
- register schema within sandbox (#8614) (ba572e0), closes #1394
- reinstate the old KsqlRestClient.create overload (#8761) (ee4a1bc)
- resolve schema registry issue for pull query (#8876) (6a1c2ae)
- restore process fails due to DROP constraints (#8803) (db070a2)
- Set the sslFactory properly for the SR REST client. (#8830) (e69a545)
- update rate limiting test so it's not flaky (#8872) (466f8fe)
- update restore command topic tool to work with command topic configs (#8802) (371b200)
0.24.0 (2022-02-11)
- new CLI params to provide credentials for cloud connector management (#8684) (5e5af50)
- [UIF-1113] Add weighting to search results based on domain (#8565) (f505f31)
- Add int/bigint/double conversion functions from bytes (#8426) (da77c8a)
- Add is_json_string UDF (#8600) (c12a745)
- Add json_array_length UDF (#8602) (f7c1334)
- add json_keys UDF (#8603) (bb78fcd)
- Add json_records, to_json_string, and json_concat UDFs (#8632) (3b6d828)
- add max task usage and num stateful tasks metrics (#8549) (ba27e3a)
- add support for connect specific https configs (#8553) (37507ab)
- Adds support to /query-stream for http1.1 and StreamedRow json format (#8449) (dd61db3)
- allow custom connect auth header configuration (#8635) (aebec54)
- Allow to plug-in custom error handling for Connect server errors (#8480) (c4b4d67)
- Change API to return Position from Streams (#8590) (b2b6a73)
- Continuation tokens and ALOS for SPQs (#8342) (16c2820)
- custom auth configs for ksql connector requests (#8476) (7a9a61a)
- don't book keep sources in runtime as well (#8532) (110acc7)
- support custom headers for connector requests (#8357) (97fa117)
- un-synchronize PullQueryQueue row queuing when limit absent (#8563) (c7793a8)
- Use IQv2 when executing range and table scan pull queries (#8556) (6eda7a4)
- Validate connector config before creating it (#8445) (62c8021)
- pull query LIMIT clause (#8333) (5c9be20)
- expose Kafka headers to ksqlDB (#8350,#8366,#8416,#8417,#8475,#8496,#8516) (db76b3e,12dbdad,e030f10,0239a95,065de82,bd452aa,b18fb09)
- Allow schema id in source creation (#8441,#8421,#8411,#8401,#8185,#8572,a60bb21,b490204,8a11ac2,8aa55b1,7adc5cb)
- fix create connector config handling for name config (d51c6db)
- 404 for /topics connect endpoint logs a warn instead of showing up in CLI (#8462) (d722743)
- add GRACE and PERIOD to the nonReserved and automate testing (#8596) (b975086)
- Allow disabling Scalable Push Query ALOS to be set via config (#8542) (7ac36e0)
- Consider topics created by join operations internal (#8520) (ebe0c49)
- disallow negative integers in pull query limit clause (#8491) (3b5d91b)
- disallow persistent query limit clause (#8506) (60e9efa)
- explicitly set confluent log4j version to 1.2.17-cp5 (#8497) (b2470f3)
- Fix DROP STREAM IF EXISTS DELETE TOPIC (#8559) (ab09de6)
- Fix some typos (#8518) (795629f)
- Fixes an NPE which was causing terminate to fail (#8593) (796267a)
- Fixes the AllHostsLocatorTest (#8664) (802f130)
- invoke per-query listeners upon state change in shared runtime (#8509) (998cc15)
- make adding to pull query queue threadsafe (#8544) (6454ef1)
- make HEADER and HEADERS non-reserved (#8490) (c8c4438)
- Make JaasPrincipal public (#8567) (8da7fe5)
- make PullQueryConsistencyFunctionalTest more reliable (#8530) (482880a)
- make PullQueryLimitHARoutingTest more stable (#8448) (85e5d29)
- make tutorial work (#8625) (52b6d67)
- maybe fix flaky test (#8457) (acd80ef)
- Prevents internal http2 requests from having shared connections closed (#8507) (5b0be58)
- pull queries are supported on source streams now. (#8597) (9e6ec15)
- refactoring properties replacement before restarting runtime (#8510) (be89c60)
- remove unnecessary import (#8451) (33ef63f)
- remove usages of internal Streams API/variable that was removed upstream (#8660) (9615a02)
- SandboxKafkaTopicClient should use default replication factor if applicable (#8551) (5c8c186)
- Streams overrides bugfix (#8514) (366af1f)
- typo in TimestampExtractionPolicyFactory (#8539) (ce1e9b2)
- variable substitution with CREATE CONNECTOR in migrations tool (#8547) (baa9422)
- allow valid @ character on JSON extract functions (#8422) (290ff98)
- fix broken test (#8439) (6b15250)
- Fixes flaky ScalablePushBandwidthThrottleIntegrationTest caused by timeout (#8410) (fbbc076)
- make to_bytes work with lowercase hex (#8423) (fee904e)
- tolerate out-of-order execution on HARouting tests (#8415) (c4cc6d5)
- fix warning (#8397) (66c96f4)
- synchronizing writes to localcommand file (#8406) (ed50b05)
- Fixes a timestamp bug with Scalable Push Queries (#8377) (507bdb5)
- return error message if table functions are used inside a CASE (#8327) (f0b9e96)
- upgrade Netty for 5.5.x (#8389) (e95ff38)
- Makes PullQueryRoutingFunctionalTest more reliable and adds back test (#8351) (152eea8)
- Makes tests wait only 500ms for consumer group cleanup (#8367) (1a2f7b2)
- Pull Query Key Extraction Optimizations (#8346) (6ad333e)
- update client to fix build (#8369) (9ad563d)
- port changes from #7863 to 5.4.x (#8409) (25c9ca5)
- When creating connectors through ksqlDB, ksqlDB no longer automatically sets the
key.converter
config to be theStringConverter
, because ksqlDB has supported key types other than strings for many releases now.
- Revert "Bump Confluent to 7.2.0-0, Kafka to 7.2.0-0" (b6e795f)
0.23.1 (2021-12-14)
- [UIF-1010] Add swiftype metatag for site of ksqldb (#8308) (916f38b)
- Add consistency vector handling to CLI and Java client (#8264) (a651677)
- add detailed scalable push query metrics with type breakdown (#8178) (561af53)
- Add metrics for stream pull queries to WS (#8174) (d75f254)
- Add support for TIMESTAMP type in the WITH/TIMESTAMP property (#8271) (ecb43e2)
- enable ROWPARTITION and ROWOFFSET pseudo columns (KLIP-50) (#8245) (7bdc41d)
- Re-enable GRACE period with new stream-stream joins semantics (#8236) (f640f5e), closes #8020 #8027 #8028 #8047
- Add user-friendly error message for SELECT null AS column (#8276) (57b7dea)
- allow insertion of null values in migrations tool (#8297) (f42d40a)
- allow more time for consumer group cleanup (#8189) (d2ff1ac)
- Allows remote pull queries to be cancelled (#8252) (4efa445)
- ClassCastException when dropping sources with 2+ insert queries (#8205) (a7c6ebe)
- close command runner when command topic is deleted (#8208) (294171c)
- doesnt print error (#8232) (901b968)
- dont throw error on processing local commands, just log (#8310) (8bc57a2)
- During ksql startup, avoid recreating deleted command topic when a valid backup exists #7753 (#8257) (f3f1d5c)
- fix_application_id_to_work_with_acls (#8277) (64f58e8)
- get rows returned metric working (#8230) (da4f71e)
- if not exists return type (#8322) (9da204c)
- make parse_date able to parse partial dates (#8330) (6a82026)
- Print only failed line on parsing exception (#8282) (701db5f)
- Pull query table scans support LIKE and BETWEEN operators (#8299) (bc3ea64)
- Refactor ConnectFormatSchemaTranslator to take translator object instead of lamda function (#8177) (6d3b351)
- Specify source type in DROP referential integrity errors (#8253) (03bd713)
- update error msg (#8221) (8c30780)
0.22.0 (2021-11-03)
- add configurations around endpoint logging (#8249) (21f4e03)
- add consumerGroupId to QueryDescription (#8073) (cce585b)
- add CumulativeSum total bytes metric (#7987) (aa213bd)
- add SOURCE streams/tables #8085,#8063,#8061,#8004,#7945,#8022,#8043,#8009) (5416cde,88c6192,e2c3211,0d0e85a,70565f2,a322310,381b7bf,2c22b8f)
- add methods and classes to execute low level HTTP requests in the ksqldb-api-client (#8118) (aae5f95), closes #8042
- add pull queries on streams (#8126,#8168,#8124,#8143,#8115,#8064,#8045) (e651891,c09c362,a722184,57626f3,a14f957,4831e6b,886da99)
- add persistent query saturation metric (#7955) (eed625b)
- add shared runtime config to QueryPlan (#8074) (351f695)
- optimize key-range queries in pull queries (#7993) (22a79bc)
- scalable push query bandwidth throttling (#8087) (d5af6a1)
- update storage utilization metrics to start when app is initialized (#8095) (8add4ac)
- add storage utilization metrics (#7868) (22a8741)
- Allow scalable push queries to handle rebalances (#7988) (b3dbed3)
- fail scalable push query if error is detected in subscribed persistent query (#7996) (eed501c)
- perform SchemaRegistry permissions on C*AS sink subjects (#8039) (6878233)
- shared runtimes (#7721) (44e8129)
- terminate transient queries by id (#7947) (1ee8487)
- updated minor syntax updates in docs (#7949) (248dc91)
- add logcal cluster id to observability metrics (#8141) (d76a4fc)
- add new types to udaf functions (#8081) (a3ea6a4)
- add time types to Java client (#8091) (fd9faf2)
- Allow multiple EXTRACTJSONFIELD invocations on different paths (#8122) (#8123) (7f1d407)
- Always dec. concurrency counter, even hitting rate limits (#8165) (0e05519)
- change metrics tag (#8268) (ff277ba)
- CREATE OR REPLACE TABLE on an existing query fails while initializing the kafka streams (#8130) (f03755e)
- Enables ALPN for internal requests when http2 and tls are in use (#8094) (7faf77c)
- Ensure that we always close /query writer (#8164) (f7c2002)
- Ensures background timer completes for scalable push queries (#8132) (1a1297a)
- fix spq bandwidth throttling on http2 (#8119) (d5d413d)
- Fixes errors and increased latency for pull queries from closing connection (#8248) (d98f50a)
- issue 7948. Allow insert into table using Java API (#8114) (a6c2cac)
- Removes config check to insert SPQ Processor (#8062) (a0be1d5)
- skip adding invalid if not exists to cmd topic (#8206) (e164b18)
- stream pull query internal overrides shouldn't clash with query configs (#8166) (0c57258)
- use a looser check on the error msg for overflow (#8098) (0088fb1)
- date parsing functions: case-insensitive name parsing (#8015) (73bdb18)
- Fixes pull query latency distribution metrics (#7992) (c798cd7)
- make KsqlAvroSerializerTest work with Java 16 (#7873) (bab8874)
0.21.0 (2021-09-15)
- add
ARRAY_CONCAT
UDF (#7761) (1de9ef8) - add BYTES type to ksqlDB (#7778,#7804,#7791,#7823) (06657ba,02352f2,2ae4cae,df5964e)
- add interface for metrics reporter (#7788) (0ee06d2)
- add observability metric skeleton (#7769) (3362c00)
- add TO_BYTES/FROM_BYTES functions for bytes/string conversions (#7831) (cea0989)
- allow expressions on left table columns in FK-joins (#7904) (a9668de)
- allow Java clients to set HTTP/2 multiplex limit (#7871) (790d6fe)
- don't start queries when corruption is detected during startup (#7821) (4c0c181)
- enable BYTES for LPAD and RPAD (#7909) (f0c23b1)
- enables BYTES for CONCAT and CONCAT_WS (#7876) (f631c2b)
- make SchemaRegistry permission validations on KSQL requests (#7773) (ad01b72)
- update len function to accept BYTES (#7865) (eaaa0db)
- Update SUBSTRING function to accept BYTES types (#7861) (fccc56d)
0.20.0 (2021-07-26)
- add
LEAST
andGREATEST
UDFs (#7683) (0d84733) - add DATEADD and DATESUB functions (#7744) (c63e924)
- add FROM_DAYS and update UNIX_DATE function (#7742) (3c68710)
- add PARSE_DATE and FORMAT_DATE functions (#7733) (5a64ed7)
- add PARSE_TIME and FORMAT_TIME functions (#7722) (9a381a8)
- add TIMEADD and TIMESUB functions (#7727) (75806a0)
- pull query bandwidth based throttling (#7738) (8f01ad9)
- add the DATE and TIME sql types (#7641,#7664,#7718,#7734,#7708,#7740,#7674,#7700)(661f198,7537d87,a94f1f2,78b9ae8,18cc030,79d14fb,7718955,4175ad5)
- block out of order migrations in migrations tool (#7693) (1d617d3) (#7678) (9bf9abf)
- enable schema inference for timestamp/time/date (#7737) (35b1cad)
- enable time unit functions for interpreter (#7709) (a26a297)
- fixed nondeterminism in UdfIndex (#7719) (cd1a988)
- make current java clients compatible with pre-0.15 servers (#7667) (8f2d799)
- remove time/date component when casting timestamp to date/time (#7724) (87cd3c7)
- return an error message on http2 /query-stream endpoint (#7750) (3a4348b)
- Fixes race condition exposing uninitialized query (#7627) (98b1e3c)
- Existing queries that relied on vague implicit casting will not be started after an upgrade, and new queries that rely on vague implicit casting will be rejected. For example, foo(INT, INT) will not be able to resolve against two underlying function signatures of foo(BIGINT, BIGINT) and foo(DOUBLE, DOUBLE). Calling a function whose only parameter is variadic with an explicit null will also result in the call being rejected as vague.
It's worth noting that queries which relied on this vague implicit casting were never truly supported, as they would have been nondeterministic. UDFs most likely to be adversely effected are ones which have multiple numerical overloads with repeated logic. For example, if you had two UDFs foo(INT, INT) and foo(BIGINT, BIGINT), both which relied on logic which returns null if both input parameters are null, then the prior behavior would have been to nondeterministically route to one of the two functions, at which null would be returned either way (due to the duplicate logic). This change will break existing queries that relied on this nondeterministic routing.
0.19.0 (2021-06-08)
- add idle timeout (#7556) (db35b98), closes #6970
- added NULLIF function (#6567) (#6685) (d7c9e43)
- Adds Scalable Push Query physical operators (#7430) (100767d)
- Adds Scalable Push Query Routing (#7587) (278a261)
- Adds ScalablePushRegistry and peeking ability in a persistent query (#7424) (89c1588)
- Allow ksqlDB to detect FK-join table-table join condition (#7452) (344d36d)
- build physical plan for FK table-table joins (#7517) (744fa36)
- made class to compute essential meta-data (#7434) (e53aa13)
- ungate support for foreign key joins (#7591) (061fb4a)
- use Connect default precision for Avro decimals if unspecified (MINOR) (#7615) (1abdb0d)
- allow for KSQL_GC_LOG_OPTS env variable to be passed through (#7577) (aa2ed0a)
- better error message on illegal arithmetic with NULL values (#7554) (867a587)
- change the isError to not use the state (#7483) (41ec430)
- DROP stream for persistent query doesn't always drop underlying query (#7601) (b751cad)
- extended query anonymizer tests with functional tests queries (#7480) (9a67191)
- fix shouldNotBeAbleToUseWssIfClientDoesNotTrustServerCert test (#7614) (39ef7bb)
- fix the 5.2.x build to be compatible with newer jetty/jackson versions (#7575) (103eeec), closes #5725
- Fixes org.mock-server to be 5.11.2 to utilize newer netty (#7458) (0df7dfc)
- multi-column keys are broken in some scenarios when rearranged (#7477) (453ca8b)
- qualified select * now works in n-way joins with repartitions and multiple layers of nesting (#7585) (0879cb0)
- reject mismatched decimals from avro topics (#7544) (85ba0f1)
- Update Maven wrapper and document its use to fix version resolution (#7620) (535fc09), closes confluentinc/maven#1
- Use Java's Base64 instead of Jersey's. (#7534) (0ea0ae3)
- Fix: ksqlDB engine does not infer the Struct type correctly from protobuf schema (#7642)
0.18.0 (2021-05-26)
- implemented working query anonymizer (#7357) (fa0445f)
- add 'show connector plugins' syntax (#7284) (be50d2d)
- Detailed pull query metrics broken down by type and source (#7272) (9c173a6)
- emit an error reason before closing websocket (#7390) (c2d9372)
- Materialize Table-Table join results (#7246) (4ae1b31)
- include task metadata information from remote hosts in query descriptions (#7331) (c0e1e73)
- print stats/errors breakdown by host in cli (#7296) (20a4ea5)
- include aggregated metrics in source descriptions (#7252) (0b30ed9), (#7235) (924fc5b)
- add --define flag to migrations tool apply command (#7401) (165e972)
- support DEFINE and UNDEFINE statements in migrations tool (#7366) (330db93)
- enable variable substitution for /query-stream and /ksql endpoints (#7271) (f6dd212)
- enable variable substitution for java client (#7335) (c82a072)
- Add connector functions to java client (#7222) (7766195)
- Add line breaks to error message (#7324) (1695f39), closes #7205
- Append state.dir directive to ksql-server.properties (#7003) (4893e90)
- Bubble up errors from HARouting unless using StandbyFallbackException (#7238) (ec12516)
- fix NPE when backing a record that has null key/values (#7268) (0cbd4e8)
- preserve the rest of a struct when one field has a processing error (#7373) (6d708db)
- stop long-running queries from blocking the main event loop (#7420) (242fefb), closes #7358
- stop worker-poll tasks from blocking main loop (#7427) (0b0bf65), closes #7358
- allow java client to accept statements with more than one semicolon (#7243) (4086acb)
- fix NPE when closing transient queries (#7530) (bc64edd)
0.17.0 (2021-04-26)
- adds support for lambda functions (#6955) (1b39ab5), (#6868) (dd3f365), (#7075) (d6529a3), (#7148) (c8f745e), (#6966) (d09c99e), (#7056) (1a042cd), (#6994) (563ff9b)
- Adds ability to bypass cache for pull queries (#6891) (4b3bc96)
- Allows pull queries with generic WHERE clauses (#6939) (c3fe8a1)
- Adds an expression interpreter to improve pull query performance (#7006) (5d2cd83)
- implements migrations tool and corresponding commands (#6988) (8bdb09a), (#7161) (2c614cd), (#7190) (4151bae), (#7137) (608cb5e), (#7099) (a75c355), (#7153) (2e546b6), (#7133) (c91e23c), (#7145) (956e799), (#7087) (d4cf400)
- add maven wrapper (#7307) (d0ab425)
- CLI should fail for unsupported server version (#7097) (a0745b9)
- Add timestamp arithmetic functionality (#6901) (e2c06dc)
- added script to export antlr tokens to use in frontend editor (#7118) (7e1c180)
- makes tables queryable through efficient and flexible table scans (#7155) (71becea), (#7188) (55f5403), (#7085) (65d0df1)
- limit the number of active push queries everywhere using "ksql.max.push.queries" config (#7109) (906f4c5)
- classify authorization exception as user error (#7061) (a74b77c)
- check for null inputs for various timestamp functions (#7180) (42496c1)
- Ensures BaseSubscriber.makeRequest is called on context in PollableSubscriber (#7212) (da67bd9)
- fix the cache max bytes buffering check (#7181) (f383800)
- get all available restore commands even on poll timeout (#6985) (28a7ba9)
- ksql.service.id should not be usable as a query parameter (#7192) (cc5cd81)
- Make pull query metrics apply only to pull and not also push (#6944) (1db18b3)
- prevent IOB when printing topics with a key/value with an empty string (#7162) (177d0db)
- Pull Queries: Avoids KsqlConfig copy with overrides since this is very inefficient (#7193) (b36a3ce)
- update default kafka log4j appender configs for sync sends (#7078) (8bc16b3)
0.16.0 (Not released publicly, build hiccups)
- client: add getInfo method to java client (#7030) (b09f003)
- add describe option to list streams/tables (#6827) (26e3dea)
- add functions to support use of timestamp data type (#6852) (2cee618)
- add standard deviation udf (#6845) (f2fdbb3)
- display precision and scale when describing DECIMAL columns (#6872) (fb89998)
- enable decimal for Protobuf (#6884) (6b5877c)
- Make pull queries streamed asynchronously (#6813) (b69e3f8)
- make UDAFs configurable and remove limit on COLLECT_LIST/SET (#6851) (63ae169)
- Rewrites pull query WHERE clause to be in DNF and allow more expressions (#6874) (b8e0c99)
- Support timestamp protobuf serde (#6927) (5ea1ce4)
- timestamp support - casting, comparisons and serde (#6806) (a27df46)
- client/server SSL settings fail when 'ssl.key.password' is set (#6763) (3e48540)
- do not log data on JSON deserialization errors (MINOR) (#6930) (1834da6)
- ff-4240-upgrade httpclient version (#6935) (19f4d63)
- fix how the buffer limit check evaluates streams config (#6876) (ad1cc2a)
- format cast arguments with passed context (6.0.x) (#7032) (8c0d93d)
- make formattimestamp and parsetimestamp default to utc (#6954) (a5ea98a)
- Make Select * avoid code gen for projections (#6846) (0896b85)
- npe when getting topic configs (#6946) (5e026d4)
- remove mutable subscriber field on an endpoint (#6905) (98c7d73)
- use the right name for fallback subject for transient queries (#6821) (d044d64), closes #6817
0.15.0 (2021-01-20)
- expose support for array and struct keys (#6722) (c7fc2b0)
- support PARTITION BY on multiple expressions (#6803) (5a6b48e)
- ungate support for multi-column GROUP BY (#6786) (9900623)
- expose AVRO and JSON_SR as key formats (#6694) (07dc0c7)
- support PROTOBUF keys (#6692) (821faac)
- add partitions to PRINT TOPIC output (#6641) (1f4eff8)
- Adds logging for every request to ksqlDB (#6615) (57b0c91)
- optional
KAFKA_TOPIC
(862c59e) - support table joins on key format mismatch (#6708) (989e52b)
- cli to show tombstones in transient query output (#6462) (ef3039a)
- new syntax to interact with session variables (define/undefine/show variables) (#6474) (df98ef4)
- terminate persistent query on DROP command (#6143) (b5ac1bd)
- update ksql restore command to skip incompatible commands if flag set (#6524) (4d0c997)
- catch stack overflow error when parsing/preparing statements (#6727) (37371cc)
- change locate() error message for a more user-friendly message (#6709) (e6ba436)
- CREATE IF NOT EXISTS does not work at all (#6073) (6edf7ec)
- don't create threads per request (#6665) (132d50d)
- fix error categorization on NPE from streams (#6655) (db6ad5b)
- Fixes bug in latests-by-offset when using nulls and sessions windows (#6699) (8ff52ca)
- include 'ksql.streams.topic.*' prefix properties on LIST PROPERTIES output (#6753) (8071af2)
- LDAP Authentication (#6800) (1db8b5b)
- Makes response codes rate limited as well as prints a message when it is hit (#6701) (bdec3dd)
- Removes orphaned topics from transient queries (#6714) (06d6e3e)
- throw error message on create source with no value columns (#6680) (14465a2)
- allow reserved keywords on variables names (#6572) (2da360a)
- Bypass window store cache when doing windowed pull queries (#6548) (8f84e41)
- cannot reference variables in DEFINE statement (#6573) (ee31fde)
- Check for index before removing value in undo of COLLECT_LIST (#6603) (2d92144)
- propagate null-valued records in repartition (#6647) (d3007f2)
- (minor) don't use deprecated jersey calls (#6732) (a44b3e9)
- use Java's Base64 instead of jersey's (#6702) (f9fb523)
- Queries with GROUP BY clauses that contain multiple grouping expressions now result in multiple key columns, one for each grouping expression, rather than a single key column that is the string-concatenation of the grouping expressions. Note that this new behavior (and breaking change) apply only to new queries; existing queries will continue to run uninterrupted with the previous behavior, even across ksqlDB server upgrades.
- stream-table key-to-key joins on mismatched formats will now repartition the table (right hand side) instead of the stream. Old enqueued commands will not be affected, so this change should remain invisible to the end-user.
0.14.0 (2020-10-28)
- Add support for IN clause to pull queries (#6409) (d5fc365)
- Add support for ALTER STREAM|TABLE (#6400) (a58e041)
- support variable substitution in SQL statements (#6504) (e185c1f)
- enable support for
JSON
key format (#6411) (fe97cde) - support for
DELIMITED
key format (#6344) (04af65c)
NONE
format for key-less streams (#6349) (25bb352)- add aggregated rocksdb metrics (#6354) (ecc6625)
- Add an endpoint for returning the query limit configuration (#6353) (84d202d)
- add commandRunnerCheck to healthcheck detail (#6346) (5f64d05)
- Add metrics for pull query request/response size in bytes (#6148) (946d2d3)
- avoid spurious tombstones in table output (#6405) (4c7c9b5)
- new CLI parameter to execute a command and quit (without CLI interaction) (#6267) (0d60246)
- support Comparisons on complex types (#6149) (0695213)
- new command to restore ksqlDB command topic backups (#6361) (036df20)
- new CLI parameter (-f,--file) to execute commands from a file (6440) (0e03a38 )
- new syntax to interact with session variables (define/undefine/show variables) (6474) (df98ef4)
- #6319 default port for CLI: set default port if not mentionned (#6410) (d46b147)
- add back configs for setting TLS protocols and cipher suites (#6558) (cf7de69)
- avoid RUN SCRIPT to override CLI session variables/properties (#6551) (8603ec8)
- backup files are re-created on every restart (#6348) (28b8486)
- check for nested UnspportedVersionException during auth op check (#6467) (369c3f1)
- Internal Server Error for /healthcheck endpoint in RBAC-enabled (#6482) (ebee5ec)
- JSON format to set correct scale of decimals (#6295) (57b7b2e)
- Properly clean up state when executing a command fails (#6437) (242a32e)
- recovery hangs when using TERMINATE ALL (#6397) (7a57b3c)
- support joins on key formats with different default serde features (#6550) (61e4073)
- support unwrapped struct value inference (#6446) (ed3ca5e)
- NPE in PARTITON BY on null value (#6508) (dbc7867)
- clean up leaked admin client threads when issuing join query to query-stream endpoint (#6532) (9cc7698)
- CASE expressions can now handle 12+ conditions in docker + cloud (#6535) (#6541) (f6c39dc)
-
This change fixes a bug where unnecessary tombstones where being emitted when a
HAVING
clause filtered out a row from the source that is not in the output table. For example, given:sql -- source stream: CREATE STREAM FOO (ID INT KEY, VAL INT) WITH (...); -- aggregate into a table: CREATE TABLE BAR AS SELECT ID, SUM(VAL) AS SUM FROM FOO GROUP BY ID HAVING SUM(VAL) > 0; -- insert some values into the stream: INSERT INTO FOO VALUES(1, -5); INSERT INTO FOO VALUES(1, 6); INSERT INTO FOO VALUES(1, -2); INSERT INTO FOO VALUES(1, -1);
Where previously the contents of the sink topicBAR
would have contained records: | Key | Value | Notes | |:----|:--------|:------------------------------------------------------------------------------------------------| | 1. | null. | Spurious tombstone: the table does not contain a row with key1
, so no tombstone is required. | | 1. | {sum=1} | Row added as HAVING criteria now met | | 1. | null. | Row deleted as HAVING criteria now not met | | 1. | null. | Spurious tombstone: the table does not contain a row with key1
, so no tombstone is required. |The topic will now contain:
Key Value 1. {sum=1} 1. null. -
Adds support for using primitive types in joins. (#4132) (d595985), closes #4132
0.13.0 (2020-09-29)
- add KSQL processing log message on uncaught streams exceptions (#6253) (ac8875f)
- Adds support for 0x, X'...', x'...' type hex strings in udf:encode (#6118) (d492556)
- clarify key or value in (de)serialization processing log messages (#6109) (7a16b91)
- CommandRunner enters degraded state when corruption detected in metastore (#6164) (2b29ee0)
- latest and earliest ByOffset UDFs to capture N values (#6014) (96bb12a)
- Support for IF NOT EXISTS on CREATE CONNECTOR (#6036) (8466197)
- Support IF NOT EXISTS on CREATE TYPE (#6173) (a0f381b)
- support PARTITION BY NULL for creating keyless stream (#6096) (81e3142)
- surface error to user when command topic deleted while server running (#6240) (c5d6b56)
- allow expressions in flat map (#6163) (52a897b)
- create /var/lib/kafka-streams directory on RPM installation (#6126) (31ed3d3)
- delete zombie consumer groups 🧟 (#6160) (2d1697a)
- delimited format should write decimals in a format it can read (#6238) (626965e)
- don't use queryId of last terminate command after restore (#6278) (2753ccd)
- earliest/latest_by_offset should accept nulls (#5729) (6eb5a41)
- fail on non-string MAP keys (#6182) (9d4cc6d)
- improve error handling of invalid Avro identifier (#6239) (8dd3942)
- missing topic classifier now uses MissingSourceTopicException (#6172) (cf8e15d)
- register correct unwrapped schema (#6188) (cb25f9c)
- scale of ROUND() return value (#6236) (42ab721)
0.12.0 (2020-09-01)
- support CREATE OR REPLACE w/ config guard but w/o restrictions (#5766) (e7ff81a)
- add a serverStatus to ServerInfo and display the status in the CLI (#6040) (1921d0e)
- Add consumer offsets to DESCRIBE EXTENDED (#5476) (9ce3c97)
- add Ksql warning to KsqlResource response when CommandRunner degraded (#6039) (6d547da)
- add serialization exceptions to processing logger (#6084) (8ab98a5)
- add service to restart failed persistent queries (#5807) (0ffe3e2)
- Allow udfs to be provided via a new flag to KsqlTestingTool (#5964) (a85ef14)
- CommandRunner enters degraded states when it processes command with higher version than it supports (#6032) (a841443)
- DistributingExecutor fails DDL statement if CommandRunner DEGRADED (#6031) (62b6d9a)
- Enable datagen to set the message timestamp (#5849) (3cba869)
- hard delete schemas for push queries (#6061) (a036b8b)
- introduce the sql-based testing tool (YATT) (#6051) (33e71c3)
- move command topic deserialization to CommandRunner and introduce DEGRADED CommandRunnerStatus (#6012) (ab8cec2)
- New ksql.properties.overrides.denylist to deny clients configs overrides (#5877) (7d1ad25)
- Support [IF EXISTS] on DROP TYPE command (#5962) (431c2ff)
- Support IF EXISTS keyword on DROP CONNECTOR (#6067) (2c99df9)
- Support subscript and nested functions in grouping queries (#5998) (8d383db)
- client: support describe source in Java client (#5944) (a154373)
- support UDAFs with and without init Args with same param type (#5982) (0bf3296)
- allow implicit cast of numbers literals to decimals on insert/select (#6005) (2bc15dd)
- allow joins in windowed aggregations (9452ab6), closes #5898 #5931
- avoid losing cause of processing errors (#5937) (014c049)
- NPE when udf metrics enabled (#5960) (e32183f)
- protobuf format does not support unwrapping (#6033) (6c08e5d)
- remove unnecessary parser token (#6048) (a6c3864)
- set restarted query as healthy during a time threshold (#6018) (ae5c215)
- Use a SandboxedPersistentQueryMetadata to not interact with KafkStreams (#6066) (e000b41)
- Uses pull query metrics for all paths, not just /query (#5983) (143849c)
0.11.0 (2020-08-03)
- client: support streaming inserts in Java client (#5641) (1e109bf)
- client: support admin operations in Java client (#5671) (7d0079a)
- client: support list queries in Java client (#5682) (4d860f8)
- client: support DDL/DML statements in Java client (#5775) (53ca76f)
- support WINDOWEND in WHERE of pull queries (#5680) (40f2f13)
- Adds SSL mutual auth support to intra-cluster requests (#5482) (82b137f)
- new array_remove UDF (#5843) (4ebeff2)
- Replay command topic to local file to backup KSQL Metastore (#5831) (8523051)
- organize UDFs by category (#5813) (7f5b843)
- expose query ID in CommandStatusEntity (MINOR) (#5814) (bb29185)
- circumvent KAFKA-10179 by forcing changelog topics for tables (#5781) (ef8fa4f)
- always use the changelog subject in table state stores (#5823) (e69acb4)
- remove schema compat check if schema exists (#5872) (6338270)
- ensure null values cast to varchar/string remain null (#5769) (530eb7f)
- ksqlDB should not truncate decimals (#5763) (ba833f7)
- Make sure UDTF describe shows actual function description (#5744) (afe85d9)
- Reuse KsqlClient instance for inter node requests (#5742) (cd7f540)
- SEC-1034: log4j migration to confluent repackaged version (#5783) (4563d02)
- show overridden props in CLI (#5750) (f6fd2ee)
- simplify pull query error message (#5672) (9bc4755)
- Upgrade to Vert.x 3.9.1 which depends on version of Netty which allows backported ALPN in JDK 1.8.0_252 to be used, and provide warning if openSSL is not installed (#5818) (36e44a6)
- windowed tables now have cleanup policy compact+delete (#5743) (2038770)
- configure topic retention based on retention clause for windowed tables (#5835) (b509c99)
- set Schema Registry port in tutorials docker compose (f46d358)
- create the metastore backups directory if it does not exist (#5879) (d77d8b7)
- close query on invalid use of HTTP/2 with /query endpoint (#5883) (bcab116)
- adds a handler to gracefully shutdown (#5895) (5fbf171)
- ksqlDB now creates windowed tables with cleanup policy "compact,delete", rather than "compact". Also, topics that back streams are always created with cleanup policy "delete", rather than the broker default (by default, "delete").
0.10.2 (2020-10-05)
- adds a handler to gracefully shutdown (#5895) (5fbf171)
- allow expressions in flat map (#6165) (b6ad9bc)
- allow joins in windowed aggregations (9452ab6), closes #5898 #5931
- always use the changelog subject in table state stores (#5823) (#5837) (c87aa69)
- avoid losing cause of processing errors (#5937) (014c049)
- close query on invalid use of HTTP/2 with /query endpoint (#5885) (0b75411)
- create /var/lib/kafka-streams directory on RPM installation (#6126) (31ed3d3)
- ksqlDB should not truncate decimals (#5763) (ba833f7)
- remove schema compat check if schema exists (#5871) (9bcb5b0)
- Reuse KsqlClient instance for inter node requests (#5742) (#5844) (7645abd)
- windowed table topic retention fixes (#5842) (b3db23c)
- ksqlDB now creates windowed tables with cleanup policy "compact,delete", rather than "compact". Also, topics that back streams are always created with cleanup policy "delete", rather than the broker default (by default, "delete").
0.10.1 (2020-07-09)
- allow empty structs in schema inference (#5656) (3c38c8c)
- do not overwrite schemas from a CREATE STREAM/TABLE (#5756) (5aa0b72)
- make sure old query stream doesn't block on close (#5730) (663a67b)
- query w/ scoped all columns, join and where clause (#5684) (304eb0c)
- support CASE statements returning NULL (#5703) (5062942)
- UDTF don't return null on keyless stream (#5761) (190f9e2)
0.10.0 (2020-06-25)
- Any key name (#5093) (1f0ca3e)
- Explicit keys (#5533) (d0db0cf)
- add extra log messages for pull queries (#4909) (d622ecc)
- Adds the ability have internal endpoints listen on ksql.internal.listener (#5212) (46acb73)
- create MetricsContext for ksql metrics reporters (#5528) (50561a5)
- drop WITH(KEY) syntax (#5363) (bb43d23)
- expose JMX metric that classifies an error state (#5374) (52271bf)
- Expose Vert.x metrics (#5340) (e82f762)
- introduce RegexClassifier for classifying errors via cfg (#5412) (b25dd98)
- Pull Queries: QPS check utilizes internal API flag to determine if forwarded (#5392) (08b428f)
- reload TLS certificate without restarting server (#5516) (a5920b0)
- support TIMESTAMP being a key column (#5542) (286ce08)
- turn on snappy compression for produced data (#5495) (27d8ad5)
- client: Java client with push + pull query support (#5200) (280ef0c)
- client: support (non-streaming) insert into in Java client (#5448) (9e8234a)
- client: support push query termination in Java client (#5371) (62dacca)
- New UDF/UDAF
- Adds UDF regexp_extract_all (#5507) (e19233c)
- Adds UDF regexp_replace (#5504) (30309bf)
- Adds udf regexp_split_to_array (#5501) (3766129)
- new UDFs for array max/min/sort (#5505) (415d930)
- new UDFs for set-like operations on Arrays (#5548) (50428c7)
- new UDFs for working with Maps (#5536) (bc9ad2e)
- new UUID UDF (#5535) (cfa65da)
- new split_to_map udf (#5563) (a68b9ad)
- new string UDFs LPad, RPad (#5546) (00f5083)
- implement earliest_by_offset() UDAF (#5273) (2a356ac)
- INSTR function #881 (#5385) (ca86bbf)
- add CHR UDF (#5559) (5a746e8)
- implements ARRAY_JOIN as requested in (#5028) (#5474) (#5638) (6c67866)
- Add encode udf (#5523) (b02f1ce)
- /inserts-stream endpoint now accepts complex types (#5469) (0840160)
- allow dynamic construction of an ARRAY of STRUCTS with duplicate values #5436 (#5506) (0b1162c)
- allow setting auto.offset.reset=latest on /query-stream endpoint (#5455) (d91c016)
- allow structs in schema provider return types (#5287) (2e604f0)
- Allow value delimiter to be specified for datagen (#5332) (865e834)
- avoid unnecessary warnings from PRINT (#5459) (080fba9)
- Block writer thread if response output buffer is full (#5386) (0edda40)
- deals with issue #5521 by adding more descriptive error message (#5529) (86f8b67)
- disallow requests to /inserts-stream if insert values disabled (#5592) (e277f25)
- exclude window bounds from persistent query value & schema match (#5425) (3596ceb)
- fail AVRO/Protobuf/JSON Schema statements if SR is missing (#5597) (85a0320)
- fail on WINDOW clause without matching GROUP BY (#5431) (68354d4)
- improve print topic (#5552) (e193576)
- Improve pull query error logging (#5477) (f23412e)
- KSQL does not accept more queries when running QueryLimit - 1 queries (#5461) (d64f1bc)
- make stream and column names case-insensitive in /inserts-stream (#5591) (e9e3042)
- NPE in latest_by_offset if first element being processed has null… (#4975) (a9668d2)
- Prevent memory leaks caused by pull query logging (#5532) (723b6cb)
- remove leading zeros when casting decimal to string (#5270) (e1cc8ad)
- Remove stacktrace from error message (#5478) (b63d7e8)
- Retry on connection closed (#5515) (8eb1f88)
- set retention.ms to -1 instead of Long.MAX_VALUE (#5560) (22da8a0)
- update Concat UDF to new framework and make variadic (#5513) (cab6a86)
- zero decimal bug (#5531) (1b9094b)
- /query-stream endpoint should serialize Struct (MINOR) (#5205) (12b092b)
- Move Cors handler in front of /chc handlers (#5239) (004ced2)
- use schema in annotation as schema provider if present (1a90eeb)
- use sr's jackson-jsonschema version (#5213) (0b3899a)
- /inserts-stream endpoint now supports nested types (#5621) (866ae34)
- don't fail if broker does not support AuthorizedOperations (#5617) (0feb081)
- ensure only deserializable cmds are written to command topic (#5645) (4ad2bde)
- support GROUP BY with no source columns used (#5644) (a8e6630)
An associated [blog post](An associated blog post also covers many of these breaking changes: https://www.confluent.io/blog/ksqldb-0-10-updates-key-columns/) also covers many of these breaking changes.
Statements containing PARTITION BY, GROUP BY, or JOIN clauses now produce different output schemas.
For PARTITION BY and GROUP BY statements, the name of the key column in the result is determined by the PARTITION BY or GROUP BY clause:
- Where the partitioning or grouping is a single column reference, then the key column has the same name as this column. For example:
-- OUTPUT will have a key column called X;
CREATE STREAM OUTPUT AS
SELECT *
FROM INPUT
GROUP BY X;
- Where the partitioning or grouping is a single struct field, then the key column has the same name as the field. For example:
-- OUTPUT will have a key column called FIELD1;
CREATE STREAM OUTPUT AS
SELECT *
FROM INPUT
GROUP BY X->field1;
- Otherwise, the key column name is system-generated and has the form
KSQL_COL_n
, wheren
is a positive integer.
In all cases, except where grouping by more than one column, you can set the new key column's name by defining an alias in the projection. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
USERID AS ID,
COUNT(*)
FROM USERS
GROUP BY ID;
For groupings of multiple expressions, you can't provide a name for the system-generated key column. However, a work around is to combine the grouping columns yourself, which does enable you to provide an alias:
-- products_by_sub_cat will have a key column named COMPOSITEKEY:
CREATE TABLE products_by_sub_cat AS
SELECT
categoryId + ‘§’ + subCategoryId AS compositeKey
SUM(quantity) as totalQty
FROM purchases
GROUP BY CAST(categoryId AS STRING) + ‘§’ + CAST(subCategoryId AS STRING);
For JOIN statements, the name of the key column in the result is determined by the join criteria.
- For INNER and LEFT OUTER joins where the join criteria contain at least one column reference, the key column is named based on the left-most source whose join criteria is a column reference. For example:
-- OUTPUT will have a key column named I2_ID.
CREATE TABLE OUTPUT AS
SELECT *
FROM I1
JOIN I2 ON abs(I1.ID) = I2.ID JOIN I3 ON I2.ID = I3.ID;
The key column can be given a new name, if required, by defining an alias in the projection. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
I2.ID AS ID,
I1.V0,
I2.V0,
I3.V0
FROM I1
JOIN I2 ON abs(I1.ID) = I2.ID
JOIN I3 ON I2.ID = I3.ID;
- For FULL OUTER joins and other joins where the join criteria are not on column references, the key column in the output is not equivalent to any column from any source. The key column has a system-generated name in the form
KSQL_COL_n
, wheren
is a positive integer. For example:
-- OUTPUT will have a key column named KSQL_COL_0, or similar.
CREATE TABLE OUTPUT AS
SELECT *
FROM I1
FULL OUTER JOIN I2 ON I1.ID = I2.ID;
The key column can be given a new name, if required, by defining an alias in the projection. A new UDF has been introduced to help define the alias called JOINKEY
. It takes the join criteria as its parameters. For example:
-- OUTPUT will have a key column named ID.
CREATE TABLE OUTPUT AS
SELECT
JOINKEY(I1.ID, I2.ID) AS ID,
I1.V0,
I2.V0
FROM I1
FULL OUTER JOIN I2 ON I1.ID = I2.ID;
JOINKEY
will be deprecated in a future release of ksqlDB once multiple key columns are supported.
CREATE TABLE
statements will now fail if the PRIMARY KEY
column is not provided.
For example, a statement such as:
CREATE TABLE FOO (
name STRING
) WITH (
kafka_topic='foo',
value_format='json'
);
Will need to be updated to include the definition of the PRIMARY KEY, for example:
CREATE TABLE FOO (
ID STRING PRIMARY KEY,
name STRING
) WITH (
kafka_topic='foo',
value_format='json'
);
If using schema inference, i.e. loading the value columns of the topic from the Schema Registry, the primary key can be provided as a partial schema, for example:
-- FOO will have value columns loaded from the Schema Registry
CREATE TABLE FOO (
ID INT PRIMARY KEY
) WITH (
kafka_topic='foo',
value_format='avro'
);
CREATE STREAM
statements that do not define a KEY
column no longer have an implicit ROWKEY
key column.
For example:
CREATE STREAM BAR (
NAME STRING
) WITH (...);
Previously, the above statement would have resulted in a stream with two columns: ROWKEY STRING KEY
and NAME STRING
.
With this change, the above statement results in a stream with only the NAME STRING
column.
Streams with no KEY column are serialized to Kafka topics with a null
key.
A statement that creates a materialized view must include the key columns in the projection. For example:
CREATE TABLE OUTPUT AS
SELECT
productId, // <-- key column in projection
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
The key column productId
is required in the projection. In previous versions of ksqlDB, the presence
of productId
in the projection would have placed a copy of the data into the value of the underlying
Kafka topic's record. But starting in version 0.10.0, the projection must include the key columns, and ksqlDB stores these columns
in the key of the underlying Kafka record. Optionally, you may provide an alias for
the key column(s).
CREATE TABLE OUTPUT AS
SELECT
productId as id, // <-- aliased key column
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
If you need a copy of the key column in the Kafka record's value, use the
AS_VALUE function to indicate this to ksqlDB. For example, the following statement produces an output inline with the previous version of ksqlDB
for the above example materialized view:
CREATE TABLE OUTPUT AS
SELECT
productId as ROWKEY, // <-- key column named ROWKEY
AS_VALUE(productId) as productId, // <-- productId copied into value
SUM(quantity) as unitsSold
FROM sales
GROUP BY productId;
In previous versions, all key columns were called ROWKEY
. To enable using a more
user-friendly name for the key column in queries, it was possible
to supply an alias for the key column in the WITH clause, for example:
CREATE TABLE INPUT (
ROWKEY INT PRIMARY KEY,
ID INT,
V0 STRING
) WITH (
key='ID',
...
);
With the previous query, the ID
column can be used as an alias for ROWKEY
.
This approach required the Kafka message value to contain an exact copy of the key.
KLIP-24
removed the restriction that key columns must be named ROWKEY
, negating the need for the WITH(KEY)
syntax, which has been removed. Also, this change removed the requirement for
the Kafka message value to contain an exact copy of the key.
Update your queries by removing the KEY
from the WITH
clause and naming
your KEY
and PRIMARY KEY
columns appropriately. For example, the previous
CREATE TABLE statement can now be rewritten as:
CREATE TABLE INPUT (
ID INT PRIMARY KEY,
V0 STRING
) WITH (...);
Unless the value format is DELIMITED
, which means the value columns are
order dependent, so dropping the ID
value column would result in a
deserialization error or the wrong values being loaded. If you're using
DELIMITED
, consider rewriting as:
CREATE TABLE INPUT (
ID INT PRIMARY KEY,
ignoreMe INT,
V0 STRING
) WITH (...);
0.9.0 (2020-05-11)
- add multi-join expression support (#5081) (002cd5a)
- support more advanced suite of LIKE expressions (#5013) (67cd9d9)
- add COALESCE function (#4829) (251c237)
- add GROUP BY support for any key names (#4899) (e7cbdfc), closes #4898
- partition-by primitive key support (#4098) (7addf88), closes #4098
- add KsqlQueryStatus to decouple from KafkaStreams.State (#5029) (e8cbcde)
- Adds rate limiting to pull queries (#4951) (6284111)
- Do not allow access to new streaming endpoints using HTTP1.x (#5193) (8b90035)
- fail startup if command contains incompatible version (#5104) (a1751b1)
- klip-14 - rowtime as pseduocolumn (#5150) (d541420)
- 'SHOW QUERIES [EXTENDED]' statement returns results for all nodes in the cluster (#4875) (7385a31)
- transient queries added to show queries output (#5105) (e8a2a63)
- 'drop (stream|table) if exists' should not fail if source does not exist (#4872) (b0669a0)
- add deserializer for SqlType (#4830) (eed9912)
- allow trailing slash in listeners config (MINOR) (#5012) (13b0455)
- Allows unclosed quote characters to exist in comments (#4993) (fd65021)
- avoid duplicate column name errors from auto-generated aliases (#4827) (258d0b0)
- avoid long possible format lists on PRINT TOPIC for nulls (#4867) (a66e489)
- better error code on shutdown (#4754) (17d758d)
- Catch server startup exceptions (#4974) (898f3a1)
- CommandRunner metric has correct metric displayed when thread dies (#4653) (1db542b)
- do not allow GROUP BY and PARTITION BY on boolean expressions (#4940) (d84d2d1)
- do not allow grouping sets (#4942) (51bb9f6)
- do not allow implicit casting in UDAF function lookup (#5145) (6709010)
- do not filter out rows where PARTITION BY resolves to null (#4823) (e75a792)
- Filter out hosts with no lag info by default (#4859) (e10bbce)
- fix repartition semantics (#4816) (609e9e2)
- generated aliases for struct field access no longer require quoting (#4977) (2002458)
- handle leap days correctly during timestamp extraction (#4878) (c54db81), closes #4864
- If startup hangs (esp on preconditions), shutdown server correctly (#4889) (20c4b59)
- Improve error message for where/having type errors (#5023) (23eb80d)
- improve handling of NULLs (#5019) (c53dd68), closes #4912
- include lower-case identifiers among those that need quotes (#3723) (#5139) (3bcbcf4)
- make endpoints available while waiting for precondition (#5069) (1136162)
- Make sure internal client is configured for TLS (#5059) (37c7713)
- NPE in latest_by_offset if first two elements processed are both null (#4975) (4c72f93)
- only create processing log stream if it doesn't exist (#4805) (8dead0f)
- output valid multiline queries when running SHOW QUERIES (#4956) (ec74a33)
- query id for TERMINATE should be case insensitive (#5005) (588c1e9)
- reject requests to new API server if server is not ready (#5048) (d988722)
- Remove unnecessary error logging for heartbeat (#4809) (fa84576)
- replace 'null' in explain plan with correct op type (#5075) (f9bc0e6)
- Returns empty lag info for a dead host rather than last received lags (#4837) (3d98527)
- Stop PARTITION BY and UDTFs that fail from terminating the query (#4822) (522fe84)
- support quotes in explain statements (MINOR) (#5142) (bee2fe0)
- throw error when column does not exist during INSERT INTO (#4926) (89e261d)
- remove immutable properties from headless mode (#4936) (5550880)
- rename ksqldb in normal logging path (MINOR) (#4915) (16172ba)
- support trailing slashes in listener URLs (#5076) (e9e0431)
- logic when closing ksqlEngine fixed (#4917) (a217eb9)
- use describeTopics() for Kafka healtcheck probe (#4814) (578d0d5)
- Do not allow access to new streaming endpoints using HTTP1.x (#5193) (8b90035)
- speed up restarts by not building topologies for terminated queries (#5002) (2472382)
-
Select star, i.e.
select *
, no longer expands to includeROWTIME
column(s). Instead,ROWTIME
is only included in the results of queries if explicitly included in the projection, e.g.select rowtime, *
. This only affects new statements. Any view previously created via aCREATE STREAM AS SELECT
orCREATE TABLE AS SELECT
statement is unaffected. -
This release changes the system generated column name for any columns in projections that are struct field dereferences. Previously, the full path was used when generating the name, now only the final field name is used. For example,
SELECT someStruct->someField, ...
previously generated a column name ofSOMESTRUCT__SOMEFIELD
and now generates a name ofSOMEFIELD
. Generated column names may have a numeral appended to the end to ensure uniqueness, for exampleSOMEFIELD_2
.Note: it is recommended that you do not rely on system generated column names for production systems, because naming logic may change between releases. Providing an explicit alias ensures consistent naming across releases, for example,
SELECT someStruct->someField AS someField
. Backward compatibility: existing running queries will not be affected by this change, and they will continue to run with the same column names. Any statements executed after the upgrade will use the new names where no explicit alias is provided. Add explicit aliases to your statements if you require the old names, for example:SELECT someStruct->someField AS SOMESTRUCT__SOMEFIELD, ...
-
Existing queries that reference a single GROUP BY column in the projection would fail if they were resubmitted, due to a duplicate column. The same existing queries will continue to run if already running, i.e. this is only a change for newly submitted queries. Existing queries will use the old query semantics.
-
Push queries, which rely on auto-generated column names, may see changes in column names. Pull queries and any existing persistent queries are unaffected, e.g. those created with
CREATE STREAM AS SELECT
,CREATE TABLE AS SELECT
orINSERT INTO
. -
The ksqlDB server no longer ships with Jetty. This means that when you start the server, you must supply Jetty-specific dependencies, like certain login modules used for basic authentication, by using the KSQL_CLASSPATH environment variable for them to be found.
If you're upgrading from a ksqlDB version before 0.7, follow these upgrade instructions.
0.8.1 (2020-03-30)
- Don't wait for streams thread to be in running state (#4908) (2f83119)
- Infer TLS based on scheme of server string (#4893) (a519ed3)
0.8.0 (2020-03-18)
- support Protobuf in ksqlDB (#4469) (a77cebe)
- introduce JSON_SR format (#4596) (daa04d2)
- support for tunable retention, grace period for windowed tables (#4733) (30d49b3), closes #4157
- add REGEXP_EXTRACT UDF (#4728) (a25f0fb)
- add ARRAY_LENGTH UDF (#4725) (31a9d9d)
- Implement latest_by_offset() UDAF (#4782) (0c13bb0)
- add confluent-hub to ksqlDB docker image (#4729) (b74867a)
- add the topic name to deserialization errors (#4573) (0f7edf6)
- Add metrics for pull queries endpoint (#4608) (23e3868)
- log groupby errors to processing logger (#4575) (b503d25)
- Provide upper limit on number of push queries (#4581) (2cd66c7)
- display errors in CLI in red text (#4509) (56f9c9b)
- enhance
PRINT TOPIC
's format detection (#4551) (8b19bc6)
- change default exception handling for timestamp extractors (#4632) (1576af0)
- create schemas at topic creation (#4717) (514025d)
- don't display decimals in scientific notation in CLI (#4723) (3626f42)
- stop logging about command topic creation on startup if exists (MINOR) (#4709) (f4cec0a)
- added special handling for forwarded pull query request (#4597) (ba4fe74)
- backport fixes from query close (#4662) (8168002)
- change configOverrides back to streamsProperties (#4675) (ce74cf8)
- csas/ctas with timestamp column is used for output rowtime (#4489) (ddddf92)
- patch KafkaStreamsInternalTopicsAccessor as KS internals changed (#4621) (eb07370)
- use HTTPS instead of HTTP to resolve dependencies in Maven archetype (#4511) (f21823f)
- add deserializer for SqlType (#4830) (eed9912)
0.7.1 (2020-02-28)
-
support custom column widths in cli (#4616) (cb66e05)
A new ksqlDB CLI configuration allows you to specify the width of each column in tabular output.
ksql> SET CLI COLUMN-WIDTH 10
Given a customized value, subsequent renderings of output use the setting:
ksql> SELECT * FROM riderLocations > WHERE GEO_DISTANCE(latitude, longitude, 37.4133, -122.1162) <= 5 EMIT CHANGES; +----------+----------+----------+----------+----------+ |ROWTIME |ROWKEY |PROFILEID |LATITUDE |LONGITUDE | +----------+----------+----------+----------+----------+
The default behavior, which determines column width based on terminal width and the number of columns, can be re-enabled using:
ksql> SET CLI COLUMN-WIDTH 0
- add functional-test dependencies to Docker module (#4586) (04fcf8d)
- don't cleanup topics on engine close (#4658) (ad66a81)
- idempotent terminate that can handle hung streams (#4643) (d96db14)
0.7.0 (2020-02-11)
Note that ksqlDB 0.7.0 has a number of breaking changes when compared with ksqlDB 0.6.0 (see the 'Breaking changes' section below for details). Please make sure to read and follow these upgrade instructions if you are upgrading from a previous ksqlDB version.
-
feat: primitive key support (#4478) (ddf09d)
ksqlDB now supports the following primitive key types:
INT
,BIGINT
,DOUBLE
as well as the existingSTRING
type.The key type can be defined in the CREATE TABLE or CREATE STREAM statement by including a column definition for
ROWKEY
in the formROWKEY <primitive-key-type> KEY,
, for example:CREATE TABLE USERS (ROWKEY BIGINT KEY, NAME STRING, RATING DOUBLE) WITH (kafka_topic='users', VALUE_FORMAT='json');
ksqlDB currently requires the name of the key column to be
ROWKEY
. Support for arbitrary key names is tracked by #3536.ksqlDB currently requires keys to use the
KAFKA
format. Support for additional formats is tracked by https://github.com/confluentinc/ksql/projects/3.Schema inference currently only works with
STRING
keys, Support for additional key types is tracked by #4462. (Schema inference is where ksqlDB infers the schema of a CREATE TABLE and CREATE STREAM statements from the schema registered in the Schema Registry, as opposed to the user supplying the set of columns in the statement).Apache Kafka Connect can be configured to output keys in the
KAFKA
format by using a Converter, e.g."key.converter": "org.apache.kafka.connect.converters.IntegerConverter"
. Details of which converter to use for which key type can be found here: https://docs.confluent.io/current/ksql/docs/developer-guide/serialization.html#kafka in theConnect Converter
column.@rmoff has written an introductory blog about primitive keys: https://rmoff.net/2020/02/07/primitive-keys-in-ksqldb/
-
add a new default SchemaRegistryClient and remove default for SR url (#4325) (e045f7c)
-
Adds lag reporting and API for use in lag aware routing as described in KLIP 12 (#4392) (cb9ae29)
-
better error message when transaction to command topic fails to initialize by timeout (#4486) (a5fed3b)
-
hide internal/system topics from SHOW TOPICS (#4322) (075fed3)
-
Implement pull query routing to standbys if active is down (#4398) (ace23b1)
-
Implementation of heartbeat mechanism as part of KLIP-12 (#4173) (37c1eaa)
-
add COUNT_DISTINCT and allow generics in UDAFs (#4150) (2d5e680)
-
remove WindowStart() and WindowEnd() UDAFs (#4459) (eda2e34)
-
make (certain types of) server error messages configurable (#4121) (cedf47e)
-
allow environment variables to configure embedded connect (#4260) (e032ea9)
-
enable Kafla ACL authorization checks for Pull Queries (#4187) (5ee1e9e)
-
show properties now includes embedded connect properties and scope (#4099) (ebac104)
-
add support to terminate all running queries (#3944) (abbce84)
-
expose execution plans from the ksql engine API (#3482) (067139c)
- Avoids logging INFO for rest-util requests, since it hurts pull query performance (#4302) (50b4c1c)
- Improves pull query performance by making the default schema service a singleton (#4216) (f991752)
- add ksql-test-runner deps to ksql package lib (#4272) (6e28cc4)
- ConcurrentModificationException in ClusterStatusResource (#4510) (c79cba9)
- deadlock when closing transient push query (#4297) (ac8fb63)
- delimiters reset across non-delimited types (reverts #4366) (#4371) (5788729)
- do not throw error if VALUE_DELIMITER is set on non-DELIMITED topic (#4366) (2b59b8b)
- exception on shutdown of ksqlDB server (#4483) (126e2cf)
- fix compilation error due to
Format
refactoring (#4465) (07a4dcd) - fix NPE in CLI if not username supplied (#4312) (0b6da0b)
- Fixes the single host lag reporting case (#4494) (6b8bc2a)
- floating point comparison was inexact (#4372) (2a4ca47)
- Include functional tests jar in docker images (#4274) (2559b2f)
- include valid alternative UDF signatures in error message (MINOR) (#4403) (f397ad8)
- Make null key serialization/deserialization symmetrical (#4351) (2a61acb)
- partial push & persistent query support for window bounds columns (#4401) (48aa6ec)
- print root cause in error message (#4505) (6299410)
- pull queries should work across nodes (#4169) (#4271) (2369213)
- remove deprecated Acl API (#4373) (a2b69f7)
- remove duplicate comment about Schema Regitry URL from sample server properties (#4346) (0d542c5)
- rename stale to standby in KsqlConfig (#4467) (f8bb986)
- report window type and query status better from API (#4313) (ca9368a)
- reserve
WINDOWSTART
andWINDOWEND
as system column names (#4388) (ea0a0ac) - Sets timezone of RestQueryTranslationTest test to make it work in non UTC zones (#4407) (50b25d5)
- show queries now returns the correct Kafka Topic if the query string contains with clause (#4430) (1b713cd)
- support conversion of STRING to BIGINT for window bounds (#4500) (9c3cbf8)
- support WindowStart() and WindowEnd() in pull queries (#4435) (8da2b63)
- add logging during restore (#4270) (4e32da6)
- log4j properties files (#4293) (5911faf)
- report clearer error message when AVG used with DELIMITED (#4295) (307bf4d)
- better error message on self-join (#4248) (1281ab2)
- change query id generation to work with planned commands (#4149) (91c421a)
- CLI commands may be terminated with semicolon+whitespace (MINOR) (#4234) (096b78f)
- decimals in structs should display as numeric (#4165) (75b539e)
- don't load current qtt test case from legacy loader (#4245) (9479fd6)
- immutability in some more classes (MINOR) (#4179) (cbd3bab)
- include path of field that causes JSON deserialization error (#4249) (5cc718b)
- reintroduce FetchFieldFromStruct as a public UDF (#4185) (a50a665)
- show topics doesn't display topics with different casing (#4159) (0ac8747)
- untracked file after cloning on Windows (#4122) (04de30e)
- array access is now 1-indexed instead of 0-indexed (#4057) (f09f797)
- Explicitly disallow table functions with table sources, fixes #4033 (#4085) (60e20ef)
- fix issues with multi-statement requests failing to validate (#3952) (3e7169b), closes #3363
- NPE when starting StandaloneExecutor (#4119) (c6c00b1)
- properly set key when partition by ROWKEY and join on non-ROWKEY (#4090) (6c80941)
- remove mapValues that excluded ROWTIME and ROWKEY columns (#4066) (a6982bd), closes #4052
- robin's requested message changes (#4021) (422a2e3)
- schema column order returned by websocket pull query (#4012) (85fef09)
- some terminals dont work with JLine 3.11 (#3931) (ad183ec)
- the Abs, Ceil and Floor methods now return proper types (#3948) (3d6e119)
- UncaughtExceptionHandler not being set for Persistent Queries (#4087) (e193a2a)
- unify behavior for PARTITION BY and GROUP BY (#3982) (67d3f8c)
- wrong source type in pull query error message (#3885) (65523c7), closes #3523
- existing queries that perform a PARTITION BY or GROUP BY on a single column of one of the above supported primitive key types will now set the key to the appropriate type, not a
STRING
as previously. - The
WindowStart()
andWindowEnd()
UDAFs have been removed from KSQL. Use theWindowStart
andWindowEnd
system columns to access the window bounds within the SELECT expression instead. - the order of columns for internal topics has changed. The
DELIMITED
format can not handle this in a backwards compatible way. Hence this is a breaking change for any existing queries the use theDELIMITED
format and have internal topics. This change has been made now for two reasons:- its a breaking change, making it much harder to do later.
- The recent confluentinc#4404 change introduced this same issue for pull queries. This current change corrects pull queries too.
- Any query of a windowed source that uses
ROWKEY
in the SELECT projection will see the contents ofROWKEY
change from a formattedSTRING
containing the underlying key and the window bounds, to just the underlying key. Queries can access the window bounds usingWINDOWSTART
andWINDOWEND
. - Joins on windowed sources now include
WINDOWSTART
andWINDOWEND
columns from both sides on aSELECT *
. WINDOWSTART
andWINDOWEND
are now reserved system column names. Any query that previously used those names will need to be changed: for example, alias the columns to a different name. These column names are being reserved for use as system columns when dealing with streams and tables that have a windowed key.- standalone literals that used to be doubles may now be interpreted as BigDecimal. In most scenarios, this won't affect any queries as the DECIMAL can auto-cast to DOUBLE; in the case were the literal stands alone, the output schema will be a DECIMAL instead of a DOUBLE. To specify a DOUBLE literal, use scientific notation (e.g. 1.234E-5).
- The response from the RESTful API has changed for some commands with this commit: the
SourceDescription
type no longer has aformat
field. Instead it haskeyFormat
andvalueFormat
fields. Response now includes astate
property for each query that indicates the state of the query. e.g.The CLI output was:{ "queryString": "create table OUTPUT as select * from INPUT;", "sinks": ["OUTPUT"], "id": "CSAS_OUTPUT_0", "state": "Running" }
and is now:ksql> show queries; Query ID | Kafka Topic | Query String CSAS_OUTPUT_0 | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT * FROM INPUT INPUT EMIT CHANGES; CTAS_CLICK_USER_SESSIONS_5 | CLICK_USER_SESSIONS | CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT FROM CLICKSTREAM CLICKSTREAM WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID EMIT CHANGES; For detailed information on a Query run: EXPLAIN <Query ID>;
Note the addition of theQuery ID | Status | Kafka Topic | Query String CSAS_OUTPUT_0 | RUNNING | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT *FROM INPUT INPUTEMIT CHANGES; For detailed information on a Query run: EXPLAIN <Query ID>;
Status
column and the fact thatQuery String
is now longer being written across multiple lines. old CLI output:New CLI output:ksql> describe CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) USERID | INTEGER COUNT | BIGINT For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
Note the addition of theksql> describe CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) (Window type: SESSION) USERID | INTEGER COUNT | BIGINT For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
Window Type
information. The extended version of the command has also changed. Old output:ksql> describe extended CLICK_USER_SESSIONS; Name : CLICK_USER_SESSIONS Type : TABLE Key field : USERID Key format : STRING Timestamp field : Not set - using <ROWTIME> Value Format : JSON Kafka topic : CLICK_USER_SESSIONS (partitions: 1, replication: 1) Statement : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT FROM CLICKSTREAM CLICKSTREAM WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID EMIT CHANGES; Field | Type ROWTIME | BIGINT (system) ROWKEY | INTEGER (system) USERID | INTEGER COUNT | BIGINT
- Any
KEY
column identified in theWITH
clause must be of the same Sql type asROWKEY
. Users can provide the name of a value column that matches the key column, e.g.Before primitive keys was introduced all keys were treated asCREATE STREAM S (ID INT, NAME STRING) WITH (KEY='ID', ...);
STRING
. With primitive keysROWKEY
can be types other thanSTRING
, e.g.BIGINT
. It therefore follows that anyKEY
column identified in theWITH
clause must have the same SQL type as the actual key, i.e.ROWKEY
. With this change the above example statement will fail with the error:As the error message says, the error can be resolved by changing the statement to:The KEY field (ID) identified in the WITH clause is of a different type to the actual key column. Either change the type of the KEY field to match ROWKEY, or explicitly set ROWKEY to the type of the KEY field by adding 'ROWKEY INTEGER KEY' in the schema. KEY field type: INTEGER ROWKEY type: STRING
CREATE STREAM S (ROWKEY INT KEY, ID INT, NAME STRING) WITH (KEY='ID', ...);
- Some existing joins may now fail and the type of
ROWKEY
in the result schema of joins may have changed. WhenROWKEY
was always aSTRING
it was possible to join anINTEGER
column with aBIGINT
column. This is no longer the case. AJOIN
requires the join columns to be of the same type. (See confluentinc#4130 which tracks adding support for being able toCAST
join criteria). Where joining on twoINT
columns would previously have resulted in a schema containingROWKEY STRING KEY
, it would not result inROWKEY INT KEY
. - A
GROUP BY
on single expressions now changes the SQL type ofROWKEY
in the output schema of the query to match the SQL type of the expression. For example, consider:Previously, the above would have resulted in an output schema ofCREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...); CREATE TABLE OUTPUT AS SELECT COUNT(*) AS COUNT FROM INPUT GROUP BY ID;
ROWKEY STRING KEY, COUNT BIGINT
, whereROWKEY
would have stored the string representation of the integer from theID
column. With this commit the output schema will beROWKEY INT KEY COUNT BIGINT
. - Any
GROUP BY
expression that resolves toNULL
, including because a UDF throws an exception, now results in the row being excluded from the result. Previously, as the key was aSTRING
a value of"null"
could be used. With other primitive types this is not possible. As key columns must be non-null any exception is logged and the row is excluded. - commands that were persisted with RUN SCRIPT will no longer be executable
- the ARRAYCONTAINS function now needs to be referenced as either JSON_ARRAY_CONTAINS or ARRAY_CONTAINS depending on the intended param types
- A
PARTITION BY
now changes the SQL type ofROWKEY
in the output schema of a query. For example, consider:Previously, the above would have resulted in an output schema ofCREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...); CREATE STREAM OUTPUT AS SELECT ROWKEY AS NAME FROM INPUT PARTITION BY ID;
ROWKEY STRING KEY, NAME STRING
, whereROWKEY
would have stored the string representation of the integer from theID
column. With this commit the output schema will beROWKEY INT KEY, NAME STRING
. - any queries that were using array index mechanism should change to use 1-base indexing instead of 0-base.
- The maxInterval parameter for ksql-datagen is now deprecated. Use msgRate instead.
- this change makes it so that PARTITION BY statements use the source schema, not the value/projection schema, when selecting the value to partition by. This is consistent with GROUP BY, and standard SQL for GROUP by. Any statement that previously used PARTITION BY may need to be reworked. 1/2
- when querying with EMIT CHANGES and PARTITION BY, the PARTITION BY clause should now come before EMIT CHANGES. 2/2
- KSQL will now, by default, not create duplicate changelog for table sources.
fixes: confluentinc#3621
Now that Kafka Steams has a
KTable.transformValues
we no longer need to create a table by first creating a stream, then doing a select/groupby/aggregate on it. Instead, we can just useStreamBuilder.table
. This change makes the switch, removing theStreamToTable
types and calls and replaces them with eitherTableSource
orWindowedTableSource
, copying the existing pattern forStreamSource
andWindowedStreamSource
. It also reinstates a change inKsqlConfig
that ensures topology optimisations are on by default. This was the case for 5.4.x, but was inadvertently turned off. With the optimisation config turned on, and the new builder step used, KSQL no longer creates a changelog topic to back the tables state store. This is not needed as the source topic is itself the changelog. The change includes new tests intable.json
to confirm the change log topic is not created by default and is created if the user turns off optimisations. This change also removes the line in theTestExecutor
that explicitly sets topology optimisations toall
. The test should not of being doing tis. This may been why the bug turning off optimisations was not detected. - this change removes the old method of generating query IDs based on their sequence of successful execution. Instead all queries will use their offset in the command topic. Similarly, all DROP queries issued before 5.0 will no longer cascade query terminiation.
ALL
is now a reserved word and can not be used for identifiers without being quoted.- abs, ceil and floor will now return types aligned with other databases systems (i.e. the same type as the input). Previously these udfs would always return Double.
- Statements in the command topic will be retried until they succeed. For example, if the source topic has been deleted for a create stream/table statement, the server may fail to start since command runner will be stuck processing the statement. This ensures that the same set of streams/tables are created when restarting the server. You can check to see if the command runner is stuck by:
- Looking in the server logs to see if a statement is being retried.
- The JMX metric
_confluent-ksql-<service-id>ksql-rest-app-command-runner
will be in anERROR
state
v0.6.0 (2019-11-19)
- add config to disable pull queries when validation is required (#3879) (ccc636d), closes #3863
- add configurable metrics reporters (#3490) (378b8af)
- add flag to disable pull queries (MINOR) (#3778) (04e206f)
- add health check endpoint (#3501) (2308686)
- add KsqlUncaughtExceptionHandler and new KsqlRestConfig for enabling it (#3425) (d83c787)
- add request logging (#3518) (c401ec0)
- Add UDF invoker benchmark (#3592) (83dfc24)
- Added UDFs ENTRIES and GENERATE_SERIES (#3724) (0a4558b)
- build ks app from an execution plan visitor (#3418) (b57d194)
- build materializations from the physical plan (#3494) (f45d649)
- change /metadata REST path to /v1/metadata (#3467) (ed94895)
- Config file for the no-response bot which closes issues which haven't been responded to (#3765) (1dfdb68)
- drop legacy key field functionality (#3764) (5369dc2)
- expose query status through EXPLAIN (#3570) (8ef82eb)
- expression support for insert values (#3612) (37f9763)
- Implement complex expressions for table functions (#3683) (200022b)
- Implement describe and list functions for UDTFs (#3716) (b0bbea4)
- Implement EXPLODE(ARRAY) for single table function in SELECT (#3589) (8b52aa8)
- Implement schemaProvider for UDTFs (#3690) (4e66825)
- Implement user defined table functions (#3687) (e62bd46)
- Makes timeout for owner lookup in StaticQueryExecutor and rebalancing in KsStateStore configurable (#3856) (39245c6)
- pass auth header to connect client for RBAC integration (#3492) (cef0ea3)
- serialize expressions (#3721) (e1cd477)
- Support multiple table functions in queries (#3685) (44be5a2)
- support numeric json serde for decimals (#3588) (8621594)
- support quoted identifiers in column names (#3477) (be2bdcc)
- Transactional Produces to Command Topic (#3660) (cba2877)
- static: allow logical schema to have fields in any order (#3422) (d935af3)
- static: allow windowed queries without bounds on rowtime (#3438) (6593ee3)
- static: fail on ROWTIME in projection (#3430) (2f27b68)
- static: support for partial datetimes on
WindowStart
bounds (#3435) (99f6e24) - static: support ROWKEY in the projection of static queries (#3439) (9218766)
- static: switch partial datetime parser to use UTC by default (#3473) (81557e3)
- static: unordered table elements and meta columns serialization (#3428) (3b23fd6)
- add KsqlRocksDBConfigSetter to bound memory and set num threads (#3167) (cdcaa2d)
- add logs/ to .gitignore (MINOR) (#3353) (81272cf)
- add offest to QueuedCommand and flag to Command (#3343) (fd112a4)
- add option for datagen to produce indefinitely (MINOR) (#3307) (6281738)
- add REST /metadata path to display KSQL server information (replaces /info) (#3313) (8be29b9)
- add SHOW TYPES to list all custom types (#3280) (13fde33)
- Add support for average aggregate function (#3302) (6757d9f)
- back out Connect auto-import feature (#3386) (d4c748c)
- build execution plan from structured package (#3285) (0d0b1c3)
- change the public API of schema provider method (#3287) (1324285)
- custom comparators for array, map and struct (#3385) (fe63d21)
- Implement ROUND() UDF (#3404) (f9783a9)
- Implement user defined delimiter for value format (#3393) (b84d0aa)
- move aggregation to plan builder (#3391) (3aaeb73)
- move filters to plan builder (#3346) (d4d52f3)
- move groupBy into plan builders (#3359) (730c913)
- move joins to plan builder (#3361) (e243c74)
- move selectKey impl to plan builder (#3362) (f312fcc)
- move setup of the sink to plan builders (#3360) (bfbdc20)
- remove equalsIgnoreCase from all Name classes (MINOR) (#3411) (b78619c)
- update query id generation to use command topic record offset (#3354) (295314e)
- cli: add the feature to turn of WRAP for CLI output (#3341) (3814c71)
- static: add custom jackson JSON serde for handling LogicalSchema (#3322) (c571508)
- static: add forEach() to KsqlStruct (MINOR) (#3320) (587b545)
- static: initial syntax for static queries (#3300) (8917e48)
- static: static select support (#3369) (e4b3275)
- move toTable kstreams calls to plan builder (#3334) (06aa252)
- use coherent naming scheme for generated java code (#3417) (06a2a57)
- static: initial drop of static query functionality (#3340) (54c5139)
- add ability to DROP custom types (#3281) (32005ed)
- Add schema resolver method to UDF specification (#3215) (08855ad)
- add support to register custom types with KSQL (
CREATE TYPE
) (#3266) (08ffebf) - include error message in DESCRIBE CONNECTOR (MINOR) (#3289) (458f1d8)
- perform topic permission checks for KSQL service principal (#3261) (ba1f613)
- wire in the KS config needed for point queries (MINOR) (#3251) (5152d06)
- add a new module ksql-execution for the execution plan interfaces (#3125) (3251d25)
- add an initial set of execution steps (#3214) (c860793)
- add config for custom metrics tags (5.3.x) (#2996) (76f5590)
- add config for enabling topic access validator (#3079) (440e247)
- add connect templates and simplify JDBC source (MINOR) (#3231) (ba0fb99)
- add DESCRIBE functionality for connectors (#3206) (a79adb4)
- add DROP CONNECTOR functionality (#3245) (103c958)
- add extension for custom metrics (5.3.x) (#2997) (94a8ae7)
- add Logarithm, Exponential and Sqrt functions (#3091) (a4ca934)
- add SHOW CONNECTORS functionality (#3210) (0bf31eb)
- Add SHOW TOPICS EXTENDED (#3183) (dd3eb5f), closes #1268
- add SIGN, REPLACE and INITCAP (#3189) (ab67684)
- Add window size for time window tables (#3102) (6ff07d5)
- allow for decimals to be used as input types for UDFs (#3217) (4a2e4b9)
- enhance datagen for use as a load generator (#3230) (ddb970b)
- Extend UdfLoader to allow loading specific classes with UDFs (#3234) (99c79b3)
- format error messages better (MINOR) (#3233) (c727d50)
- improved error message and updated error code for PrintTopics command (#3246) (4b94f22)
- some robustness improvements for Connect integration (#3227) (bc1a2f8)
- spin up a connect worker embedded inside the KSQL JVM (#3241) (4d7ef2a)
- validate createTopic permissions on SandboxedKafkaTopicClient (#3250) (0ea157b)
- data-gen: support KAFKA format in DataGen (#3120) (cb7abcc)
- ksql-connect: poll connect-configs and auto register sources (#3178) (6dd21fd)
- wrap timestamps in ROWTIME expressions with STRINGTOTIMESTAMP (#3160) (42acd78)
- ksql-connect: introduce ConnectClient for REST requests (#3137) (15548ce)
- ksql-connect: introduce syntax for CREATE CONNECTOR (syntax only) (#3139) (e823659)
- ksql-connect: wiring for creating connectors (#3149) (cd20d57)
- udfs: generic support for UDFs (#3054) (a381c48)
- cli: improve CLI transient queries with headers and spacing (partial fix for #892) (#3047) (050b72a)
- serde: kafka format (#3065) (2b5c3d1)
- add basic support for key syntax (#3034) (ca6478a)
- Add REST and Websocket authorization hooks and interface (#3000) (39af991)
- Arrays should be indexable from the end too (#3004) (a166075), closes #2974
- decimal math with other numbers (#3001) (14d2bb7)
- New UNIX_TIMESTAMP and UNIX_DATE datetime functions (#2459) (39ce7f4)
- do not spam the logs with config defs (#3044) (94904a3)
- Only look up index of new key field once, not per row processed (#3020) (fda1c7f)
- Remove parsing of integer literals (#3019) (6195b76)
/query
rest endpoint should return valid JSON (#3819) (b278e83)- address upstream change in KafkaAvroDeserializer (#3372) (b32e6a9)
- address upstream change in KafkaAvroDeserializer (revert previous fix) (#3437) (bed164b)
- allow streams topic prefixed configs (#3691) (939c45a), closes #817
- apply filter before flat mapping in logical planner (#3730) (f4bd083)
- band-aid around RestQueryTranslationTest (#3326) (677e03c)
- be more lax on validating config (#3599) (3c80cf1), closes #2279
- better error message if tz invalid in datetime string (#3449) (e93c445)
- better error message when users enter old style syntax for query (#3397) (f948ec0)
- changes required for compatibility with KIP-479 (#3466) (567f056)
- Created new test for running topology generation manually from the IDE, and removed main() from TopologyFileGenerator (#3609) (381e563)
- do not allow inserts into tables with null key (#3605) (7e326b7), closes #3021
- do not allow WITH clause to be created with invalid WINDOW_SIZE (#3432) (96bfc11)
- Don't throw NPE on null columns (#3647) (6969768), closes #3617
- drop
TopicDescriptionFactory
class (#3528) (5281c74) - ensure default server config works with IP6 (fixes #3309) (#3310) (92b03ec)
- error message when UDAF has STRUCT type with no schema (#3407) (49f456e)
- fix Avro schema validation (#3499) (a59954d)
- fix broken map coersion test case (#3694) (b5ea24c)
- fix NPE when printing records with empty value (MINOR) (#3470) (47313ff)
- fix parsing issue with LIST property overrides (#3601) (6459fa1)
- fix test for KIP-307 final PR (#3402) (d77db50)
- improve error message on query or print statement on /ksql (#3337) (dae28eb)
- improve print topic error message when topic does not exist (#3464) (0fa4d24)
- include lower-case identifiers among those that need quotes (#3723) (62c47bf)
- make sure use of non threadsafe UdfIndex is synchronized (#3486) (618aae8)
- pull queries available on
/query
rest & ws endpoint (#3820) (9a47eaf), closes #3672 #3495 - quoted identifiers for source names (#3695) (7d3cf92)
- race condition in KsStateStore (#3474) (7336389)
- Remove dependencies on test-jars from ksql-functional-tests jar (#3421) (e09d6ad)
- Remove duplicate ksql-common dependency in ksql-streams pom (#3703) (0620906)
- Remove unnecessary arg coercion for UDFs (#3595) (4c42530)
- Rename Delimiter:parse(char c) to Delimiter.of(char c) (#3433) (8716c41)
- renamed method to avoid checkstyle error (#3652) (a8a3588)
- revert ipv6 address and update docs (#3314) (0ff4a51)
- Revert named stores in expected topologies, disable naming stores from StreamJoined, re-enable join tests. (#3550) (0b8ccc1), closes #3364
- should be able to parse empty STRUCT schema (MINOR) (#3318) (a6549e1)
- Some renaming around KsqlFunction etc (#3747) (b30d965)
- standardize KSQL up-casting (#3516) (7fe8772)
- support NULL return values from CASE statements (#3531) (eb9e41b)
- support UDAFs with different intermediate schema (#3412) (70e10e9)
- switch AdminClient to be sandbox proxy (#3351) (6747d5c)
- typo in static query WHERE clause example (#3423) (7ad3248)
- Update repartition processor names for KAFKA-9098 (#3802) (2b86cd8)
- Use the correct Immutable interface (#3488) (a1096bf)
- 3356: struct rewritter missed EXPLAIN (#3398) (daf974b)
- 3441: stabilize the StaticQueryFunctionalTest (#3442) (44ae3a0), closes #3441
- 3524: improve pull query error message (#3540) (2be8385), closes #3524
- 3525: sET should only affect statements after it (#3529) (5315f1e), closes #3525
- address deprecation of getAdminClient (#3276) (6a50fca)
- error message with DROP DELETE TOPIC is invalid (#3279) (4284b8c)
- find bugs issue in KafkaTopicClientImpl (#3268) (70e880f)
- fixed how wrapped KsqlTopicAuthorizationException error messages are displayed (#3258) (63672ae)
- improve escaping of identifiers (#3295) (04435d7)
- respect reserved words in more clauses for SqlFormatter (MINOR) (#3284) (6974a80)
- schema converters should handle List and Map impl (#3290) (af779dc)
COLLECT_LIST
can now be applied to tables (#3104) (c239785)- add ksql-functional-tests to the ksql package (#3111) (9548135)
- authorization filter is logging incorrect username (#3138) (b15c6d0)
- broken build due to bad import statements (#3204) (8ec4c2b)
- check for other sources using a topic before deleting (#3070) (b3fa315)
- default timestamp extractor override is not working (#3176) (d1db07b)
- drop succeeds even if missing topic or schema (#3131) (ba03d6f)
- dummy placeholder class in ksql-execution (#3142) (c9f1cbb)
- expose user/password command line options (#3129) (1fd70fa)
- filter null entries before creating KafkaConfigStore (#3147) (2852af1)
- fix auth error message with insert values command (#3257) (abe410a)
- Implement new KIP-455 AdminClient AlterPartitionReassignments and tPartitionReassignments APIs (#3218) (d951026)
- incorrect SR authorization message is displayed (#3186) (b3b6c82)
- logicalSchema toString() to include key fields (MINOR) (#3123) (0984529)
- Remove delete.topic.enable check (#3089) (71ec1c0)
- remove non-standard JavaFx usage of Pair (#3145) (3508847)
- replace nashorn @Immutable with errorprone for JDK12 (MINOR) (#3239) (0c47a34)
- request / on makeRootRequest instead of /info (#3197) (7935488)
- the QTTs now run through SqlFormatter & other formatting fixes (#3222) (79da68c)
- use errorprone Immutable annotation instead of nashorn version (#3150) (e7f5e17)
DESCRIBE
now works for sources with decimal types (#3083) (0eaa101)- don't log out to stderr on parser errors (#3052) (29dea47)
- drop describe topic functionality (MINOR) (#3072) (1290b82)
- ensure topology generator test runs in build (#3067) (3168150)
- misplaced commas when formatting CTAS statements (#3058) (c05615d)
- remove any rowtime or rowkey columns from query schema (MINOR) (Fixes 3039) (#3043) (0346933)
- remove last of registered topics stuff from api / cli (MINOR) (#3068) (24d874c)
- sqlformatter to correctly handle describe (#3074) (8de57bd)
- Introduced
EMIT CHANGES
syntax to differentiate push queries from new pull queries. Persistent push queries do not yet require anEMIT CHANGES
clause, but transient push queries do. - the response from the RESTful API for push queries has changed: it is now a valid JSON document containing a JSON array, where each element is JSON object containing either a row of data, an error message, or a final message. The
terminal
field has been removed. - the response from the RESTful API for push queries has changed: it now returns a line with the schema and query id in a
header
field and null fields are not included in the payload. The CLI is backwards compatible with older versions of the server, though it won't output column headings from older versions. - If users are relying on the previous behaviour of uppercasing topic names, this change breaks that
- If no value is passed for the KSQL datagen option
iterations
, datagen will now produce indefinitely, rather than terminating after a default of 1,000,000 rows. - Previously CLI startup permissions were based on whether the user has access to /info, but now it's based on whether the user has access to /. This change implies that if a user has permissions to / but not /info, they now have access to the CLI whereas previously they did not.
- "SHOW TOPICS" no longer includes the "Consumers" and
"ConsumerGroups" columns. You can use "SHOW TOPICS EXTENDED" to get the
output previous emitted from "SHOW TOPICS". See below for examples.
This change splits "SHOW TOPICS" into two commands:
- "SHOW TOPICS EXTENDED", which shows what was previously shown by
"SHOW TOPICS". Sample output:
ksql> show topics extended; Kafka Topic | Partitions | Partition Replicas | Consumers | ConsumerGroups -------------------------------------------------------------------------------------------------------------------------------------------------------------- _confluent-command | 1 | 1 | 1 | 1 _confluent-controlcenter-5-3-0-1-actual-group-consumption-rekey | 1 | 1 | 1 | 1
- "SHOW TOPICS", which now no longer queries consumer groups and their
active consumers. Sample output:
ksql> show topics; Kafka Topic | Partitions | Partition Replicas --------------------------------------------------------------------------------------------------------------------------------- _confluent-command | 1 | 1 _confluent-controlcenter-5-3-0-1-actual-group-consumption-rekey | 1 | 1
- "SHOW TOPICS EXTENDED", which shows what was previously shown by
"SHOW TOPICS". Sample output:
Release notes for KSQL releases prior to ksqlDB v0.6.0 can be found at docs/changelog.rst.