Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to 2.2.0 #18

Merged
merged 44 commits into from
Feb 19, 2024
Merged

Conversation

JonathonO
Copy link

@JonathonO JonathonO commented Feb 8, 2024

Conflict files:

.github/workflows/IntegrationTest.yml
.github/workflows/PerfTest.yml
pom.xml
pom_confluent.xml
src/main/java/com/snowflake/kafka/connector/Utils.java
src/main/java/com/snowflake/kafka/connector/internal/InternalUtils.java
src/main/java/com/snowflake/kafka/connector/internal/SnowflakeConnectionService.java
src/main/java/com/snowflake/kafka/connector/internal/streaming/SchematizationUtils.java
src/main/java/com/snowflake/kafka/connector/records/RecordService.java

sfc-gh-alhuang and others added 30 commits August 18, 2023 12:59
…atization (snowflakedb#693)

To Handle Reserved Keywords and Special Characters in Schematization, we will convert all names to uppercase and then add double quotes, except for the names that have double quotes already. By doing this, we tell compiler to bypass the reserved keyword or special characters check, and we’re also preserving the old behavior for customers
Examples:

start -> “START”
a@b -> “A@B”
c1/C1 -> c1, C1, “C1” would all work
“ab” -> “ab”
…r Schematization (snowflakedb#730)

Before this change, every element in the array will be added as a STRING, this change preserves the old data type in the source, for example when the input is [1, 2], the ingested value will be [1, 2] now instead of ["1", "2"]

Forked from snowflakedb#727, with additional tests
There're 10 AVRO logical types listed in https://avro.apache.org/docs/1.11.0/spec.html#Logical+Types and we're able to support 4 of them, the mapping is as below:

date -> DATE
time-mills -> TIME(6)
timestamp-mills -> TIMESTAMP_NTZ(6)
decimal -> VARCHAR (We can't do NUMBER because we could have precision bigger than 36)

We can't do for the rest of 6 types because that's not supported by ConnectSchema, see code for more detail. We need to find another way to support other logical types or any sources (like Debezium).
…d of one client for multiple connectors configurations (snowflakedb#744)
sfc-gh-rcheng and others added 9 commits December 7, 2023 15:29
java.lang.NoClassDefFoundError: Could not initialize class net.snowflake.client.jdbc.internal.apache.arrow.memory.RootAllocator occurs when the JDBC driver tries to process the result of the offset migration query. This is a known long-standing issue in the JDBC driver so a workaround is introduced with this fix.
…ause non-exactly once delivery (snowflakedb#775)

Fix two exactly once delivery behavior issues with Snowpipe Streaming:

- When there're gaps between offsets due to records being put into the DLQ or NULL records being skipped, the current logic doesn't work after a channel reopening event (e.g. schema evolution) since it expects continuous offsets in order to guarantee exactly once delivery, causing ingestion to stop.
- When a flush is triggered by size or rowcount threshold, it's possible that only partial rows in the buffer are flushed and with a channel reopening event (e.g. schema evolution), currently the leftover rows in the buffer are still being considered which is causing the offset to be advanced and skip some offsets.

The fix in this PR makes sure we won't flush any leftover rows in the buffer, they will all be skipped and discarded. The next batch should start from the expected offset after the offset reset as part of the channel reopening event.
Conflicts:
	.github/workflows/IntegrationTest.yml
	.github/workflows/PerfTest.yml
	pom.xml
	pom_confluent.xml
	src/main/java/com/snowflake/kafka/connector/Utils.java
	src/main/java/com/snowflake/kafka/connector/internal/InternalUtils.java
	src/main/java/com/snowflake/kafka/connector/internal/streaming/SchematizationUtils.java
	src/main/java/com/snowflake/kafka/connector/records/RecordService.java
@JonathonO JonathonO self-assigned this Feb 8, 2024
@JonathonO
Copy link
Author

@acristu will have 2.2.0 code changes merged shortly. Can you please review this though and the features we've added?

For reference, here's the conflict file list.

We should have column ordering, debezium type handling and auto schematization changes in some of these classes.

Worth nothing Snowflake introduced special char/reserved keyword handling in this release and a new Utils method for quoting column names. Not sure if it affects your column ordering code?

The pom*.xmls also had some version changes (backwards in some cases) and a change in junit dependency. Not sure if that's an issue?

Conflicts:
.github/workflows/IntegrationTest.yml
.github/workflows/PerfTest.yml
pom.xml
pom_confluent.xml
src/main/java/com/snowflake/kafka/connector/Utils.java
src/main/java/com/snowflake/kafka/connector/internal/InternalUtils.java
src/main/java/com/snowflake/kafka/connector/internal/streaming/SchematizationUtils.java
src/main/java/com/snowflake/kafka/connector/records/RecordService.java

Conflicts:
	.github/workflows/snyk-issue.yml
	.github/workflows/snyk-pr.yml
	src/main/java/com/snowflake/kafka/connector/internal/SnowflakeConnectionService.java
	src/main/java/com/snowflake/kafka/connector/internal/streaming/SchematizationUtils.java
	src/main/java/com/snowflake/kafka/connector/records/RecordService.java
@@ -413,8 +443,9 @@ public static JsonNode convertToJson(Schema schema, Object logicalValue) {
ISO_DATE_TIME_FORMAT.get().format((java.util.Date) value));
}
if (schema != null && Time.LOGICAL_NAME.equals(schema.name())) {
return JsonNodeFactory.instance.textNode(
TIME_FORMAT.get().format((java.util.Date) value));
ThreadLocal<SimpleDateFormat> format =
Copy link

@acristu acristu Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonathonO this looks bad to me ... why instantiate a ThreadLocal as a local variable inside a function ? but I don't think it does any harm, just wasteful ... assume this comes from upstream ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acristu yes, this comes from upstream.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps raise an issue ticket to discuss?

@JonathonO JonathonO mentioned this pull request Feb 8, 2024
@JonathonO
Copy link
Author

@acristu any issue with merging this now?

@JonathonO JonathonO merged commit 92b4b8f into streamkap-main Feb 19, 2024
0 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.