PROD-39429 Add parameter for connector name in channel name: #732

Merged
sfc-gh-japatel merged 6 commits into master from japatel-PROD-39429-connector-name-channel-name
Oct 23, 2023

Conversation

Collaborator

@sfc-gh-japatel sfc-gh-japatel commented Oct 22, 2023

PROD-39429

Description

This commit adds a parameter on top of 3bf9106 and also reverts that commit's default behavior.
That is, once this commit is on the release branch, the channel name is determined the old way again (it contains only the topic name and partition number).

Users must set the parameter snowflake.enable.new.channel.name.format to true to have the connector name in their channel names.
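
As a minimal sketch of how such a flag could be read, the snippet below parses the parameter from a raw config map and falls back to false when it is absent. The parameter name and the false default come from this PR; the class name and the helper `useConnectorNameInChannelName` are hypothetical and are not the connector's actual code.

```java
import java.util.HashMap;
import java.util.Map;

class ChannelNameFormatConfig {
  // Parameter name agreed on in this PR.
  static final String ENABLE_NEW_CHANNEL_NAME_FORMAT = "snowflake.enable.new.channel.name.format";
  // Default is false, per the PR discussion.
  static final boolean ENABLE_NEW_CHANNEL_NAME_FORMAT_DEFAULT = false;

  /** Hypothetical helper: reads the flag from a raw config map, using the default when unset. */
  static boolean useConnectorNameInChannelName(Map<String, String> config) {
    String raw = config.get(ENABLE_NEW_CHANNEL_NAME_FORMAT);
    return raw == null
        ? ENABLE_NEW_CHANNEL_NAME_FORMAT_DEFAULT
        : Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    // Parameter not set: old channel name format (topic and partition only).
    System.out.println(useConnectorNameInChannelName(config));

    // Parameter set to true: connector name is included in the channel name.
    config.put(ENABLE_NEW_CHANNEL_NAME_FORMAT, "true");
    System.out.println(useConnectorNameInChannelName(config));
  }
}
```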

Scenarios for users reading this commit:

  1. For customers on 2.1.0, where this change was introduced, we ask you to be cautious and look for duplicates. The chance of duplicates is very slim, but they are still possible. We encourage you to upgrade to the next patch version, which includes this revert, and set the parameter to true to keep 2.1.0's behavior.
  2. For users below 2.1.0, you don't have to do anything except avoid 2.1.0 :)

Why

TLDR:
Commit 3bf9106, which was released in v2.1.0, is not compatible with upgrades from older versions to 2.1.0, because it changes the channel name and the offset information is lost in the new channels.
Longer Description:

Assumptions: TopicName = HALLOWEEN, partitions = 2, connector_name = connector_name1
1. Channels HALLOWEEN_1 and HALLOWEEN_2 have offset tokens 10 and 12, respectively, in Snowflake, but the last committed offsets in Kafka are 8 and 10.
2. We turn off the connector, upgrade the jar, and run KC version 2.1.0 (the one with the commit).
3. The new version creates connector_name1_HALLOWEEN_1 and connector_name1_HALLOWEEN_2.
4. These two channels have null offset tokens. Since the last committed offsets in Kafka are 8 and 10, the Kafka connector accepts those offsets, because the channel offsets are null on Snowflake.
5. This can produce duplicates.
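
The scenario above can be sketched as a small simulation. This is a simplified model, not the connector's real offset handling. It assumes that an offset token is the last offset already persisted in Snowflake, that the Kafka committed offset is the next offset Kafka would deliver, and that a channel with a null token resumes from the Kafka committed offset. The class and the helpers `firstOffsetToIngest` and `duplicatedRecords` are hypothetical; only the channel name logic mirrors the code in this PR.

```java
import java.util.HashMap;
import java.util.Map;

class ChannelRenameDuplicates {
  // Mirrors the channel name logic under review in this PR.
  static String channelName(
      String connectorName, String topic, int partition, boolean useConnectorName) {
    return useConnectorName
        ? connectorName + "_" + topic + "_" + partition
        : topic + "_" + partition;
  }

  // Assumed resume rule: continue after the channel's offset token if one exists,
  // otherwise fall back to the offset committed in Kafka.
  static long firstOffsetToIngest(Map<String, Long> tokens, String channel, long kafkaCommitted) {
    Long token = tokens.get(channel);
    return token == null ? kafkaCommitted : token + 1;
  }

  // Number of already persisted records that would be ingested again.
  static long duplicatedRecords(long lastPersisted, long firstOffsetToIngest) {
    return Math.max(0L, lastPersisted + 1 - firstOffsetToIngest);
  }

  public static void main(String[] args) {
    // State before the upgrade: tokens exist only under the old channel names.
    Map<String, Long> tokens = new HashMap<>();
    tokens.put("HALLOWEEN_1", 10L);
    tokens.put("HALLOWEEN_2", 12L);
    long[] kafkaCommitted = {8L, 10L}; // for partitions 1 and 2

    for (boolean newFormat : new boolean[] {false, true}) {
      for (int partition = 1; partition <= 2; partition++) {
        String channel = channelName("connector_name1", "HALLOWEEN", partition, newFormat);
        long start = firstOffsetToIngest(tokens, channel, kafkaCommitted[partition - 1]);
        long dups = duplicatedRecords(tokens.get("HALLOWEEN_" + partition), start);
        System.out.println(channel + " resumes at " + start + ", re-ingests " + dups + " record(s)");
      }
    }
  }
}
```

Under these assumptions, the old channel names resume after their tokens and re-ingest nothing, while the renamed channels find no token and resume from the older Kafka offsets.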

Plan

  • Patch-release a new version with this commit.
  • Existing 2.1.0 users will have to set the parameter to true. (We will reach out to all those users.)
  • The parameter defaults to false.
  • Long-term fix: a smooth transition between the old and new versions that transfers offsets from the old channel names to the new channel names.

Tests

Added tests using ParameterizedTest feature.

Contributor

@sfc-gh-tzhang sfc-gh-tzhang left a comment

Left some comments, PTAL, and thanks for the quick change!

@@ -168,6 +168,18 @@ public class SnowflakeSinkConnectorConfig {
"Whether to optimize the streaming client to reduce cost. Note that this may affect"
+ " throughput or latency and can only be set if Streaming Snowpipe is enabled";

public static final String ENABLE_CONNECTOR_NAME_IN_STREAMING_CHANNEL_NAME =
"enable.connector_name.in.streaming_channel_name";
Contributor

This is a Snowflake config; like most of the Snowflake-specific configs, let's prefix it with snowflake?

Contributor

And I suggest we make it a more general config, something like snowflake.enable.new.channel.format

Collaborator Author

Ack, like that suggestion, let me do that!

Collaborator

I like it too. We should have done the same for one client optimization but too late now

Collaborator Author

I have used snowflake.enable.streaming.channel.format.v2
Please let me know if you all like this instead!
(Chances are we will not change it ever and hence I went with v2)

Collaborator Author

Alright, this proves again that naming is the toughest task in SWE, so living up to that.
We have agreed on snowflake.enable.new.channel.name.format

Will make changes in description.

@sfc-gh-japatel sfc-gh-japatel force-pushed the japatel-PROD-39429-connector-name-channel-name branch from 120233d to 1684a0d Compare October 23, 2023 06:04
@@ -168,6 +168,20 @@ public class SnowflakeSinkConnectorConfig {
"Whether to optimize the streaming client to reduce cost. Note that this may affect"
+ " throughput or latency and can only be set if Streaming Snowpipe is enabled";

public static final String SNOWFLAKE_ENABLE_STREAMING_CHANNEL_FORMAT_V2 =
"snowflake.enable.streaming.channel.format.v2";
public static final boolean SNOWFLAKE_ENABLE_STREAMING_CHANNEL_FORMAT_V2_DEFAULT = false;
Collaborator

are we doing default false or default true?

Collaborator

default false ensures 2.0.1 or below upgrade is safe.
default true ensures 2.1.0 upgrade is safe.

default false makes sense to me, but we need to make sure the 2.1.0 upgrade scenario is well tested.
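
The trade-off between the two defaults can be checked with a short sketch. It assumes an upgrade is "safe" when the upgraded connector computes the same channel name the previous version used, so the existing offset tokens are found. The class and the helper `upgradeReusesChannel` are hypothetical, and the topic and connector names are the ones from the PR description.

```java
class DefaultChoice {
  // Same channel name logic as the method under review in this PR.
  static String channelName(
      String connectorName, String topic, int partition, boolean useConnectorName) {
    return useConnectorName
        ? connectorName + "_" + topic + "_" + partition
        : topic + "_" + partition;
  }

  // True when the channel name after the upgrade matches the one used before it.
  static boolean upgradeReusesChannel(boolean previousUsedConnectorName, boolean flagAfterUpgrade) {
    String before = channelName("connector_name1", "HALLOWEEN", 1, previousUsedConnectorName);
    String after = channelName("connector_name1", "HALLOWEEN", 1, flagAfterUpgrade);
    return before.equals(after);
  }

  public static void main(String[] args) {
    boolean from201OrBelow = false; // 2.0.1 and below: topic_partition
    boolean from210 = true;         // 2.1.0: connectorName_topic_partition

    for (boolean flag : new boolean[] {false, true}) {
      System.out.println(
          "flag=" + flag
              + " | upgrade from <=2.0.1 reuses channel: " + upgradeReusesChannel(from201OrBelow, flag)
              + " | upgrade from 2.1.0 reuses channel: " + upgradeReusesChannel(from210, flag));
    }
  }
}
```

With the flag left at false, only upgrades from 2.0.1 or below keep their channel names, which is why 2.1.0 users are asked to set it to true explicitly.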

Contributor

default should be false, and we will include the communication on how 2.1.0 customers will do the upgrade (they have the choice of making it true, or using false at their own risk by stopping the ingestion and waiting for everything to be committed)

Default should be false.

Collaborator

@sfc-gh-rcheng sfc-gh-rcheng left a comment

Overall lgtm, let's wait on toby and xin's approvals too.

Have you gotten a repro of the upgrade issue and confirmed that this change fixes it for them?

@@ -527,11 +556,20 @@ public Optional<MetricRegistry> getMetricRegistry(String partitionChannelKey) {
* or PROD)
* @param topic topic name
* @param partition partition number
* @param shouldUseConnectorNameInChannelName If true, use connectorName, else not
Collaborator

nit: add jira number or pr number in case we plan to revert this change in the future

Collaborator Author

We don't have plans to revert it for now!

Contributor

@sfc-gh-tzhang sfc-gh-tzhang left a comment

lgtm, thanks!

import static com.snowflake.kafka.connector.SnowflakeSinkConnectorConfig.INGESTION_METHOD_OPT;
import static com.snowflake.kafka.connector.SnowflakeSinkConnectorConfig.KEY_CONVERTER_CONFIG_FIELD;
import static com.snowflake.kafka.connector.SnowflakeSinkConnectorConfig.VALUE_CONVERTER_CONFIG_FIELD;
import static com.snowflake.kafka.connector.SnowflakeSinkConnectorConfig.*;
Contributor

nit: No star import

Collaborator Author

Ack!

Collaborator Author

sfc-gh-japatel commented Oct 23, 2023

Overall lgtm, let's wait on toby and xin's approvals too.

Have you gotten a repro of the upgrade issue and confirmed that this change fixes it for them?

The repro is not done yet since it is non-trivial. Also, this change does not by itself guarantee a fix for existing 2.1.0 customers; they will have to use the configuration and set it to true.

This revert and parameter are meant to reduce the blast radius and keep us out of unwanted territory. (I will modify the description.)

Reproducing this, along with a blast radius analysis, is going to be my next priority.

Comment on lines 568 to 572
if (shouldUseConnectorNameInChannelName) {
return connectorName + "_" + topic + "_" + partition;
} else {
return topic + "_" + partition;
}

Nit, this can be reduced to:

return shouldUseConnectorNameInChannelName ? 
    connectorName + "_" + topic + "_" + partition :
    topic + "_" + partition;

@sfc-gh-tjones sfc-gh-tjones left a comment

LGTM barring the outstanding comments others have made.

@sfc-gh-japatel sfc-gh-japatel merged commit feaa491 into master Oct 23, 2023
31 of 32 checks passed
@sfc-gh-japatel sfc-gh-japatel deleted the japatel-PROD-39429-connector-name-channel-name branch October 23, 2023 22:22
sfc-gh-japatel added a commit that referenced this pull request Oct 26, 2023
khsoneji pushed a commit to confluentinc/snowflake-kafka-connector that referenced this pull request Dec 4, 2023
khsoneji pushed a commit to confluentinc/snowflake-kafka-connector that referenced this pull request Dec 4, 2023
khsoneji pushed a commit to confluentinc/snowflake-kafka-connector that referenced this pull request Dec 4, 2023
EduardHantig pushed a commit to streamkap-com/snowflake-kafka-connector that referenced this pull request Feb 1, 2024
sudeshwasnik pushed a commit to confluentinc/snowflake-kafka-connector that referenced this pull request Feb 16, 2024