Generalize schema retriever #291
Conversation
// SMT RegexTransformation replaces the topic with <Dataset>=<TableName>
String dataset = record.topic().split("=")[0];
String tableName = record.topic().split("=")[1];
Is there heavy demand for dynamic routing to different datasets? An alternative here could be to allow the user to configure a single static dataset in the connector and, if multiple datasets are desired, use multiple connector instances.
Or, for something in the middle, we can allow the user to specify a static dataset as a "default" in case the record topic doesn't use the <dataset>=<table>
syntax.
What do you think?
This sounds good to me
I agree with @C0urante. This part feels a little over-engineered to me. Some notes:
- Allow users to set a default dataset.
- Multiple identical string splits are wasteful (slow) operations. Just split once.
- Nit: `=` as a divider is confusing to me for the dataset/table splitter. Something like `:` seems better--it's used for splitting Java classpaths, for example. `:` is a reserved char in both Kafka topics and BQ dataset/tables (I think), so it should be safe.
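A minimal sketch of the splitting discussed above, assuming a hypothetical `:` delimiter and a `defaultDataset` fallback (both names are illustrative, not the connector's actual config):

```java
import java.util.AbstractMap;
import java.util.Map;

public class TopicRouting {
    /**
     * Resolves a topic name of the form "<dataset>:<table>" into a
     * (dataset, table) pair. Topics without a delimiter fall back to
     * the configured default dataset. The topic is split only once,
     * avoiding the repeated split() calls from the original snippet.
     */
    static Map.Entry<String, String> resolve(String topic, String defaultDataset) {
        // Split at most once on the ':' delimiter
        String[] parts = topic.split(":", 2);
        if (parts.length == 2) {
            return new AbstractMap.SimpleEntry<String, String>(parts[0], parts[1]);
        }
        return new AbstractMap.SimpleEntry<String, String>(defaultDataset, topic);
    }
}
```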
Will update accordingly
// Dynamic update shall not require connector restart and shall compute table id in runtime.
if (!topicsToBaseTableIds.containsKey(record.topic())) {
  TopicToTableResolver.updateTopicToTable(config, record.topic(), topicsToBaseTableIds);
private void ensureExistingTable(TableId table) {
We could rename this to `maybeEnsureExistingTable` and then do the check for `config.getBoolean(config.TABLE_CREATE_CONFIG)` here instead of putting that burden on the caller.
Yes. Will make this change.
BucketInfo bucketInfo = BucketInfo.of(bucketName);
bucket = gcs.create(bucketInfo);
}
else throw new ConfigException("Bucket does not exist. Set "+ config.AUTO_CREATE_BUCKET_CONFIG + " to true");
Suggestion:
else throw new ConfigException("Bucket does not exist. Set "+ config.AUTO_CREATE_BUCKET_CONFIG + " to true");
else throw new ConnectException("Bucket '" + bucketName + "' does not exist; create the bucket manually, or set '" + config.AUTO_CREATE_BUCKET_CONFIG + "' to true.");
Nit: wrap all blocks in braces, i.e. `else { ... }`.
Nit: `" + ...` (add a space before the plus).
for (SinkRecord record: records) {
  Schema kafkaValueSchema = schemaRetriever.retrieveValueSchema(record);
  Schema kafkaKeySchema = kafkaKeyFieldName.isPresent() ? schemaRetriever.retrieveKeySchema(record) : null;
  tableDescription = (kafkaValueSchema.doc() != null) ? kafkaValueSchema.doc() : tableDescription;
Nit: don't need the parentheses here
tableDescription = (kafkaValueSchema.doc() != null) ? kafkaValueSchema.doc() : tableDescription;
tableDescription = kafkaValueSchema.doc() != null ? kafkaValueSchema.doc() : tableDescription;
Not quite sure how to think about this, but in cases where sink records don't all have the same docstring, the last one will be used. Not something I think we need to worry about, but wanted to call it out.
public static final Boolean AUTO_CREATE_BUCKET_DEFAULT = true;
private static final ConfigDef.Importance AUTO_CREATE_BUCKET_IMPORTANCE = ConfigDef.Importance.MEDIUM;
private static final String AUTO_CREATE_BUCKET_DOC =
    "Whether to automatically create the given bucket, if it does not exist";
This only applies when GCS batch loading is enabled, right? If so, can we briefly clarify that here?
Yes! Will update the doc.
public static final Boolean ADD_NEW_BQ_FIELDS_DEFAULT = false;
private static final ConfigDef.Importance ADD_NEW_BQ_FIELDS_IMPORTANCE = ConfigDef.Importance.MEDIUM;
private static final String ADD_NEW_BQ_FIELDS_DOC =
    "If true, new fields can be added to BigQuery tables during subsequent schema updates";
How is this different from the existing schema update behavior?
I think we should eliminate the existing `autoUpdateSchemas` and just have `autoCreateBucket` and `addNewBigQueryFields`.
Bike shedding, but I think `allowNewBigQueryFields` is more explicit.
Ah, gotcha. So really we're replacing `autoUpdateSchemas` with a combination of `addNewBigQueryFields` and `changeRequiredFieldsToNullable`, right? That should cover the only two schema evolutions based on the BigQuery docs.
Will the unionization logic be configurable? Think we might want to make it toggle-able and even off by default, since it's a bit of a footgun if you accidentally send even just one bad record to the connector (a bunch of unnecessary columns will get added to your table and BigQuery doesn't allow deletion of columns, so you're pretty much stuck with them unless you want to delete and then recreate the table).
Yea, agree. It should be configurable, and off by default.
Hmm. Thought on this a bit more. I believe the update logic is configurable and off by default now.
We have `allowNewBigQueryFields` and `allowBigQueryRequiredFieldRelaxation` now. These both default to false. I think these effectively cover the unionization logic, since the check is:
if (!currentFields.containsKey(entry.getKey())) {
  if (allowNewBQFields && (entry.getValue().getMode().equals(Field.Mode.NULLABLE)
      || (entry.getValue().getMode().equals(Field.Mode.REQUIRED) && allowBQRequiredFieldRelaxation))) {
    ...
} else {
  if (currentFields.get(entry.getKey()).getMode().equals(Field.Mode.REQUIRED) && newFields.get(entry.getKey()).getMode().equals(Field.Mode.NULLABLE)) {
    if (allowBQRequiredFieldRelaxation) {
      currentFields.put(entry.getKey(), entry.getValue().toBuilder().setMode(Field.Mode.NULLABLE).build());
      ...
Since both default to false, schemas won't get updated by default.
@criccomini @stoynov96 @sahithi03 sorry to dredge this up again, but I think we might want to reconsider some of the logic here.
The prior behavior of the connector was to basically send a single record's schema to BigQuery and let validation happen there; the only permitted operations were (and still are) adding new columns to a table, and relaxing existing columns from `REQUIRED` to `NULLABLE`. This meant that it was possible to relax required fields to nullable, but only if there was a corresponding upstream schema change.
The new behavior of the connector still catches this case, but also automatically relaxes `REQUIRED` fields in the existing table schema to `NULLABLE` if they're missing from the most recent upstream schema.
This is risky, since it means that a single misplaced record with a completely disjoint schema from the existing table schema can cause permanent modifications to be made to the BigQuery table schema. Granted, this would require `allowNewBQFields` and `allowBQRequiredFieldRelaxation` to both be set to `true`, but it's not unreasonable for people to want to enable both with the expectation that they would cause the connector to act in the same way as it would have with `autoUpdateSchemas`.
I think we might still want to add a third config property, `allowSchemaUnionization`, that toggles the schema unionization behavior. If it's set to `false` and both `allowNewBQFields` and `allowBQRequiredFieldRelaxation` are set to `true`, then the prior behavior of the `autoUpdateSchemas` property should be preserved effectively for users who still want that.
WDYT?
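The interaction of the flags discussed in this thread could be sketched with plain types standing in for the BigQuery classes (the `allowSchemaUnionization` behavior and the merge rules below illustrate the proposal, not the connector's actual code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaUpdateSketch {
    enum Mode { REQUIRED, NULLABLE }

    /**
     * Merges an incoming record schema into the current table schema.
     * - New fields are added only when allowNewFields is set; fields the
     *   record declares REQUIRED additionally need allowRelaxation, since
     *   BigQuery can only add new columns as NULLABLE.
     * - Fields the incoming schema relaxes to NULLABLE are relaxed only
     *   when allowRelaxation is set.
     * - Existing REQUIRED fields *missing* from the incoming schema are
     *   relaxed only when allowUnionization is also set; with it off,
     *   the prior autoUpdateSchemas-style behavior is preserved.
     */
    static Map<String, Mode> merge(Map<String, Mode> current,
                                   Map<String, Mode> incoming,
                                   boolean allowNewFields,
                                   boolean allowRelaxation,
                                   boolean allowUnionization) {
        Map<String, Mode> result = new LinkedHashMap<>(current);
        for (Map.Entry<String, Mode> e : incoming.entrySet()) {
            if (!result.containsKey(e.getKey())) {
                if (allowNewFields && (e.getValue() == Mode.NULLABLE
                        || (e.getValue() == Mode.REQUIRED && allowRelaxation))) {
                    result.put(e.getKey(), Mode.NULLABLE); // new columns are added as NULLABLE
                }
            } else if (result.get(e.getKey()) == Mode.REQUIRED
                    && e.getValue() == Mode.NULLABLE
                    && allowRelaxation) {
                result.put(e.getKey(), Mode.NULLABLE);
            }
        }
        if (allowUnionization && allowRelaxation) {
            // Unionization: relax REQUIRED fields absent from the incoming schema
            for (Map.Entry<String, Mode> e : result.entrySet()) {
                if (!incoming.containsKey(e.getKey()) && e.getValue() == Mode.REQUIRED) {
                    e.setValue(Mode.NULLABLE);
                }
            }
        }
        return result;
    }
}
```

With `allowUnionization` off, a disjoint record can add columns but can never silently relax existing required columns, which is the footgun called out above.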
@@ -0,0 +1,28 @@
package com.wepay.kafka.connect.bigquery.retrieve;
Don't forget the copyright header :)
(On all new files)
transforms=RegexTransformation
transforms.RegexTransformation.type=org.apache.kafka.connect.transforms.RegexRouter
transforms.RegexTransformation.regex=.*
transforms.RegexTransformation.replacement=$0
What's this supposed to do? AFAICT it's going to grab the entire topic (matched by `.*`), and then replace it with the entire match (`$0`)... so will this have any effect on the records here?
No, this will not have any effect on the records. The regex was just meant to be an example/placeholder :)
Ah, gotcha. In that case, can we do one of the following (no strong preference on my part for/against any of them):
- Add a comment explaining what this is and how someone might alter it to achieve table routing behavior
- Remove it
- Alter it to provide a more practical example (such as appending or stripping a prefix or suffix, or redirecting all records from one topic to a different topic but leaving all others unaffected)
I vote for:
Alter it to provide a more practical example (such as appending or stripping a prefix or suffix, or redirecting all records from one topic to a different topic but leaving all others unaffected)
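For example, a more practical placeholder along the lines of the last option might strip a hypothetical `kcbq-` prefix from topic names before routing (the prefix is purely illustrative):

```properties
transforms=RegexTransformation
transforms.RegexTransformation.type=org.apache.kafka.connect.transforms.RegexRouter
# Strip a "kcbq-" prefix: records from topic "kcbq-logins" are routed to table "logins"
transforms.RegexTransformation.regex=kcbq-(.*)
transforms.RegexTransformation.replacement=$1
```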
(Resolved conversation on ...egration-test/java/com/wepay/kafka/connect/bigquery/it/BigQueryConnectorIntegrationTest.java)
@@ -176,8 +173,15 @@ private Object convertField(Field fieldSchema, FieldValue field) {
List<Object> result = new ArrayList<>();
assert (rowSchema.size() == row.size());

for (int i=0; i < rowSchema.size(); i++) {
Nit: `i = 0` with spaces.
Bucket bucket = gcs.get(bucketName);
if (bucket != null) {
  logger.info("Deleting objects in the Bucket {}", bucketName);
  for (Blob blob: bucket.list().iterateAll()) {
Nit: `blob :` (add a space before the colon).
@@ -176,8 +173,15 @@ private Object convertField(Field fieldSchema, FieldValue field) {
List<Object> result = new ArrayList<>();
assert (rowSchema.size() == row.size());

for (int i=0; i < rowSchema.size(); i++) {
  if (rowSchema.get(i).getName().equals("row")) {
I'm confused. Why is this done twice, once with `!` and then the subsequent loop without?
The schema unionization logic creates fields in the BigQuery table which might not follow the same order as the fields in the schema.json file. The integration tests compare the actual and expected rows based on `testRow.get(0)`, which is supposed to be the `"row"` field. But since the fields would now be shuffled, I tried to add the `"row"` field values first in the `testRow` list to make sure that it is the first value. Then in the subsequent loop I skipped the `"row"` field values, so that these are not entered twice in the list. But I would want to know if there's a better way.
Not sure how generalized `indexOf()` can be in Java; can we use this to make this cleaner?
Edit: looks like we can't specify a comparator for it actually.
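One possible alternative to the two-pass approach: sort the schema/value pairs by field name before comparing, so the column order BigQuery returns no longer matters. A self-contained sketch with plain lists standing in for the BigQuery schema and row types:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RowOrdering {
    /**
     * Pairs each field name with its value and sorts the pairs by name,
     * so expected and actual rows can be compared element-by-element
     * regardless of the order the columns come back in.
     */
    static List<Object> valuesSortedByFieldName(List<String> fieldNames, List<?> values) {
        List<Map.Entry<String, Object>> pairs = new ArrayList<>();
        for (int i = 0; i < fieldNames.size(); i++) {
            pairs.add(new AbstractMap.SimpleEntry<String, Object>(fieldNames.get(i), values.get(i)));
        }
        // Deterministic order: sort by field name
        pairs.sort((a, b) -> a.getKey().compareTo(b.getKey()));
        List<Object> result = new ArrayList<>();
        for (Map.Entry<String, Object> p : pairs) {
            result.add(p.getValue());
        }
        return result;
    }
}
```

Applying this to both the expected and actual rows would replace the special-casing of the `"row"` field with a single deterministic ordering.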
(Resolved conversation on ...ector/src/integration-test/java/com/wepay/kafka/connect/bigquery/it/utils/BucketClearer.java)
private RecordConverter<Map<String, Object>> recordConverter;

public SinkRecordConverter(BigQuerySinkTaskConfig config) {
  this.config = config;
Would prefer this class takes recordConverter, boolean sanitizeFieldName, (optional) keyName, (optional) dataName. This makes the class cleaner to mock.
Will update accordingly.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.*;
Wildcard again
@@ -100,11 +98,11 @@ public GCSToBQWriter(Storage storage,
 * @param blobName the name of the GCS blob to write.
 * @throws InterruptedException if interrupted.
 */
public void writeRows(List<RowToInsert> rows,
// writeRows -> needs to be changed
?
oops, left this here by mistake. Will remove it.
schemaManager.createTable(tableId, topic);
logger.info("Table {} does not exist, auto-created table for topic {}", tableId, topic);
schemaManager.createTable(tableId, records);
logger.info("Table {} does not exist, auto-created table ", tableId);
Log message should come before schemaManager.createTable
will update it.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.*;
Wildcard
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.*;
Wildcard
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.*;
Wildcard
try {
  while (currentIndex < rows.size()) {
    List<RowToInsert> currentBatch =
        rows.subList(currentIndex, Math.min(currentIndex + currentBatchSize, rows.size()));
List<Map.Entry<SinkRecord,RowToInsert>> currentBatchList =
Nit: `SinkRecord, RowToInsert` (add a space after the comma).
(Resolved conversation on ...nnector/src/main/java/com/wepay/kafka/connect/bigquery/write/row/AdaptiveBigQueryWriter.java)
Codecov Report
@@              Coverage Diff                        @@
##     generalization-feature     #291      +/-     ##
============================================================
- Coverage          70.87%     66.10%    -4.77%
+ Complexity        301        267       -34
============================================================
  Files             32         32
  Lines             1538       1484      -54
  Branches          164        152       -12
============================================================
- Hits              1090       981       -109
- Misses            390        450       +60
+ Partials          58         53        -5
bufferSize=100000
maxWriteSize=10000
tableWriteWait=1000

transforms=RegexTransformation
transforms.RegexTransformation.type=org.apache.kafka.connect.transforms.RegexRouter
# .* is a placeholder for the actual regex
Maybe rephrase as something like "A placeholder regex router SMT that does nothing. Replace with relevant transformation SMT"
########################################### Fill me in! ###########################################
# The name of the BigQuery project to write to
project=
# The name of the BigQuery dataset to write to (leave the '.*=' at the beginning, enter your
# dataset after it)
datasets=.*=
# The location of a BigQuery service account JSON key file
keyfile=
nit: newline at the end of the file
import java.util.List;
import java.util.Properties;

nit: redundant new line
private void maybeEnsureExistingTable(TableId table) {
  BigQuery bigQuery = getBigQuery();
  if (bigQuery.getTable(table) == null && !config.getBoolean(config.TABLE_CREATE_CONFIG)) {
    logger.warn("You may want to enable auto table creation by setting {}=true in the properties file",
Can we append this warning to the exception message instead?
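A sketch of folding the hint into the exception message itself, so the caller throws one self-explanatory error instead of logging a separate warning (the helper and config key names here are illustrative):

```java
public class MissingTableError {
    /**
     * Builds a single error message that both reports the missing table
     * and tells the user how to enable automatic table creation.
     */
    static String missingTableMessage(String table, String tableCreateConfigKey) {
        return "Table '" + table + "' does not exist; create it manually, or set '"
            + tableCreateConfigKey + "=true' to enable automatic table creation.";
    }
}
```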
tableName = smtReplacement[0];
} else {
  throw new ConfigException("Incorrect regex replacement format. " +
      "SMT replacement should either follow <dataset>:<tableName> format or replace by <tableName> only.");
This message could be a bit unclear. I think it'd be clearer as: `SMT replacement should either produce the <dataset>:<tableName> format or just the <tablename> format`.
// package private for testing.
TableInfo constructTableInfo(TableId table, Schema kafkaKeySchema, Schema kafkaValueSchema) {
  com.google.cloud.bigquery.Schema bigQuerySchema = getBigQuerySchema(kafkaKeySchema, kafkaValueSchema);
private TableInfo getTableInfo(TableId table, Set<SinkRecord> records) {
Could you add some Javadocs on these new private methods? I think they're sufficiently complicated to warrant some docs.
Sure. Will update.
@@ -39,6 +41,6 @@ transforms.RegexTransformation.replacement=$0
project=
# The name of the BigQuery dataset to write to (leave the '.*=' at the beginning, enter your
This comment seems to be outdated now. We should update it
Will update
* Change Bucket Clearer
* Add RegexRouter Transform
* Change topicToTableResolver Methods
* Change EnsureExistingTables Method
* Change SchemaRegistrySchemaRetriever
* Add RecordNameStrategy
* Change standalone properties
* Delete TopicToTableResolver and TopicToTableResolverTest class
* Generalise schemaRetriever interface
* Change schemaRegistrySchemaRetriever
* Change TableClearer Class to clear all tables in a Dataset
* Add IdentitySchemaRetriever Class
* Change BigQuerySinkTask
* Remove TOPIC_TO_TABLES config from BigQuerySinkConfig class
* Modify BigQuerySinkConnectorTest and BigQuerySinkTaskTest
* Delete SchemaRegistrySchemaRetriever Class
* Delete MemorySchemaRetriever Class
* Change Writer Classes
* Modify bigQuerySinkTaskTest
* Change SchemaManagerTest Class
* Modify SMT

Co-authored-by: Sahithi Reddy Velma <[email protected]>
Changes made according to the Github issue - #245