Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalization: handle non-object top-level schemas; treat binary data as string #22165

Merged
merged 8 commits into from
Jan 31, 2023

Conversation

edgao
Copy link
Contributor

@edgao edgao commented Jan 31, 2023

old schema

{type: [object, string], properties: {...}}

new schema

{oneOf: [{type: object, properties: {...}}, {ref: string}]}

normalization doesn't like this.


also, treat binary data as string for now - this is failing in a mysql->bigquery and mssql->snowflake sync.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 31, 2023

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector or module version
  • Add changelog
  • Publish the new version

✅ Sources (0)

Connector Version Changelog Publish
  • See "Actionable Items" below for how to resolve warnings and errors.

❌ Destinations (16)

Connector Version Changelog Publish
destination-bigquery 1.2.13
destination-bigquery-denormalized 1.2.12
(diff seed version)
destination-clickhouse 0.2.2
(changelog missing)
destination-clickhouse-strict-encrypt 0.2.2 🔵
(ignored)
🔵
(ignored)
destination-jdbc 0.3.14 🔵
(ignored)
🔵
(ignored)
destination-mssql 0.1.22
destination-mssql-strict-encrypt 0.1.22 🔵
(ignored)
🔵
(ignored)
destination-mysql 0.1.20
destination-mysql-strict-encrypt 0.1.21
(mismatch: 0.1.20)
🔵
(ignored)
🔵
(ignored)
destination-oracle 0.1.19
destination-oracle-strict-encrypt 0.1.19 🔵
(ignored)
🔵
(ignored)
destination-postgres 0.3.26
destination-postgres-strict-encrypt 0.3.26 🔵
(ignored)
🔵
(ignored)
destination-redshift 0.3.56
destination-snowflake 0.4.47
destination-tidb 0.1.0
  • See "Actionable Items" below for how to resolve warnings and errors.

👀 Other Modules (1)

  • base-normalization

Actionable Items

(click to expand)

Category Status Actionable Item
Version
mismatch
The version of the connector is different from its normal variant. Please bump the version of the connector.

doc not found
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog
doc not found
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.

changelog missing
There is no chnagelog for the current version of the connector. If you are the author of the current version, please add a changelog.
Publish
not in seed
The connector is not in the seed file (e.g. source_definitions.yaml), so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that it is not a bug.

diff seed version
The connector exists in the seed file, but the latest version is not listed there. This usually means that the latest version is not published. Please use the /publish command to publish the latest version.

@octavia-squidington-iv octavia-squidington-iv added the area/documentation Improvements or additions to documentation label Jan 31, 2023
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:00 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:00 — with GitHub Actions Inactive
@edgao edgao changed the title Normalization: handle dumb top-level schemas Normalization: handle non-object top-level schemas Jan 31, 2023
@edgao edgao changed the title Normalization: handle non-object top-level schemas Normalization: handle non-object top-level schemas; treat binary data as string Jan 31, 2023
@github-actions
Copy link
Contributor

Airbyte Code Coverage

There is no coverage information present for the Files changed

Total Project Coverage 24.51%

@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:09 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:09 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:14 — with GitHub Actions Inactive
Copy link
Contributor

@gosusnp gosusnp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change about binary looks reasonable to me. About the oneOf, I'd defer to Greg or someone with more experience with json schemas / data types.

@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:14 — with GitHub Actions Inactive
@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

I'm going to publish this, I think the worst case is no worse than current behavior.

@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

/publish connector=bases/base-normalization run-tests=false

@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:28 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:29 — with GitHub Actions Inactive
@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

/publish connector=bases/base-normalization run-tests=false

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:44 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 20:44 — with GitHub Actions Inactive
@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

/publish connector=bases/base-normalization run-tests=false

🕑 Publishing the following connectors:
bases/base-normalization
https://github.com/airbytehq/airbyte/actions/runs/4058215908


Connector Did it publish? Were definitions generated?
bases/base-normalization

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@github-actions
Copy link
Contributor

github-actions bot commented Jan 31, 2023

Platform Test Results

   243 files  +  5     243 suites  +5   23m 55s ⏱️ + 8m 52s
1 661 tests +40  1 650 ✔️ +36  11 💤 +5  0  - 1 
1 680 runs  +44  1 669 ✔️ +40  11 💤 +5  0  - 1 

Results for commit cf2a30d. ± Comparison against base commit 1ebd913.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 31, 2023

Kube Test Results

  47 files  ±0    47 suites  ±0   20m 57s ⏱️ +12s
208 tests ±0  202 ✔️ ±0  6 💤 ±0  0 ±0 
226 runs  ±0  220 ✔️ ±0  6 💤 ±0  0 ±0 

Results for commit cf2a30d. ± Comparison against base commit 1ebd913.

This pull request removes 4 and adds 4 tests. Note that renamed tests count towards both.
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [1] Thread[Thread-8,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [2] Thread[Thread-9,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [3] Thread[Thread-10,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [4] Thread[Thread-11,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [1] Thread[Thread-6,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [2] Thread[Thread-7,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [3] Thread[Thread-8,5,main]
io.airbyte.workers.temporal.scheduling.ConnectionManagerWorkflowTest$FailedActivityWorkflow ‑ [4] Thread[Thread-9,5,main]

♻️ This comment has been updated with latest results.

@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

/publish connector=bases/base-normalization run-tests=false

🕑 Publishing the following connectors:
bases/base-normalization
https://github.com/airbytehq/airbyte/actions/runs/4058487740


Connector Did it publish? Were definitions generated?
bases/base-normalization

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@edgao edgao temporarily deployed to more-secrets January 31, 2023 21:27 — with GitHub Actions Inactive
@edgao edgao temporarily deployed to more-secrets January 31, 2023 21:27 — with GitHub Actions Inactive
@edgao edgao enabled auto-merge (squash) January 31, 2023 21:40
@edgao
Copy link
Contributor Author

edgao commented Jan 31, 2023

/approve-and-merge reason="OC change, ran a subset of normalizaton integration tests locally"

@octavia-approvington
Copy link
Contributor

What are we doing again?
merge and squash

@octavia-approvington octavia-approvington merged commit 8276d03 into master Jan 31, 2023
@octavia-approvington octavia-approvington deleted the edgao/fixes branch January 31, 2023 21:59
edgao added a commit that referenced this pull request Feb 2, 2023
edgao added a commit that referenced this pull request Feb 6, 2023
* Revert "Normalization: handle non-object top-level schemas; treat binary data as string (#22165)"

This reverts commit 8276d03.

* Revert "Normalization: check for ref type existence (#22161)"

This reverts commit dbe56d6.

* Revert "🎉Updated normalization to handle new datatypes (#19721)"

This reverts commit c1d7736.

* revert dest definitions

* also dockerfile

* re-add to changelog

* add comment in dockerfile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/documentation Improvements or additions to documentation normalization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants