BigQuery SCD tables don't reset #15097

Closed
isaacharrisholt opened this issue Jul 28, 2022 · 1 comment
Labels
autoteam, community, team/tse (Technical Support Engineers), type/bug (Something isn't working)

Comments

@isaacharrisholt
Contributor

Environment

  • Airbyte version: 0.39.37-alpha
  • OS Version / Instance: AWS EC2 (Linux AMI)
  • Deployment: Docker
  • Destination Connector and version: BigQuery 1.1.11
  • Step where error happened: Reset

Current Behavior

When running a reset against our BigQuery tables, the _scd tables used for the 'Incremental deduped + history' mode aren't cleared. As a result, when the next sync completes, all the old data is added back to the reset tables, which can reintroduce dirty data if earlier syncs had issues.

Expected Behavior

The _scd tables should also be reset so that old history is not included in further syncs.

Steps to Reproduce

  1. Run an incremental deduped sync into BigQuery
  2. Delete/change the source data
  3. Reset the destination tables
  4. Sync again
  5. Old data will be present in BigQuery
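
To check for step 5, one option is to list the tables in the destination dataset after the reset and look for any that still carry the _scd suffix. Below is a minimal sketch, not part of Airbyte: the function names, the `my_dataset` dataset and the sample table names are hypothetical, and it only builds the SQL text rather than running it against BigQuery.

```python
def leftover_scd_tables(table_names, suffix="_scd"):
    """Return the table names that end with the SCD suffix, sorted."""
    return sorted(name for name in table_names if name.endswith(suffix))


def drop_statements(dataset, table_names):
    """Build one DROP TABLE statement per leftover SCD table.

    `dataset` is a placeholder for the destination dataset name.
    """
    return [
        f"DROP TABLE IF EXISTS `{dataset}.{name}`;"
        for name in leftover_scd_tables(table_names)
    ]


if __name__ == "__main__":
    # Hypothetical table listing taken after a reset.
    tables = ["users", "users_scd", "orders", "orders_scd"]
    for statement in drop_statements("my_dataset", tables):
        print(statement)
```

Review the generated statements before running them, since dropping the _scd tables by hand is only a workaround for the missing reset.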

Are you willing to submit a PR?

No. From this Discourse thread, it seems the same happens with Snowflake, so it's possibly a larger problem.

@marcosmarxm
Member

Duplicate of #5417

@marcosmarxm marked this as a duplicate of #5417 on Aug 1, 2022