Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Ensure raw_schema in stream mapper is immutable #1962

Merged
merged 1 commit into from
Sep 15, 2023

Conversation

pnadolny13
Copy link
Contributor

@pnadolny13 pnadolny13 commented Sep 15, 2023

Closes #1961

When trying to figure out how to fix #1961 I realized it was caused by the raw_schema in the mapper being a shared object with the transformed_schema which is passed to get_sink then add_sink, ultimately to the sink itself where SDC columns are added. The rest of the codebase expects transformed_schema to be mutable but raw_schema to be immutable (or at least it reads that way to me). They both share the same reference to the original schema dict so assumptions break. The main case I ran into was in #1961 where target-postgres was constantly creating new sink instances for me when the schema wasnt actually changing.

@codecov
Copy link

codecov bot commented Sep 15, 2023

Codecov Report

Merging #1962 (9197015) into main (97d97de) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1962   +/-   ##
=======================================
  Coverage   87.08%   87.08%           
=======================================
  Files          59       59           
  Lines        5112     5112           
  Branches      827      827           
=======================================
  Hits         4452     4452           
  Misses        465      465           
  Partials      195      195           
Files Changed Coverage Δ
singer_sdk/mapper.py 80.67% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@edgarrmondragon
Copy link
Collaborator

This reduced running time from 15s to 3s in my very simple load test 😮

@edgarrmondragon edgarrmondragon changed the title fix: raw_schema expected to be immutable but isnt perf: Ensure raw_schema in stream mapper is immutable Sep 15, 2023
@edgarrmondragon edgarrmondragon merged commit 0da7a56 into main Sep 15, 2023
29 checks passed
@edgarrmondragon edgarrmondragon deleted the fix_raw_schema_transformed branch September 15, 2023 23:01
@edgarrmondragon
Copy link
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: schema comparison before adding metadata columns
2 participants