Destination BigQuery: configurable value for file buffer #20975

Closed
1 of 2 tasks
ryankfu opened this issue Jan 3, 2023 · 3 comments · Fixed by #25287
Assignees
Labels
team/destinations Destinations team's backlog type/enhancement New feature or request

Comments

@ryankfu
Contributor

ryankfu commented Jan 3, 2023

Tell us about the problem you're trying to solve

Introduce a configurable value for the number of file buffers. The main premise is that, for sources with interleaved data (e.g. Change Data Capture, aka CDC), increasing the number of file buffers should improve performance by reducing buffer thrashing. This will also matter once parallel processing is introduced, since data will become interleaved at that point.
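To make the thrashing argument concrete, here is a minimal simulation sketch. It assumes one open buffer per stream and a least-recently-used eviction policy when the buffer limit is reached; both are illustrative assumptions, not a description of how the connector actually manages its buffers. The function name `count_forced_flushes` and the `table_N` stream names are hypothetical.

```python
from collections import OrderedDict


def count_forced_flushes(records, max_buffers):
    """Count how often an open buffer must be flushed early to make room.

    records: iterable of stream names, one entry per incoming record.
    max_buffers: maximum number of streams that may hold an open buffer.
    """
    open_buffers = OrderedDict()  # stream name -> number of buffered records
    forced_flushes = 0
    for stream in records:
        if stream not in open_buffers:
            if len(open_buffers) >= max_buffers:
                # Assumed policy: evict the least recently used buffer.
                open_buffers.popitem(last=False)
                forced_flushes += 1
            open_buffers[stream] = 0
        open_buffers[stream] += 1
        # Mark this stream as the most recently used.
        open_buffers.move_to_end(stream)
    return forced_flushes


# 15 streams whose records arrive round-robin, as an interleaved source might emit them.
streams = [f"table_{i}" for i in range(15)]
interleaved = [s for _ in range(100) for s in streams]

few = count_forced_flushes(interleaved, max_buffers=10)
enough = count_forced_flushes(interleaved, max_buffers=15)
print(few, enough)
```

Under these assumptions, ten buffers force a flush on nearly every record of the round-robin input, while fifteen buffers force none, which is the effect the configurable value is meant to address.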

For reference, see the PR (and the follow-up PR that fixes a bug) that introduced the same configurability for Destination Redshift. Note that the value range was selected with the understanding that Airbyte currently supports 1 GB of available memory for the destination connector. If that limit changes, the allowed range for the number of file buffers should be adjusted as well.

Describe the solution you’d like

Introduce a configurable parameter in the spec, logic to retrieve the user-configured value, guard rails that prevent the number of file buffers from exceeding a fixed limit, and tests verifying that the number of file buffers stays within this range and does not drop below the previous default of 10.

  • BigQuery
  • Snowflake
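The retrieval logic and guard rails described above could look roughly like the sketch below. The config key `file_buffer_count`, the upper limit of 50, and the function name are assumptions made for illustration; only the previous default of 10 comes from this issue. The real connector code may be structured differently.

```python
DEFAULT_FILE_BUFFER_COUNT = 10  # previous default, as stated in this issue
MAX_FILE_BUFFER_COUNT = 50      # hypothetical upper limit, chosen for illustration


def get_file_buffer_count(config):
    """Return the configured number of file buffers, clamped to a safe range.

    config: dict of user-supplied connector settings; the key
    "file_buffer_count" is a hypothetical name for the new spec parameter.
    """
    raw = config.get("file_buffer_count")
    if raw is None:
        # Parameter not set: fall back to the previous default.
        return DEFAULT_FILE_BUFFER_COUNT
    # Never drop below the old default and never exceed the fixed limit.
    return max(DEFAULT_FILE_BUFFER_COUNT, min(int(raw), MAX_FILE_BUFFER_COUNT))


print(get_file_buffer_count({}))                          # unset
print(get_file_buffer_count({"file_buffer_count": 3}))    # below the default
print(get_file_buffer_count({"file_buffer_count": 25}))   # within range
print(get_file_buffer_count({"file_buffer_count": 500}))  # above the limit
```

Clamping instead of rejecting out-of-range values is one possible design choice; a spec-level minimum and maximum that fails validation early would be an equally reasonable guard rail.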


Are you willing to submit a PR?

Nope

@ryankfu ryankfu added type/enhancement New feature or request good first issue team/destinations Destinations team's backlog labels Jan 3, 2023
@natalyjazzviolin natalyjazzviolin self-assigned this Jan 17, 2023
@ryankfu
Contributor Author

ryankfu commented Mar 30, 2023

NOTE:

  • Check whether a BigQuery PR already exists; otherwise keep this as a good first issue

PR for Snowflake here

@evantahler
Contributor

evantahler commented Apr 3, 2023

This is a great first story for Cynthia.
Here is some user context on the story.

@evantahler
Contributor

Snowflake has this now as well... so this story is just for BQ!

@evantahler evantahler changed the title Destination BigQuery/Snowflake: configurable value for file buffer Destination BigQuery: configurable value for file buffer Apr 10, 2023

5 participants