Expose ROW_BATCH_SIZE variable to google sheets connector #15038

Closed
akulgoel96 opened this issue Jul 26, 2022 · 4 comments
Comments

@akulgoel96
Contributor

akulgoel96 commented Jul 26, 2022

Tell us about the problem you're trying to solve

Hi all, I am currently exploring Airbyte for our ETL use cases. While trying out the Google Sheets connector as a source, I kept running into rate limit errors. Keep in mind that our Google Sheets can have up to 200,000 records in a single sheet.

Describe the solution you’d like

After some digging into the codebase, I found that the chunk size (called ROW_BATCH_SIZE) is defined as a static value inside the connector. A bigger ROW_BATCH_SIZE would mean fewer requests per sync, which would get me around the rate limit issue. So I was thinking: if we expose this variable in the connector config, that would solve my problem without breaking any of the existing flows.
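A minimal sketch of the idea, not the connector's actual code: the config key `row_batch_size`, the helper names, and the default of 200 are all assumptions made up for illustration. It shows an optional config value falling back to a default, so existing configs keep working, and how the batch size determines the number of read requests.

```python
import math
from typing import Any, Iterator, Mapping, Tuple

# Placeholder default for illustration only; not taken from the connector.
DEFAULT_ROW_BATCH_SIZE = 200


def get_row_batch_size(config: Mapping[str, Any]) -> int:
    """Read the hypothetical `row_batch_size` key, falling back to the default."""
    value = config.get("row_batch_size", DEFAULT_ROW_BATCH_SIZE)
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"row_batch_size must be a positive integer, got {value!r}")
    return value


def row_ranges(total_rows: int, batch_size: int, first_row: int = 2) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) row numbers, one pair per read request.

    first_row defaults to 2 on the assumption that row 1 holds the headers.
    """
    last_row = first_row + total_rows - 1
    for start in range(first_row, last_row + 1, batch_size):
        yield start, min(start + batch_size - 1, last_row)


def request_count(total_rows: int, batch_size: int) -> int:
    """Number of read requests needed to cover total_rows."""
    return math.ceil(total_rows / batch_size)
```

Under these assumptions, a 200,000-row sheet needs 1,000 requests at a batch size of 200 but only 40 at a batch size of 5,000, and a config without the key behaves exactly as before.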

Describe the alternative you’ve considered or used

We have an internal service that currently calls the Google API and pulls all of the sheet's data at once, which does not cause any issues. However, to avoid the overhead of maintaining this service and to make the process more self-serve, we are planning to move to Airbyte.

Additional context

I am using a service account to authenticate with the Google APIs.

Are you willing to submit a PR?

Yes! I have actually made (and tested) the changes already, but I am not able to push them due to access issues, so I would need help with that as well.

@sajarin
Contributor

sajarin commented Jul 27, 2022

Hey @akulgoel96, thanks for opening this issue. You mentioned that access issues are preventing you from pushing your changes. Could you say more about how that is blocking you from making a PR? Thanks!

@akulgoel96
Contributor Author

akulgoel96 commented Jul 27, 2022

Hey @sajarin, I created a new branch locally and committed my changes to it, but when I try to push the branch to the remote, I get this error:

[image: screenshot of the error]

@sajarin
Contributor

sajarin commented Jul 27, 2022

Hey @akulgoel96, you'd have to fork the airbyte repo first and then push your local branch to your fork. After that, you can open a PR against the main repo. Instructions on how to contribute to Airbyte can be found here: https://docs.airbyte.com/contributing-to-airbyte/#areas-for-contributing

Let me know if that makes sense!

@akulgoel96
Contributor Author

akulgoel96 commented Jul 28, 2022

@sajarin I have raised a PR here: #15107. Please help review it, and let me know if I missed something, as this is my first PR.

sajarin closed this as completed Aug 3, 2022