airbyte-cdk offset pagination strategy: page_size to be interpolated … #19646

Merged
roman-yermilov-gl merged 6 commits into master from ryermilov/airbyte-cdk-offset-pagination
Dec 6, 2022

Conversation

roman-yermilov-gl
Contributor

@roman-yermilov-gl roman-yermilov-gl commented Nov 21, 2022

It often happens that page_size needs to be specified per stream. With these changes, we can set page_size in the stream options while defining the pagination strategy only once:

page_size: "{{ options['items_per_page'] }}"

@roman-yermilov-gl roman-yermilov-gl requested a review from a team as a code owner November 21, 2022 10:24
@octavia-squidington-iv octavia-squidington-iv added the CDK Connector Development Kit label Nov 21, 2022
Contributor

@sherifnada sherifnada left a comment

@roman-yermilov-gl could you indicate why you're making this change, include a snippet of how this changes the input YAML, and add tests that prove the new functionality works? It's hard to give useful feedback on a PR without that information :)

@roman-yermilov-gl
Contributor Author

@roman-yermilov-gl could you indicate why you're making this change, include a snippet of how this changes the input YAML, and add tests that prove the new functionality works? It's hard to give useful feedback on a PR without that information :)

Thanks for the comments. I updated the description and fixed the tests.

Contributor

@girarda girarda left a comment

🚢

# InterpolatedString page_size may contain one of int/str types,
# so we need to ensure that its `.string` attribute is of *string* type
self.page_size.string = str(self.page_size.string)
self.page_size = self.page_size.eval(self.config)
Contributor

When possible, we try to perform the interpolation evaluation (eval()) at runtime instead of at parse time in post_init. So in this case, self.page_size would remain the interpolated string (as you mentioned above), but we should move its evaluation into get_page_size().

Now that this could be an interpolated string or just a normal string, the type hint on the getter method, get_page_size() -> Optional[int], is a bit misleading. I think we should also add some error handling in case the value does not evaluate to an integer, since it won't behave correctly when get_page_size() is invoked by parent classes.
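To illustrate the suggestion above, here is a minimal, self-contained sketch of deferring evaluation to the getter and validating the result there. It is not the CDK's code: `TemplateStub` is a hypothetical stand-in for InterpolatedString that resolves only the `{{ options['key'] }}` form, and `LazyPageSize` and its error message are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Union

# Matches only the single template shape used in this PR: {{ options['key'] }}
_OPTION_TEMPLATE = re.compile(r"^\{\{\s*options\['(\w+)'\]\s*\}\}$")


@dataclass
class TemplateStub:
    """Hypothetical stand-in for the CDK's InterpolatedString."""

    string: str
    options: Mapping[str, Any]

    def eval(self, config: Mapping[str, Any]) -> Any:
        # `config` only mirrors the eval(config) call shape; this stub ignores it.
        match = _OPTION_TEMPLATE.match(self.string.strip())
        return self.options[match.group(1)] if match else self.string


@dataclass
class LazyPageSize:
    """Hypothetical holder showing parse-time wrapping and runtime evaluation."""

    page_size: Optional[Union[int, str]]
    config: Mapping[str, Any]
    options: Mapping[str, Any]

    def __post_init__(self) -> None:
        # Parse time: wrap the raw value as a string template, but do not evaluate it.
        self._template = (
            None if self.page_size is None
            else TemplateStub(str(self.page_size), self.options)
        )

    def get_page_size(self) -> Optional[int]:
        # Runtime: evaluate on each call and fail loudly on non-integer results.
        if self._template is None:
            return None
        value = self._template.eval(self.config)
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise ValueError(f"page_size must evaluate to an integer, got {value!r}")
        try:
            return int(value)
        except ValueError as exc:
            raise ValueError(f"page_size must evaluate to an integer, got {value!r}") from exc
```

With this shape, a literal such as `50` and a template such as `"{{ options['items_per_page'] }}"` both go through the same getter, and a value like `"ten"` raises a `ValueError` instead of silently breaking callers that expect an integer.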

Contributor Author

Fixed

Contributor

@brianjlai brianjlai left a comment

looks good! just one comment about reusing some code


 def next_page_token(self, response: requests.Response, last_records: List[Mapping[str, Any]]) -> Optional[Any]:
-    if len(last_records) < self.page_size:
+    if len(last_records) < self.page_size.eval(self.config):
Contributor

Instead of calling self.page_size.eval(self.config) here, we can call self.get_page_size(), which reuses the method and avoids duplicating the type check.
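A small sketch of that refactor, under stated assumptions: `OffsetPagerSketch` is a hypothetical, simplified class (not the CDK's offset strategy), and `resolve_page_size` stands in for the `self.page_size.eval(self.config)` call. The point is only that `next_page_token` delegates to the getter, so the type check lives in one place.

```python
from typing import Any, Callable, List, Mapping, Optional


class OffsetPagerSketch:
    """Hypothetical, simplified offset strategy used only for illustration."""

    def __init__(self, resolve_page_size: Callable[[], Any]) -> None:
        # resolve_page_size stands in for `self.page_size.eval(self.config)`.
        self._resolve_page_size = resolve_page_size
        self._offset = 0

    def get_page_size(self) -> int:
        # The only place where the evaluated value is type-checked.
        page_size = self._resolve_page_size()
        if isinstance(page_size, bool) or not isinstance(page_size, int):
            raise ValueError(f"page_size must be an integer, got {page_size!r}")
        return page_size

    def next_page_token(self, last_records: List[Mapping[str, Any]]) -> Optional[int]:
        # Reuse the getter rather than evaluating (and re-validating) here.
        if len(last_records) < self.get_page_size():
            return None  # a short page means there is nothing left to fetch
        self._offset += len(last_records)
        return self._offset
```

If the validation rule changes later (for example, to reject non-positive sizes), only `get_page_size` needs to be touched.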

@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/airbyte-cdk-offset-pagination branch from 155eaef to 45adef5 Compare December 6, 2022 13:37
@roman-yermilov-gl
Contributor Author

roman-yermilov-gl commented Dec 6, 2022

/publish-cdk dry-run=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/3630043147

@roman-yermilov-gl
Contributor Author

roman-yermilov-gl commented Dec 6, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3630121054

@roman-yermilov-gl
Contributor Author

roman-yermilov-gl commented Dec 6, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3630341175

@roman-yermilov-gl
Contributor Author

roman-yermilov-gl commented Dec 6, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3630469690

@roman-yermilov-gl roman-yermilov-gl merged commit bedc3b9 into master Dec 6, 2022
@roman-yermilov-gl roman-yermilov-gl deleted the ryermilov/airbyte-cdk-offset-pagination branch December 6, 2022 14:48
@girarda
Contributor

girarda commented Dec 6, 2022

gentle reminder to run SUB_BUILD=CONNECTORS_BASE ./gradlew format to make sure CI passes before merging https://github.com/airbytehq/airbyte/actions/runs/3630457570/jobs/6123859471.
