Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source gitlab: enable httpavailabilitystrategy for supported streams #22238

Closed
erohmensing opened this issue Feb 1, 2023 · 1 comment
Closed
Assignees
Labels

Comments

@erohmensing
Copy link
Contributor

erohmensing commented Feb 1, 2023

Follow-up issue to #21888

Tell us about the problem you're trying to solve

Right now, the default availability strategy is turned off for source gitlab. This is because it will skip syncing the entire stream if it hits a 403 error on the first record, indicating that the user doesn't have permissions to access the stream. Gitlab has some streams that could run into 403s on a per-record/per-slice basis, rather than a per stream basis. You can see the logic of this being noticed and turning off the default happening the first time here.

Describe the solution you’d like

As a first step, identify any streams for which you can't end up getting 403s on a per-record/per-slice bases (e.g. the repositories stream itself I don't think follows the logic of that comment). Enable the default for these streams. Then we can look into a more customized availabilitystrategy for streams for which it is the case that you can get different permissions for different slices/records in the stream.

@davydov-d
Copy link
Collaborator

The Gitlab source connector includes streams that may raise 401/403/404 errors on per slice (project/group) basis. Any of these does not mean the stream is unavailable. In my humble opinion, implementing availability strategy for the Gitlab Source connector has no sense as it's quite a unique behavior and we can not make generic handling of these errors to apply it to other connectors. Besides, the logic describing what the connector should do when faces one of this errors lives inside the connector code. Also, it to would require making changes to the CDK code in the place where the strategy is called from. I can't recall any other connector that would need to have similar behavior of per slice availability. This means it's likely not worth implementing a custom availability strategy as it would be not scalable and bring in new, not obvious and rarely used changes to the CDK (or connector base class). Therefore I'm closing this issue, if anyone finds a strong reason or a better way to satisfy requirements of this task - please reopen and let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants