Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉 Source Github: bugfix schemas for streams deployments, workflow_runs, teams #15049

Merged
merged 9 commits into from
Jul 29, 2022
Original file line number Diff line number Diff line change
Expand Up @@ -303,7 +303,7 @@
- name: GitHub
sourceDefinitionId: ef69ef6e-aa7f-4af1-a01d-ef775033524e
dockerRepository: airbyte/source-github
dockerImageTag: 0.2.42
dockerImageTag: 0.2.43
documentationUrl: https://docs.airbyte.io/integrations/sources/github
icon: github.svg
sourceType: api
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2595,7 +2595,7 @@
supportsNormalization: false
supportsDBT: false
supported_destination_sync_modes: []
- dockerImage: "airbyte/source-github:0.2.42"
- dockerImage: "airbyte/source-github:0.2.43"
spec:
documentationUrl: "https://docs.airbyte.com/integrations/sources/github"
connectionSpecification:
Expand Down
2 changes: 1 addition & 1 deletion airbyte-integrations/connectors/source-github/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -12,5 +12,5 @@ RUN pip install .
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

LABEL io.airbyte.version=0.2.42
LABEL io.airbyte.version=0.2.43
LABEL io.airbyte.name=airbyte/source-github
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,19 @@
"type": ["null", "string"]
},
"payload": {
"type": ["null", "string", "object"]
"oneOf": [
{
"type": "object",
"properties": {},
"additionalProperties": true
},
{
"type": "string"
},
{
"type": "null"
}
]
Comment on lines +106 to +118
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://github.com/airbytehq/airbyte/issues/15021, oneOf, anyOf, and not should not be allowed in SAT

@sherifnada what's the actual effect of doing this?

This should probably just be a string type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I suspected this question
The problem is if remove this to multi-type, normalization will start to fail (at least destination-snowflake)

I need your approval to remove this multi-type !!!

The second point why I have re-implemented using oneOf instead of just simple ["string", "object"] because it's better handled by python Transformer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other words you can see here just refactoring of previous schema
but making it more compatible with python Transformer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grubberr whats the failure in normalization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like this
https://github.com/airbytehq/airbyte/issues/15021#issuecomment-1196320638

In short, all multi-type(s) combinations usually stored as pure-JSON on destination
on some engine it's VARCHAR but it's not always

for example:

on snowflake - VARIANT type
on redshift - SUPER type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Got it - sending a string would actually cause an error since they're currently already being handled as objects. Then in this case to avoid the breaking schema change, since payload is an arbitrary object I think it's ok to make our own object with the string value.

If we confirm that normalization/destinations are just handling this as an object, then we could update the schema here to be ["null", "object"] (remove string). @grubberr you mentioned earlier that for validation this wouldn't be ok - what issues do we have with validation if we do this?

In practice this should only be:

object -> object
"string" -> {"value": "string"}
null -> null

(if github docs are true to their word)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedroslopez if we do such transformation
"string" -> {"value": "string"}
no issues with validation

Copy link
Contributor

@pedroslopez pedroslopez Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked earlier whether we were seeing an issue with this currently since, while we should fix these multi-type schemas, it seems like a separate issue from what's being addressed in this PR for the on call issue. Since it's on call, I think we should move forward with the current changes since it's behavior is equivaltent and address the schema issue / coercing strings into objects separately

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argh github keeps hiding previous comments before I post 😛
@grubberr if you have those changes made then feel free to include here or separately, up to you

Copy link
Contributor Author

@grubberr grubberr Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedroslopez no I did not do such changes
"string" -> {"value": "string"}
it's was only my discussion about possible solutions how to move forward to be compliant with this PR https://github.com/airbytehq/airbyte/issues/15021

},
"transient_environment": {
"type": ["null", "boolean"]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,9 @@
"type": ["null", "string"]
},
"parent": {
"type": ["null", "object"]
"type": ["null", "object"],
"properties": {},
"additionalProperties": true
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,11 @@
},
"pull_requests": {
"type": "array",
"items": {}
"items": {
"type": ["null", "object"],
"properties": {},
"additionalProperties": true
}
},
"created_at": {
"type": ["null", "string"]
Expand Down
1 change: 1 addition & 0 deletions docs/integrations/sources/github.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ The GitHub connector should not run into GitHub API limitations under normal usa

| Version | Date | Pull Request | Subject |
|:--------|:-----------| :--- |:-------------------------------------------------------------------------------------------------------------|
| 0.2.43 | 2022-07-26 | [15049](https://github.com/airbytehq/airbyte/pull/15049) | Bugfix schemas for streams `deployments`, `workflow_runs`, `teams` |
| 0.2.42 | 2022-07-12 | [14613](https://github.com/airbytehq/airbyte/pull/14613) | Improve schema for stream `pull_request_commits` added "null" |
| 0.2.41 | 2022-07-03 | [14376](https://github.com/airbytehq/airbyte/pull/14376) | Add Retry for GraphQL API Resource limitations |
| 0.2.40 | 2022-07-01 | [14338](https://github.com/airbytehq/airbyte/pull/14338) | Revert: "Rename field `mergeable` to `is_mergeable`" |
Expand Down