return better structured error logs to connector builder #46963
Conversation
```python
try:
    return response.content.decode("utf-8")
except Exception:
    return None
```
This change allows this method to also properly handle response bodies which are just string values, since that currently causes a JSONDecodeError
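For illustration, a minimal sketch of that fallback behavior (the function name here is illustrative, not the actual CDK method):

```python
from typing import Optional

def decode_response_body(content: bytes) -> Optional[str]:
    # Return the raw body as text; undecodable payloads yield None
    # instead of raising, so a plain-string response body no longer
    # blows up with a JSONDecodeError downstream.
    try:
        return content.decode("utf-8")
    except Exception:
        return None

print(decode_response_body(b"rate limit exceeded"))  # plain-string body survives
print(decode_response_body(b"\xff\xfe"))             # undecodable bytes -> None
```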
The code change itself seems perfectly fine. It does raise a longer-term compatibility concern (the log error message model should map to the protocol better), but that can be solved later. Left a comment on that.
```
@@ -39,6 +39,8 @@ class StreamReadSlices:
class LogMessage:
    message: str
    level: str
    internal_message: Optional[str] = None
```
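For context, a minimal sketch of the dataclass after this change; because the new field defaults to `None`, existing call sites keep working unchanged:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogMessage:
    message: str
    level: str
    internal_message: Optional[str] = None

# Existing two-argument construction still works:
log = LogMessage(message="request failed", level="ERROR")
print(log.internal_message)  # defaults to None
```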
The trade off here is that since this does not map to the protocol error message model, when we move to using source-declarative-manifest container as manifest runner for Builder, we'd need to make sure that the protocol has these fields, otherwise error reporting will break.
Should we do the work and align the models in this file with the protocol-level models now, or ship the small improvement and record a TODO for later?
/cc @bnchrch
That's a good callout, Natik!
Looking into it, this does already align with the protocol model `AirbyteTraceMessage`.
❓ @lmossman can you confirm I'm reading this right, and this is already a protocol-blessed type? If so, should `LogMessage` become a union type of `AirbyteTraceMessage` and `AirbyteLogMessage`?
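A rough sketch of that suggestion, using simplified stand-ins for the protocol models (the real `AirbyteLogMessage` and `AirbyteTraceMessage` have more fields):

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class AirbyteLogMessage:  # simplified stand-in for the protocol model
    level: str
    message: str

@dataclass
class AirbyteTraceMessage:  # simplified stand-in for the protocol model
    type: str
    emitted_at: float

# The union-type idea: the builder's LogMessage would just be
# either protocol message type, rather than a bespoke class.
LogMessage = Union[AirbyteLogMessage, AirbyteTraceMessage]

msg: LogMessage = AirbyteLogMessage(level="ERROR", message="boom")
```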
> The trade off here is that since this does not map to the protocol error message model, when we move to using source-declarative-manifest container as manifest runner for Builder, we'd need to make sure that the protocol has these fields, otherwise error reporting will break.
This `LogMessage` class is only used in the `message_grouper` logic, which wraps a call to the `read` protocol method.
Even if we swap out the runtime to use source-declarative-manifest to execute the read, we will still need all of this extra `message_grouper` logic around it to create the `StreamRead` output object that the connector builder server expects, and it shouldn't need to be changed, because the `get_message_groups` function already operates on the airbyte protocol and just normalizes those protocol messages into a form that is more easily understood by the connector builder server.
If we don't want to directly execute any python at all anymore, and therefore delete this `message_grouper` logic, then the connector builder server will require a bunch of other changes anyway to replicate what that python code is doing now.
So I think I'd prefer to keep this PR simple right now, especially because I don't think we can easily reference the protocol AirbyteLogMessage and AirbyteTraceMessage in our connector-builder-server and airbyte-server openapi specs, and I'd rather not repeat those entire schemas, as they are more complicated than the LogMessage schema I've defined here.
But let me know if anything I said above sounds off @natikgadzhi @bnchrch !
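A rough sketch of the normalization being described (function and field names are illustrative, not the actual `message_grouper` code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogMessage:
    message: str
    level: str
    internal_message: Optional[str] = None

def normalize_trace_message(trace: dict) -> LogMessage:
    # Flatten a protocol-style trace message into the simpler shape
    # the connector builder server consumes.
    error = trace.get("error") or {}
    return LogMessage(
        message=error.get("message", ""),
        level="ERROR",
        internal_message=error.get("internal_message"),
    )

msg = normalize_trace_message(
    {"type": "TRACE", "error": {"message": "boom", "internal_message": "Traceback ..."}}
)
```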
Two questions! Should pass once those are addressed.
```python
# is that this message will be shown in the Builder.
if (
    traced_exception.message is not None
    and "During the sync, the following streams did not sync successfully" in traced_exception.message
```
❓ Can we modify `AbstractSource` to indicate it's a "Final Exception"?
@bnchrch Do you have a specific approach in mind? We could, for example, set `internal_message` to something to indicate this, but that approach would still rely on comparing strings.
I'm wary of raising a different exception in abstract_source here, because I'm not sure what relies on this being an AirbyteTracedException, and I don't feel familiar enough with that fairly central CDK code to know what is safe to change.
I would also add that any raised AirbyteTracedException is sort of already implied to be the final exception. Within `abstract_source.py` ( https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py#L171-L177 ), during a read of a stream we always try to catch any form of Exception or AirbyteTracedException thrown by a stream and wrap it into an emitted trace message. We only formally raise the "final" AirbyteTracedException because we need to end the process with a non-zero exit code.
Granted, the small gap is if we get an exception outside of that try/except block. But from my perspective, trying to adjust the protocol or create new abstractions to capture what is effectively a work-around feels like over-engineering. The ideal design is that the platform can identify that merely receiving any AirbyteTracedException is enough to know the sync was unsuccessful; then we could get rid of this work-around.
So TL;DR: I think how we do it here is fine.
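The work-around under discussion boils down to a string check; a minimal sketch (the constant and function names are illustrative):

```python
from typing import Optional

FINAL_SYNC_ERROR_SNIPPET = "During the sync, the following streams did not sync successfully"

def is_final_traced_exception(message: Optional[str]) -> bool:
    # Detect the wrap-up exception by string matching, since the protocol
    # has no explicit "this is the final exception" flag.
    return message is not None and FINAL_SYNC_ERROR_SNIPPET in message
```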
Force-pushed from a93d329 to 0200577.
I know there are other questions on the marketplace side, but no concerns from me.
/approve-regression-tests This only changes the connector_builder wrapper around airbyte-cdk, so it shouldn't affect any actual connectors. I have tested this locally against the connector builder (along with the platform changes linked in the PRs in the description).
What
Partially resolves https://github.com/airbytehq/airbyte-internal-issues/issues/6977
This is the first of 3 PRs to have better structured error messages in the Connector Builder.
See the other 2 PRs that build on top of this one:
How
This PR changes how the connector_builder module of the CDK returns trace messages back to the connector builder server in a few key ways:
- Splits `message` and `stacktrace` into separate fields
- Moves the `internal_message` of the trace message to another field in the returned Log
- Filters out the final `AirbyteTracedException`, which is only meant to cause a non-zero return value to result in a sync failure, but does not provide any useful information beyond the other traced exceptions that are already yielded
Testing
Follow the Testing instructions in the description of this PR to test this end-to-end:
Can this PR be safely reverted and rolled back?