destination-s3: add file transfer #46302

Open
wants to merge 1 commit into
base: master

Conversation

stephane-airbyte
Contributor

@stephane-airbyte stephane-airbyte commented Oct 1, 2024

Adds file transfer to destination-s3.

File transfer and record-based sync are mutually exclusive. When the destination supports file transfer and the source has enabled it in its config, the platform sets the environment variable USE_FILE_TRANSFER to true and AIRBYTE_STAGING_DIRECTORY to the mount point of the staging directory.
destination-s3 checks USE_FILE_TRANSFER to decide whether to use file transfer or record-based sync.
The record-based integration tests all pass, and an extra test makes sure file-based transfer is disabled.
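
For readers skimming the description, the mode selection can be sketched roughly as follows. This is an illustration only, not the code in this PR: `SyncMode` and `resolveSyncMode` are hypothetical names, and failing when the staging directory is missing is an assumption. Only the two environment variable names come from the description above.

```kotlin
// Sketch only: SyncMode and resolveSyncMode are hypothetical names,
// not part of destination-s3 or the CDK.
sealed class SyncMode {
    data class FileTransfer(val stagingDirectory: String) : SyncMode()
    object RecordBased : SyncMode()
}

// Reads the two variables named in the PR description from the given environment.
// The env parameter defaults to the process environment and can be overridden in tests.
fun resolveSyncMode(env: Map<String, String> = System.getenv()): SyncMode {
    // Anything other than "true" (case-insensitive), including an unset variable,
    // is treated as record-based sync.
    val useFileTransfer = env["USE_FILE_TRANSFER"]?.trim().equals("true", ignoreCase = true)
    if (!useFileTransfer) return SyncMode.RecordBased

    // Assumption: a missing or blank staging directory is a hard error in file transfer mode.
    val stagingDirectory = env["AIRBYTE_STAGING_DIRECTORY"]?.takeIf { it.isNotBlank() }
        ?: error("USE_FILE_TRANSFER is true but AIRBYTE_STAGING_DIRECTORY is not set")
    return SyncMode.FileTransfer(stagingDirectory)
}
```

Passing the environment as a map keeps the decision testable without touching real process variables, which is roughly what a "file transfer is disabled" test would need.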

Contributor Author

stephane-airbyte commented Oct 1, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.


@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues CDK Connector Development Kit labels Oct 1, 2024
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 5649507 to 0a94310 Compare October 1, 2024 23:51
@@ -36,6 +36,11 @@ object UploadFormatConfigFactory {
FileUploadFormat.PARQUET -> {
UploadParquetFormatConfig(formatConfig)
}
FileUploadFormat.RAW_FILES ->
Contributor

Oh nice. This sidesteps any questions w/r/t conversion.

@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 0a94310 to 7baeb75 Compare October 3, 2024 00:31
@stephane-airbyte stephane-airbyte changed the base branch from stephane/09-30-cdk-java_add_file_transfer_mount_to_destinationacceptancetest to stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods October 3, 2024 00:31
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods branch from 6f87147 to 52c0fe1 Compare October 3, 2024 15:59
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 7baeb75 to 2dd1a87 Compare October 3, 2024 15:59
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods branch from 52c0fe1 to 95a7d03 Compare October 7, 2024 22:05
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 2dd1a87 to 75cc841 Compare October 7, 2024 22:05
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods branch from 95a7d03 to 721ddfa Compare October 8, 2024 18:27
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 75cc841 to f0f2536 Compare October 8, 2024 18:28
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch 8 times, most recently from e2bb0c0 to e1dd9ce Compare October 9, 2024 01:29
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods branch from 721ddfa to 8a78c22 Compare October 9, 2024 17:19
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch 2 times, most recently from a371928 to cd74813 Compare October 9, 2024 18:57
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-02-cdk-java_reorganize_the_destinationaccptancetest_to_split_out_the_actual_tests_from_all_the_util_methods branch from 8a78c22 to 48a9e9e Compare October 9, 2024 20:51
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 9456da1 to 76dcbe6 Compare October 23, 2024 20:55
Contributor Author

stephane-airbyte commented Oct 23, 2024

/publish-java-cdk --force=true

Error: Unexpected inputs provided: ["--force"]

@stephane-airbyte
Contributor Author

stephane-airbyte commented Oct 23, 2024

/publish-java-cdk force=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/11488030425
✅ Successfully published Java CDK version=0.48.0!

@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 76dcbe6 to f654a2b Compare October 23, 2024 21:56
@stephane-airbyte stephane-airbyte marked this pull request as ready for review October 23, 2024 21:58
@stephane-airbyte stephane-airbyte requested review from a team as code owners October 23, 2024 21:58
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-09-simple_split_of_destinationacceptancetest branch from eaee8a1 to 8fa5bff Compare October 23, 2024 21:59
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch 2 times, most recently from 010e67d to 25ad892 Compare October 23, 2024 22:03
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-09-simple_split_of_destinationacceptancetest branch from 8fa5bff to ba25b14 Compare October 23, 2024 22:26
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 25ad892 to 8921fd6 Compare October 23, 2024 22:26
@stephane-airbyte stephane-airbyte changed the base branch from stephane/10-09-simple_split_of_destinationacceptancetest to graphite-base/46302 October 23, 2024 23:06
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 8921fd6 to cc6fccc Compare October 23, 2024 23:07
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from cc6fccc to 5abe9d2 Compare October 23, 2024 23:18
@stephane-airbyte stephane-airbyte force-pushed the stephane/10-01-destination-s3_add_file_transfer branch from 5abe9d2 to c66040f Compare October 23, 2024 23:25
@stephane-airbyte stephane-airbyte changed the base branch from graphite-base/46302 to master October 23, 2024 23:26
}
val flushFunction =
if (featureFlags.useFileTransfer()) {
FileTransferDestinationFlushFunction(
Contributor

What determines whether the feature flag is set? The fact that the source is flagged as a file source? Explicit opt-in at the sync level?

Contributor Author

Fair question.

Basically, the source configuration needs to have a specific parameter enabled (I don't know the details of the parameter) AND the destination needs to have supportsFileTransfer set to true in its metadata.yaml. If both conditions hold, the 2 variables are set accordingly, a common volume is mounted on both containers, and all records are expected to be file-based instead of record-based.
If the source config has the parameter set to true and the destination doesn't support file transfer, the platform will throw an exception.
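
To make the rule above concrete, here is a rough sketch of the decision as I understand it. This is not the platform's code: `FileTransferPlan`, `planFileTransfer`, and their parameters are hypothetical names, and returning an empty environment when the source has not enabled file transfer is an assumption (the platform may set the variables differently in that case).

```kotlin
// Sketch only: FileTransferPlan and planFileTransfer are hypothetical names,
// not the platform's API.
data class FileTransferPlan(
    val destinationEnv: Map<String, String>, // variables to set on the destination container
    val mountSharedVolume: Boolean,          // whether a common volume is mounted on both containers
)

fun planFileTransfer(
    sourceFileTransferEnabled: Boolean,       // the source-config parameter mentioned above
    destinationSupportsFileTransfer: Boolean, // supportsFileTransfer in metadata.yaml
    stagingMountPoint: String,
): FileTransferPlan {
    // Assumption: with the source parameter off, nothing is set and the sync stays record-based.
    if (!sourceFileTransferEnabled) {
        return FileTransferPlan(emptyMap(), mountSharedVolume = false)
    }
    // Source asks for file transfer but the destination cannot do it: fail the sync.
    require(destinationSupportsFileTransfer) {
        "Source enabled file transfer but the destination does not support it"
    }
    return FileTransferPlan(
        mapOf(
            "USE_FILE_TRANSFER" to "true",
            "AIRBYTE_STAGING_DIRECTORY" to stagingMountPoint,
        ),
        mountSharedVolume = true,
    )
}
```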

Contributor

@johnny-schmidt johnny-schmidt left a comment

Looks good. The shim seems like it's in the best place, and the file flush function is straightforward. I didn't have enough time to go over the tests in detail, but at a high level, how we're adding the file option to the docker env is clear.

One question about the env variables just to help me plan for the new CDK, but that's my own curiosity.

Labels
area/connectors (Connector related issues)
area/documentation (Improvements or additions to documentation)
CDK (Connector Development Kit)
connectors/destination/s3-glue
connectors/destination/s3-v2
connectors/destination/s3

3 participants