Pass allowedHosts to container runners #21676

Merged 9 commits on Jan 25, 2023

Contributor

@evantahler evantahler commented Jan 20, 2023

This PR follows #21363 and closes #21797, passing the AllowedHosts to the docker/K8s container executors. This PR stops at logging the AllowedHosts information as the container is started, showing that the information is being passed through the Temporal jobs properly. Usage of this information in the K8s launcher will be done in a later story by @git-phu.

As part of this work, we needed to resolve dynamic allowed hosts from the connector's configuration. I went with StringSubstitutor for this purpose. As an example, you can see how source-postgres now has an allowed host limitation of "${host}", which is the value of "host" from the configuration of that instance of the connector. This can be verified via the logs, as "host.docker.internal" is my configuration:

Screenshot 2023-01-20 at 3 42 29 PM

airbyte-worker | 2023-01-20 23:35:35 INFO i.a.w.p.DockerProcessFactory(create):122 - Creating docker container = source-postgres-read-8-0-evhis with resources io.airbyte.config.ResourceRequirements@5f1218b0[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=] and allowed hosts io.airbyte.config.AllowedHosts@4d32a0ff[hosts=[host.docker.internal],additionalProperties={}]
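
For readers following along, here is a minimal sketch of the `${host}` interpolation described above. It uses a small regex-based stand-in instead of StringSubstitutor so that it runs on the JDK alone. The class and method names (`AllowedHostsSketch`, `resolve`, `resolveAll`) are hypothetical and are not the code in this PR, and leaving unknown placeholders untouched is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the implementation from this PR.
final class AllowedHostsSketch {
  // Matches placeholders of the form ${key}.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

  /** Replaces each ${key} in the template with config.get(key), if present. */
  static String resolve(String template, Map<String, String> config) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuilder out = new StringBuilder();
    while (matcher.find()) {
      String value = config.get(matcher.group(1));
      // Assumption: a placeholder with no matching key is left as-is.
      String replacement = value != null ? value : matcher.group(0);
      matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(out);
    return out.toString();
  }

  /** Resolves every entry of an allowed-hosts list against one connector config. */
  static List<String> resolveAll(List<String> hosts, Map<String, String> config) {
    List<String> resolved = new ArrayList<>();
    for (String host : hosts) {
      resolved.add(resolve(host, config));
    }
    return resolved;
  }

  public static void main(String[] args) {
    Map<String, String> config = Map.of("host", "host.docker.internal", "port", "5432");
    // Prints [host.docker.internal]
    System.out.println(resolveAll(List.of("${host}"), config));
  }
}
```

The resolved list is what the log line above shows under `hosts=[...]`: the template comes from the connector definition and the value comes from that connector instance's configuration.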

@octavia-squidington-iv octavia-squidington-iv added area/platform issues related to the platform area/worker Related to worker labels Jan 20, 2023
@evantahler evantahler temporarily deployed to more-secrets January 20, 2023 20:14 — with GitHub Actions Inactive
@github-actions
Contributor

github-actions bot commented Jan 20, 2023

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector or module version
  • Add changelog
  • Publish the new version

✅ Sources (30)

Connector Version Changelog Publish
source-alloydb 1.0.35
source-alloydb-strict-encrypt 1.0.35 🔵 (ignored) 🔵 (ignored)
source-bigquery 0.2.3
source-clickhouse 0.1.15
source-clickhouse-strict-encrypt 0.1.15 🔵 (ignored) 🔵 (ignored)
source-cockroachdb 0.1.19
source-cockroachdb-strict-encrypt 0.1.19 🔵 (ignored) 🔵 (ignored)
source-db2 0.1.17
source-db2-strict-encrypt 0.1.17 🔵 (ignored) 🔵 (ignored)
source-dynamodb 0.1.0
source-e2e-test 2.1.3
source-e2e-test-cloud 2.1.1 🔵 (ignored) 🔵 (ignored)
source-elasticsearch 0.1.1
source-jdbc 0.3.5 🔵 (ignored) 🔵 (ignored)
source-kafka 0.2.3
source-mongodb-strict-encrypt 0.1.19 🔵 (ignored) 🔵 (ignored)
source-mongodb-v2 0.1.19
source-mssql 0.4.28
source-mssql-strict-encrypt 0.4.28 🔵 (ignored) 🔵 (ignored)
source-mysql 1.0.19
source-mysql-strict-encrypt 1.0.19 🔵 (ignored) 🔵 (ignored)
source-oracle 0.3.22
source-oracle-strict-encrypt 0.3.22 🔵 (ignored) 🔵 (ignored)
source-postgres 1.0.39
source-postgres-strict-encrypt 1.0.39 🔵 (ignored) 🔵 (ignored)
source-redshift 0.3.16
source-scaffold-java-jdbc 0.1.0 🔵 (ignored) 🔵 (ignored)
source-sftp 0.1.2
source-snowflake 0.1.29
source-tidb 0.2.2
  • See "Actionable Items" below for how to resolve warnings and errors.

❌ Destinations (48)

Connector Version Changelog Publish
destination-aws-datalake 0.1.1
destination-azure-blob-storage 0.1.6
destination-bigquery 1.2.12
destination-bigquery-denormalized 1.2.12 (diff seed version)
destination-cassandra 0.1.4
destination-clickhouse 0.2.2 (changelog missing)
destination-clickhouse-strict-encrypt 0.2.2 🔵 (ignored) 🔵 (ignored)
destination-csv 1.0.0 (changelog missing)
destination-databricks 0.3.1
destination-dev-null 0.2.7 🔵 (ignored) 🔵 (ignored)
destination-doris 0.1.0
destination-dynamodb 0.1.7
destination-e2e-test 0.2.4
destination-elasticsearch 0.1.6
destination-elasticsearch-strict-encrypt 0.1.6 🔵 (ignored) 🔵 (ignored)
destination-gcs 0.2.13
destination-iceberg 0.1.0
destination-jdbc 0.3.14 🔵 (ignored) 🔵 (ignored)
destination-kafka 0.1.10
destination-keen 0.2.4
destination-kinesis 0.1.5
destination-local-json 0.2.11
destination-mariadb-columnstore 0.1.7
destination-mongodb 0.1.9
destination-mongodb-strict-encrypt 0.1.9 🔵 (ignored) 🔵 (ignored)
destination-mqtt 0.1.3
destination-mssql 0.1.22
destination-mssql-strict-encrypt 0.1.22 🔵 (ignored) 🔵 (ignored)
destination-mysql 0.1.20
destination-mysql-strict-encrypt 0.1.21 (mismatch: 0.1.20) 🔵 (ignored) 🔵 (ignored)
destination-oracle 0.1.19
destination-oracle-strict-encrypt 0.1.19 🔵 (ignored) 🔵 (ignored)
destination-postgres 0.3.26
destination-postgres-strict-encrypt 0.3.26 🔵 (ignored) 🔵 (ignored)
destination-pubsub 0.2.0
destination-pulsar 0.1.3
destination-r2 0.1.0
destination-redis 0.1.4
destination-redpanda 0.1.0
destination-redshift 0.3.54
destination-rockset 0.1.4
destination-s3 0.3.19
destination-s3-glue 0.1.1
destination-scylla 0.1.3
destination-snowflake 0.4.44
destination-teradata 0.1.0
destination-tidb 0.1.0
destination-yugabytedb 0.1.0
  • See "Actionable Items" below for how to resolve warnings and errors.

✅ Other Modules (0)

Actionable Items

Category | Status | Actionable Item
Version | mismatch | The version of the connector is different from its normal variant. Please bump the version of the connector.
Version | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog | changelog missing | There is no changelog for the current version of the connector. If you are the author of the current version, please add a changelog.
Publish | not in seed | The connector is not in the seed file (e.g. source_definitions.yaml), so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that it is not a bug.
Publish | diff seed version | The connector exists in the seed file, but the latest version is not listed there. This usually means that the latest version is not published. Please use the /publish command to publish the latest version.

@evantahler evantahler temporarily deployed to more-secrets January 20, 2023 23:37 — with GitHub Actions Inactive
@evantahler evantahler temporarily deployed to more-secrets January 20, 2023 23:45 — with GitHub Actions Inactive
@evantahler
Contributor Author

evantahler commented Jan 20, 2023

This work seems to have introduced a bug. @benmoriceau or @jdpgrailsdev do you happen to see what I might have done wrong? Any sync I run on this branch produces an error like this:

2023-01-20 23:52:39 ERROR i.a.s.h.JobHistoryHandler(listJobsFor):138 - Missing stats for job 1 attempt 0
airbyte-worker                    | java.lang.NullPointerException: Cannot invoke "Object.hashCode()" because "pk" is null
airbyte-worker                    | 	at java.util.ImmutableCollections$MapN.probe(ImmutableCollections.java:1321) ~[?:?]
airbyte-worker                    | 	at java.util.ImmutableCollections$MapN.get(ImmutableCollections.java:1235) ~[?:?]
airbyte-worker                    | 	at java.util.Comparator.lambda$comparingInt$7b0bb60$1(Comparator.java:494) ~[?:?]
airbyte-worker                    | 	at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) ~[?:?]
airbyte-worker                    | 	at java.util.TimSort.sort(TimSort.java:220) ~[?:?]
airbyte-worker                    | 	at java.util.Arrays.sort(Arrays.java:1307) ~[?:?]
airbyte-worker                    | 	at java.util.ArrayList.sort(ArrayList.java:1721) ~[?:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.orderByReleaseStageAsc(JobCreationAndStatusUpdateActivityImpl.java:488) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.emitAttemptEvent(JobCreationAndStatusUpdateActivityImpl.java:517) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.emitAttemptCompletedEvent(JobCreationAndStatusUpdateActivityImpl.java:558) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.emitAttemptCompletedEventIfAttemptPresent(JobCreationAndStatusUpdateActivityImpl.java:587) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.trackCompletion(JobCreationAndStatusUpdateActivityImpl.java:576) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.jobSuccess(JobCreationAndStatusUpdateActivityImpl.java:260) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at io.airbyte.workers.temporal.scheduling.activities.JobCreationAndStatusUpdateActivityImpl.jobSuccessWithAttemptNumber(JobCreationAndStatusUpdateActivityImpl.java:272) ~[io.airbyte-airbyte-workers-0.40.28.jar:?]
airbyte-worker                    | 	at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) ~[?:?]
airbyte-worker                    | 	at java.lang.reflect.Method.invoke(Method.java:578) ~[?:?]
airbyte-worker                    | 	at io.temporal.internal.activity.RootActivityInboundCallsInterceptor$POJOActivityInboundCallsInterceptor.executeActivity(RootActivityInboundCallsInterceptor.java:64) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.activity.RootActivityInboundCallsInterceptor.execute(RootActivityInboundCallsInterceptor.java:43) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.activity.ActivityTaskExecutors$BaseActivityTaskExecutor.execute(ActivityTaskExecutors.java:95) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.activity.ActivityTaskHandlerImpl.handle(ActivityTaskHandlerImpl.java:92) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handleActivity(ActivityWorker.java:241) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:206) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:179) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:93) ~[temporal-sdk-1.17.0.jar:?]
airbyte-worker                    | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
airbyte-worker                    | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
airbyte-worker                    | 	at java.lang.Thread.run(Thread.java:1589) ~[?:?]
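
The trace points at `Map.get` being called with a null key from inside a `Comparator.comparingInt` during a sort. Below is a minimal sketch of that failure mode, along with one possible null-safe comparator. The map contents, class name, and method names are hypothetical, and the null-safe variant is only an illustration, not necessarily what the eventual fix does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the code from JobCreationAndStatusUpdateActivityImpl.
final class NullKeySortSketch {
  // Immutable maps created by Map.of reject null keys, including on lookup.
  static final Map<String, Integer> ORDER =
      Map.of("alpha", 1, "beta", 2, "generally_available", 3, "custom", 4);

  /** Throws NullPointerException as soon as the comparator looks up a null element. */
  static List<String> sortUnsafe(List<String> stages) {
    List<String> copy = new ArrayList<>(stages);
    copy.sort(Comparator.comparingInt(ORDER::get));
    return copy;
  }

  /** Assumption for this sketch: null or unknown stages sort last. */
  static List<String> sortNullSafe(List<String> stages) {
    List<String> copy = new ArrayList<>(stages);
    copy.sort(Comparator.comparingInt(
        (String s) -> s == null ? Integer.MAX_VALUE : ORDER.getOrDefault(s, Integer.MAX_VALUE)));
    return copy;
  }

  public static void main(String[] args) {
    List<String> stages = Arrays.asList("beta", null, "alpha");
    // Prints [alpha, beta, null]
    System.out.println(sortNullSafe(stages));
    try {
      sortUnsafe(stages);
    } catch (NullPointerException e) {
      System.out.println("sortUnsafe failed with NullPointerException");
    }
  }
}
```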

@benmoriceau
Contributor

benmoriceau commented Jan 21, 2023

@evantahler I don't think it comes from your change; the place where the NPE happens was added in #21286. I will try to come up with a fix.

@benmoriceau
Contributor

benmoriceau commented Jan 21, 2023

#21685 should fix the sort NPE issue. It did on my local environment.

@benmoriceau
Contributor

@evantahler Just merged the fix, merging master should fix the issue.

@evantahler evantahler temporarily deployed to more-secrets January 23, 2023 18:30 — with GitHub Actions Inactive
@evantahler evantahler temporarily deployed to more-secrets January 23, 2023 18:36 — with GitHub Actions Inactive
@evantahler evantahler temporarily deployed to more-secrets January 23, 2023 18:37 — with GitHub Actions Inactive
@evantahler evantahler marked this pull request as ready for review January 23, 2023 20:20
@evantahler evantahler requested a review from a team as a code owner January 23, 2023 20:20
@evantahler evantahler temporarily deployed to more-secrets January 23, 2023 20:22 — with GitHub Actions Inactive
@evantahler
Contributor Author

Ping @benmoriceau @pedroslopez @git-phu - I'd love a review please!

Contributor

@benmoriceau benmoriceau left a comment

LGTM

Comment on lines +1349 to +1351
allowedHosts:
  hosts:
    - "${host}"
Contributor Author

@git-phu I'd love your 👍 on how I ended up enabling interpolation. By the time we get to the KubeProcessFactory (or DockerProcessFactory), these strings are all resolved, but another pair of 👀 can't hurt.

Contributor

@pedroslopez pedroslopez left a comment

Looks good! Added a note on the new dependency of actor configuration for the IntegrationLauncherConfigs.

I'm assuming we decided we won't need any source other than the actor configuration for the variables, since the substitution isn't "namespaced" (e.g. just host rather than something like config.host).
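
To make the namespacing point concrete, here is a rough sketch of what namespaced variables could look like if a second variable source were ever needed. Everything here is hypothetical: the class name, the `addNamespace` helper, and the `workspace` source are invented for illustration and do not exist in this PR, which resolves plain keys such as `host` from the actor configuration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this PR does not namespace its variables.
final class NamespacedVariablesSketch {
  /** Copies every entry of values into target under the key "<namespace>.<key>". */
  static void addNamespace(Map<String, String> target, String namespace, Map<String, String> values) {
    for (Map.Entry<String, String> entry : values.entrySet()) {
      target.put(namespace + "." + entry.getKey(), entry.getValue());
    }
  }

  public static void main(String[] args) {
    Map<String, String> variables = new HashMap<>();
    // Actor configuration values would live under "config.".
    addNamespace(variables, "config", Map.of("host", "host.docker.internal"));
    // An invented second source, kept apart by its own prefix.
    addNamespace(variables, "workspace", Map.of("host", "proxy.example.com"));

    // Prints host.docker.internal
    System.out.println(variables.get("config.host"));
    // Prints proxy.example.com
    System.out.println(variables.get("workspace.host"));
  }
}
```

With prefixes like these, two sources can both define `host` without colliding, at the cost of longer placeholders such as `${config.host}` in the connector definition.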

@@ -109,7 +116,8 @@ public GeneratedJobInput getSyncWorkflowInput(final SyncInput input) {
         .withAttemptId((long) attempt)
         .withDockerImage(config.getSourceDockerImage())
         .withProtocolVersion(config.getSourceProtocolVersion())
-        .withIsCustomConnector(config.getIsSourceCustomConnector());
+        .withIsCustomConnector(config.getIsSourceCustomConnector())
+        .withAllowedHosts(configReplacer.getAllowedHosts(sourceDefinition.getAllowedHosts(), config.getSourceConfiguration()));
Contributor

Note to self: this affects the intended plan for part 2 of the configuration update project. (it was proposed to build IntegrationLauncherConfigs separately at the beginning to re-use for the check and sync, but because this now depends on the source/destination configuration we should keep building the IntegrationLauncherConfigs whenever we generate the check and sync inputs)

Labels
area/platform issues related to the platform area/worker Related to worker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pass interpolated allowedHosts to container runners and log
5 participants