-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add additional sync timing information #17643
Conversation
airbyte-workers/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
long replicationEndTime = -1; | ||
long sourceReadStartTime = -1; | ||
long destinationWriteStartTime = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that these longs needed to be initialized or else they couldn't be used below in withSourceReadStartTime()
- I'll be there's an Optional<>
Java thing I don't know about that's better than all the ternaries...
airbyte-config/config-models/src/main/resources/types/ReplicationAttemptSummary.yaml
Outdated
Show resolved
Hide resolved
airbyte-config/config-models/src/main/resources/types/StandardSyncSummary.yaml
Outdated
Show resolved
Hide resolved
airbyte-workers/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
airbyte-workers/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
Tests are finally green! Now ready for review! |
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Show resolved
Hide resolved
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Show resolved
Hide resolved
airbyte-config/config-models/src/main/resources/types/SyncStats.yaml
Outdated
Show resolved
Hide resolved
airbyte-workers/src/main/java/io/airbyte/workers/temporal/sync/ReplicationActivityImpl.java
Outdated
Show resolved
Hide resolved
airbyte-config/config-models/src/main/resources/types/SyncStats.yaml
Outdated
Show resolved
Hide resolved
LGTM, not too familiar with the whole java threaded synchronization stuff though |
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
...tence/job-persistence/src/main/java/io/airbyte/persistence/job/tracker/TrackingMetadata.java
Show resolved
Hide resolved
...tence/job-persistence/src/main/java/io/airbyte/persistence/job/tracker/TrackingMetadata.java
Show resolved
Hide resolved
airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
Most of my comments are suggestions/nits for improving readability.
Approving since I don't think these are blocking comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names look good to me and make sense. Thanks @evantahler !
…vation * master: (98 commits) 🐛 Source Bing Ads - Fix Campaigns stream misses Audience and Shopping (#17873) Source S3 - fix schema inference (#17991) 🎉 JDBC sources: store cursor record count in db state (#15535) Introduce webhook configs into workspace api and persistence (#17950) ci: upload test results to github for analysis (#17953) Trigger the connectors build if there are worker changes. (#17976) Add additional sync timing information (#17643) Use page_token_option instead of page_token (#17892) capture metrics around json messages size (#17973) 🐛 Correct kube annotations variable as per the docs. (#17972) 🪟 🎉 Add /connector-builder page with embedded YAML editor (#17482) fix `est_num_metrics_emitted_by_reporter` not being emitted (#17929) Update schema dumps (#17960) Remove the bump in the value.yml (#17959) Ensure database initialization in test container (#17697) Remove typo line from incremental reads docs (#17920) DocS: Update authentication.md (#17931) Use MessageMigration for Source Connection Check. (#17656) fixed links (#17949) remove usages of YamlSeedConfigPersistence (#17895) ...
* WIP - Add additional sync timing information * Fixup tests * fix PMD problem * send data to segment * Test JobTracker * respond to PR suggestions * fixup test * formatting * fix initializer for stats * Make thread-safe with synchronized * Don't clobber syncStats on init * add comments and fix init * Do what Pedro says * Extract timeTracker pojo
Closes #17407
whimsical link
This PR adds a plethora of additional time information for each part of the replication workflow so we can better analyze performance.
The times that are added are different from the existing startTime and endTime in subtle but interesting ways:
While it might appear that
startTime
,replication_start_time
, andsource_read_start_time
should all be the same, there is often a lag, which is due to things like downloading the docker image, starting the K8s pod, etc.Confirming that the data appears in BiqQuery:
Produces