fix!: concurrent worker agent processes #160
Merged: jusiskin merged 1 commit into aws-deadline:mainline from jusiskin:fix_concurrent_worker_agent_processes on Oct 30, 2024
jusiskin force-pushed the fix_concurrent_worker_agent_processes branch from 6a41adb to 44eceae, October 28, 2024 16:09
leongdl previously approved these changes, Oct 29, 2024
jusiskin force-pushed the fix_concurrent_worker_agent_processes branch from 44eceae to 55a5078, October 30, 2024 17:42
jusiskin changed the title from "fix: concurrent worker agent processes" to "fix!: concurrent worker agent processes", Oct 30, 2024
jusiskin commented, Oct 30, 2024
AWS-Samuel previously approved these changes, Oct 30, 2024
jusiskin force-pushed the fix_concurrent_worker_agent_processes branch from 55a5078 to 0e5fa2a, October 30, 2024 18:29
BREAKING CHANGE: the default worker config now starts the worker agent service Signed-off-by: Josh Usiskin <[email protected]>
jusiskin force-pushed the fix_concurrent_worker_agent_processes branch from 0e5fa2a to 72bf9c3, October 30, 2024 18:31
Quality Gate passed
AWS-Samuel approved these changes, Oct 30, 2024
YutongLi291 approved these changes, Oct 30, 2024
What was the problem/requirement? (What/Why)
The PosixInstanceWorkerBase class, which forms the base class of PosixInstanceBuildWorker, had a bug causing multiple worker agent processes to be running simultaneously.
There was code in one location that was starting the systemd unit for the worker agent:
deadline-cloud-test-fixtures/src/deadline_test_fixtures/deadline/worker.py
Lines 784 to 785 in 84592ad
and code in another place launching the worker agent in an SSM command:
deadline-cloud-test-fixtures/src/deadline_test_fixtures/deadline/worker.py
Line 644 in 84592ad
What was the solution? (How)
Removed the code that launches the worker agent through an SSM command, since the officially supported and recommended way to run the worker agent is using systemd. We want tests to reflect this setup.
The way configuration is applied to the worker agent had to be changed. We were previously setting environment variables in the worker agent user's bash startup script, but this script is not loaded for systemd services, since systemd does not start them through a shell. Instead, we use a supplementary systemd drop-in config file (see https://wiki.archlinux.org/title/Systemd#Drop-in_files).
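As a rough illustration of the drop-in approach, the sketch below renders a drop-in file that sets environment variables for a service. It is not the code from this PR: the function names, the unit name deadline-worker.service, the file name 10-env.conf, and the variable names in the usage lines are all assumptions made for the example.

```python
from pathlib import PurePosixPath


def render_dropin(env: dict[str, str]) -> str:
    """Render the text of a systemd drop-in that sets environment variables.

    Hypothetical helper for illustration; not part of deadline_test_fixtures.
    """
    lines = ["[Service]"]
    for key, value in sorted(env.items()):
        # systemd expands "%" specifiers in Environment=, so literal "%" is doubled.
        # Backslashes and double quotes are escaped to keep the quoted assignment valid.
        escaped = (
            value.replace("%", "%%").replace("\\", "\\\\").replace('"', '\\"')
        )
        lines.append(f'Environment="{key}={escaped}"')
    return "\n".join(lines) + "\n"


def dropin_path(unit: str, name: str = "10-env.conf") -> PurePosixPath:
    """Return where a drop-in for `unit` would live (file name is an assumption)."""
    return PurePosixPath("/etc/systemd/system") / f"{unit}.d" / name


# Illustrative usage with made-up values.
content = render_dropin({"AWS_DEFAULT_REGION": "us-west-2"})
target = dropin_path("deadline-worker.service")
print(target)
print(content, end="")
```

After a drop-in file is written, systemd needs a `systemctl daemon-reload` before the next start of the service picks it up.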
I've also modified the default for DeadlineWorkerConfiguration.start_service from False → True since most tests relied on the worker being started. Downstream consumers that relied on the worker agent being started by the SSM command will not need to modify their test code, and this is the sensible default.
What is the impact of this change?
Tests using this library will no longer run into conflicts caused by multiple worker agents running for the same worker ID.
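One way a test could check for the duplicate-process condition is to scan process listings on the worker host. The sketch below is a crude substring match over text shaped like the output of `ps -eo pid,args`. The function name and the sample listing are hypothetical, and the marker assumes the agent's command line contains deadline-worker-agent.

```python
def worker_agent_pids(ps_output: str, marker: str = "deadline-worker-agent") -> list[int]:
    """Return PIDs of processes whose command line contains `marker`.

    Expects lines shaped like `ps -eo pid,args` output. Hypothetical helper.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and blank lines: the first column must be a numeric PID.
        if len(parts) == 2 and parts[0].isdigit() and marker in parts[1]:
            pids.append(int(parts[0]))
    return pids


# Made-up listing for illustration only.
sample = (
    "  PID COMMAND\n"
    "  101 /usr/bin/python3 /opt/venv/bin/deadline-worker-agent\n"
    "  202 sshd: ec2-user\n"
)
pids = worker_agent_pids(sample)
if len(pids) > 1:
    raise RuntimeError(f"multiple worker agent processes found: {pids}")
```

A substring match can produce false positives (for example, an editor that has a file with that name open), so a real check would likely need to be stricter.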
How was this change tested?
Checked out the mainline branch of https://github.com/aws-deadline/deadline-cloud-worker-agent
Created the hatch environments (e.g. ran hatch run fmt)
Installed the modified version of deadline-cloud-test-fixtures from this branch
Ran the E2E tests using the instructions from DEVELOPMENT.md
Was this change documented?
No
Is this a breaking change?
Yes: DeadlineWorkerConfiguration.start_service was changed from False → True
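For consumers who need the previous behavior, passing the flag explicitly should be enough. The sketch below uses a simplified stand-in dataclass to show the effect of the new default. WorkerConfigSketch and its farm_id and fleet_id fields are hypothetical; the real DeadlineWorkerConfiguration has a different and larger set of fields, and only start_service is taken from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerConfigSketch:
    """Hypothetical stand-in for DeadlineWorkerConfiguration (fields are assumed)."""

    farm_id: str
    fleet_id: str
    # Default after this change; it was False before.
    start_service: bool = True


# Relying on the new default: the worker agent service is started.
default_cfg = WorkerConfigSketch(farm_id="farm-123", fleet_id="fleet-456")

# Opting out explicitly, for tests that start the worker agent themselves.
manual_cfg = WorkerConfigSketch(
    farm_id="farm-123", fleet_id="fleet-456", start_service=False
)

print(default_cfg.start_service, manual_cfg.start_service)
```

Setting the flag explicitly in test code also keeps the test's intent clear if the default changes again.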
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.