-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelize Synapse worker template render #2759
base: master
Are you sure you want to change the base?
Parallelize Synapse worker template render #2759
Conversation
This is quite interesting and it seems nicely done! I'm just worried about the additional complication it introduces. Seems like this is mostly helpful for very large deployments. And also possibly with a high latency between the Ansible controller and remote server. BenchmarksI'll add my own benchmark as well to serve as extra data points. My Ansible controller is my mini PC, which has an AMD Ryzen 9 6900HX CPU. I'm testing against 2 different servers - doing the tests one by one. Command: This is like
The After (async, It seems like when the total number of workers is low and the server latency is tiny, doing things asynchronously does not help much. I expect that this PR has some complexity cost which won't help most users of the playbook, but may certainly help some large-scale deployments. As your benchmarks indicate, when targeting a server with 130 workers, it makes a big difference. My benchmarks do not target such a large large deployment, but I trust your results. It's probably a good idea to confirm if your benchmarks used Areas for improvementSome areas for improvement for this PR might be:
|
Thanks Slavi, and appreciate the detailed feedback! Yeah, most of the benefit comes from large deployments that have a lot of generic workers or event stream workers. I think smaller deployments might also benefit from having a couple more generic workers than using the I re-ran one of the benchmarks with the suggested bugfix , and here is what I got:
In regards to areas of improvement, I agree with your suggestions and will follow up later to add them along with including changes to documentation for the relevant docs articles. Thanks again for the feedback! |
Here's a different crazy idea for parallelizing Synapse workers. I've seen some Ansible resources suggest creating multiple entries in your inventory for a single machine. For example, maybe you want to run two different instances of a given service, with two totally different configurations. Apparently one way to do it is to create two inventory entries and pretend that those are two totally separate hosts. (I don't know enough about Ansible to know whether this is an accepted best practice or not.) So, along those lines: Might it be possible to create multiple fake "hosts" under In the limit, you could create one inventory entry for each Synapse worker, and then your parallelism would be limited only by the number of SSH connections that Ansible is willing to run. In reality that's probably crazy. But for ~128ish workers, it might be feasible to have 16 "virtual" hosts running 8 workers each. |
Splitting one host into multiple "virtual" hosts has the following problems:
In step 2 of setting up workers, we delete anything we don't recognize as a valid worker for the current configuration. If there are multiple Ansible instances operating on the machine, some will stop & delete worker configuration done by other instances. |
I noticed there are a few areas in the playbook that could benefit from running tasks in parallel rather than serially, and I've had significant time savings running the roles from doing so . I started with updating the Synapse install role by rendering worker configuration and service files in parallel rather than serially and wanted to share my changes for feedback and/or merging in the playbook if this is of interest.
Some quick benchmarks for running the
install-synapse
tag with enabled workers:Some things to note regarding the implementation:
ansible.builtin.template
unfortunately does not support running the task asynchronously directly in the playbook, so the logic for launching the templating tasks are required to use shell commands and passing in the required environment variables.MaxSessions
configuration option in your server if you run into such errors.If there is interest in parallelizing more roles in the playbook, let me know as I think there are other roles I could try running tasks in parallel that could benefit and share further contributions.