
tools for managing a 'fleet' of processes #150

Open
jrochkind opened this issue Sep 21, 2020 · 13 comments

@jrochkind
Contributor

Hi, we were talking on reddit some time ago and I suggested it would be useful to have tools for managing a 'fleet' or cluster of separate worker processes -- since on MRI that's the only way to take advantage of multiple cores, which you probably want to be doing when you have a separate host just for background workers, as is usually the case at even moderate scale.

We agreed it's a bit tricky to figure out how to implement that, especially for those of us not experienced in "systems programming".

Recently someone brought this project to my attention, which hypothetically takes care of it for you! https://github.com/stripe/einhorn

It's a bit under-documented (and the README basically says "you're welcome to use this, but don't ask us questions or bother filing bug reports without PRs"), but I've been playing with it a bit and looking at the code, and it looks really nice!

The only real requirement it imposes is that your worker processes treat a USR2 signal as a request to do a graceful shutdown. I'm mentioning this partly to get it on the record, so you don't accidentally use USR2 for anything else and later need a backwards-incompatible change to become einhorn-compatible. :( (Resque uses USR2 for something else, alas. Sidekiq handles USR2 appropriately for einhorn, I think because sidekiq-enterprise actually uses einhorn.)
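For concreteness, here's a minimal sketch of that contract (the `GracefulWorker` class and its toy "doubling" job are made up for illustration, not GoodJob's actual code): the worker traps SIGUSR2 and stops picking up new work once it has been asked to quit.

```ruby
# Hypothetical worker that treats SIGUSR2 as a graceful-shutdown request,
# which is the one requirement einhorn places on worker processes.
class GracefulWorker
  def initialize
    @shutdown = false
    # einhorn sends USR2 to ask a worker to wind down gracefully.
    Signal.trap("USR2") { @shutdown = true }
  end

  # Process jobs until the list is exhausted or a shutdown was requested.
  def run(jobs)
    done = []
    jobs.each do |job|
      break if @shutdown   # stop picking up new work once asked to quit
      done << job * 2      # stand-in for real job processing
    end
    done
  end
end
```

A real worker would also want to finish (or re-enqueue) any in-flight job before exiting, but the signal-handling shape is the same.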

@bensheldon
Owner

I've been thinking more about this lately.

I was searching for projects, and the forked gem looks like it might fit the bill, though I didn't see anything in particular about zombie management, which is something I'd like to trust is taken care of (the complexity of that is also why I'm eager to find a maintained gem that can do it for me).

I also like the look of https://github.com/salsify/delayed_job_worker_pool

@jrochkind
Contributor Author

@bensheldon Einhorn's lack of maintenance makes you reluctant? It does seem to be pretty sophisticated code. It's too bad I can't find an equally high-quality option that is maintained, either.

@sandstrom

If this isn't an issue anymore, maybe we could close it or move it to a discussion.

@bensheldon
Owner

I'm going to close this Issue for now, but am open to continuing the conversation. I do think that having a complete Puma-like fork+multithreaded executable would be really nice, but don't plan to implement that myself in the near future.

@rgaufman

Why not just use systemd for this? I've played around with a lot of different tools for forking and managing processes, including Bluepill, God, and Eye. Eye was the best, but it was still significantly more resource-intensive. Even with Sidekiq, I just run systemctl start sidekiq, which starts all my Sidekiq processes (e.g. sidekiq@worker1, sidekiq@worker2, etc.); stop conversely stops them all.

It's not like there is a need for a shared socket with job processing.
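For readers unfamiliar with the pattern, the sidekiq@worker1 / sidekiq@worker2 naming above comes from a systemd template unit. A sketch along these lines (the unit name, paths, and user are placeholders, not taken from this thread) would give good_job the same treatment:

```ini
# Hypothetical /etc/systemd/system/good_job@.service
# Each instance (good_job@1, good_job@2, ...) is an independent worker process.
[Unit]
Description=GoodJob worker %i
After=network.target

[Service]
Type=simple
User=deployer
WorkingDirectory=/var/www/app/current
ExecStart=/usr/local/bin/bundle exec good_job start
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then systemctl start good_job@1 good_job@2 starts two workers, and the matching stop command stops them; Restart=on-failure handles the keep-alive duty a process manager would otherwise own.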

@bensheldon
Owner

Memory. Precious memory, especially in containerized environments.

Also, I agree on systemd, but people want to daemonize 🤷‍♀️

@rgaufman

rgaufman commented Jul 13, 2022

How would the forked gem save memory vs. starting 2 processes with systemd?

Hmm, in dev I "daemonize" with foreman; in prod, systemd :) I can understand why this is useful when you need to share a single socket (at the expense of some memory!), but I still don't see how it would save memory in this case.

For example, with Puma: you start a single worker and it takes 7% of RAM; you start a 2-worker cluster and it takes 21% (!!):

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21339 deployer  20   0  568076 279440  13304 S  35.9 7.0 628:27.11 puma: cluster worker 0: 4963 [current]
21342 deployer  20   0  548076 278540  13328 S  34.6 7.0 632:45.05 puma: cluster worker 1: 4963 [current]
 4963 deployer  20   0  558076 278440  22048 S   1.0 7.0 881:10.39 puma 5.6.4 (tcp://192.168.187.71:3000) [current]

An extra 7% of RAM wasted on the process manager.

@bensheldon
Owner

Are you using Puma's preload_app!?

Puma has a lot of other interesting copy-on-write optimizations: https://github.com/puma/puma/blob/master/docs/fork_worker.md
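For context, preload_app! is set in Puma's configuration file. A typical sketch (worker and thread counts are arbitrary here, and the on_worker_boot body is illustrative):

```ruby
# config/puma.rb (sketch). preload_app! boots the application once in the
# master process before forking, so workers share those memory pages
# copy-on-write instead of each loading their own copy.
workers 2
threads 5, 5
preload_app!

on_worker_boot do
  # Per-process resources (like database connections) must be re-established
  # after fork, since file descriptors can't be safely shared across workers.
  # ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```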

@rgaufman

rgaufman commented Jul 13, 2022

Yes, I am. Interesting, will have a read.

@bensheldon reopened this Jul 13, 2022
@rgaufman

"fork_worker option and refork command for reduced memory usage by forking from a worker process instead of the master process." Ah, ok, so no more master process saves 7% of RAM, but it will still take the equivalent of starting 2 processes, so no saving in the case of good_job, from what I understand?

@bensheldon
Owner

Sorry, I meant to emphasize preload_app!. That's what saves memory through copy-on-write. The different forking strategies I linked to are further attempts to optimize loading as many Ruby constants as possible before forking.
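The copy-on-write effect can be seen in miniature with plain fork (a toy sketch, not Puma or GoodJob code; requires a platform with fork, so MRI on Linux/macOS):

```ruby
# Data allocated before forking ("preloaded") exists once in the parent;
# forked children read it through copy-on-write pages rather than
# re-allocating it per process.
BIG = Array.new(100_000) { |i| i.to_s }

pid = fork do
  # Reading BIG does not duplicate it; the OS only copies a page
  # if the child writes to it.
  exit(BIG.size == 100_000 ? 0 : 1)
end
Process.wait(pid)
puts $?.exitstatus  # 0: the child saw the preloaded data
```

Without preloading, each worker would build its own copy of the application's constants, which is the 7%-per-process cost in the top output above.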

@rgaufman

Interesting, just having a read through this: https://shopify.engineering/ruby-execution-models

@jrochkind
Contributor Author

jrochkind commented Jul 22, 2022

Note that einhorn, the Ruby tool to "run (and keep alive) multiple copies of a single long-lived process", originally from Stripe and for a long time basically unmaintained, has now been adopted by mperham of Sidekiq.

I believe einhorn is used by Sidekiq Pro for managing multiple Sidekiq worker processes, and it probably could be used by good_job as well, perhaps with a few tweaks to good_job, like interpreting SIGUSR2 as a graceful-shutdown request, and possibly more to take full advantage of features like the pre-forking management built into einhorn.

Status: Prioritized Backlog
4 participants