NodeJS Cluster support #431

mupperton · 2024-09-19T14:10:23Z

I regularly use the NodeJS cluster module in "normal" API services: JS is single-threaded, and we want to use all available parallelism on servers that have multiple cores/threads.

However, this does not appear to work as expected for a restate-registered service.

My observation is that requests from the restate server to the NodeJS service behave as if there were a "sticky session" or some kind of Keep-Alive: within a short time span they always reach the same worker process. Only after roughly 90 seconds without requests (from basic testing) is another worker process used instead, and that worker then becomes the sticky one until another ~90 seconds of inactivity have passed.

This ultimately defeats the point of the cluster module: it is designed to improve concurrency, but currently all concurrent requests are handled by the same worker.

Likely this is a side effect of HTTP2 being used?

I haven't tried this with another runtime like Bun

igalshilman · 2024-09-25T07:25:37Z

I'm not familiar with the cluster model for node, I'll take a look!

> Likely this is a side effect of HTTP2 being used?

This could be the case: a single TCP connection is established, and invocations are then multiplexed within it as h2 streams.

Meanwhile I'd like to propose some alternatives:

More pods

If you are running on k8s, is it possible to deploy more pods instead?

If you want to get fancy, you can even combine restate with knative (here is a blog post that describes this approach: https://knative.dev/blog/articles/building-stateful-applications-with-knative-and-restate/). This gives you scale out/down and load balancing out of the box, even with http2.

Another option is to add more pods and run restate + envoy (single pod) for http2 load balancing to your node applications (ask me more if this is relevant).

Alternatively, if you are running on bare metal, consider deploying more NodeJS processes with Nginx/Caddy as a reverse proxy in front of them (all on the same box, reverse proxying to localhost).

igalshilman · 2024-09-25T07:37:04Z

One additional thought:

Is your use case CPU bound? If so, maybe consider the Worker API (https://nodejs.org/api/worker_threads.html) to isolate the CPU-intensive parts from the business logic?
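
As a rough illustration of the Worker API suggestion, here is a minimal sketch that moves a CPU-heavy computation onto a worker thread so the main event loop stays free. The `fibInWorker` helper and the naive Fibonacci function are hypothetical stand-ins for the real CPU-intensive part; nothing here is specific to restate.

```javascript
const { Worker } = require("node:worker_threads");

// Source of the worker, inlined so the sketch fits in one file.
// The naive Fibonacci is only a placeholder for real CPU-bound work.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`;

// Hypothetical helper: runs fib(n) on a separate thread and resolves with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      // A non-zero exit code means the worker died before posting a result.
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

A handler could then `await fibInWorker(30)` while other requests keep being served on the main thread. Spawning a worker per call keeps the sketch short; a real service would more likely reuse a small pool of workers.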

igalshilman · 2024-09-25T09:33:53Z

Can confirm that this is indeed due to HTTP2: the cluster module load balances per physical TCP connection, while HTTP2 keeps a single TCP connection and multiplexes all streams over it.

I've tried to look at ways to deal with this, and it seems that they require application side (pretty complicated) load balancing. Let me know if the alternative approaches are enough.
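
The single-connection behavior described above can be observed with a small sketch using Node's built-in `http2` module. The `countConnections` helper is hypothetical: it starts a plaintext HTTP2 server in-process, sends `n` requests (with `n` at least 1) over one client session, and reports how many sessions (one per TCP connection) and how many streams the server saw. It only illustrates the multiplexing and is not how restate itself is implemented.

```javascript
const http2 = require("node:http2");

// Hypothetical helper: sends `n` (>= 1) requests over ONE HTTP/2 client session
// and reports what the server observed.
function countConnections(n) {
  return new Promise((resolve, reject) => {
    let connections = 0; // Http2Session count; each session wraps one TCP connection
    let streams = 0;     // one stream per request

    const server = http2.createServer();
    server.on("session", () => { connections += 1; });
    server.on("stream", (stream) => {
      streams += 1;
      stream.respond({ ":status": 200 });
      stream.end("ok");
    });
    server.on("error", reject);

    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      const client = http2.connect(`http://127.0.0.1:${port}`);
      client.on("error", reject);

      let finished = 0;
      for (let i = 0; i < n; i += 1) {
        const req = client.request({ ":path": "/" });
        req.resume(); // discard the response body
        req.on("close", () => {
          finished += 1;
          if (finished === n) {
            client.close();
            server.close();
            resolve({ connections, streams });
          }
        });
        req.end();
      }
    });
  });
}
```

Calling `countConnections(5)` resolves with one connection and five streams. Since the cluster module hands out work per accepted connection, all five requests would end up on whichever worker accepted that one connection.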
