[🐛 BUG]: Jobs plugin gets stuck with many workers and many pollers #1568

Closed
1 task done
Kaspiman opened this issue May 20, 2023 · 9 comments · Fixed by roadrunner-server/jobs#78
Labels: B-bug (Bug: bug, exception)

@Kaspiman
No duplicates 🥲.

  • I have searched for a similar issue in our bug tracker and didn't find any solutions.

What happened?

A bug happened!

I am running the Jobs plugin with a large number of workers and pollers, for example num_workers: 256 and num_pollers: 512.

Next, I send messages to several queues and expect all workers to be busy. Instead, only a couple of workers are working; the rest remain in the "ready" state. This continues indefinitely and does not improve over time (see the metrics plugin output). Restarting multiple times doesn't change the workers' behavior.

The full RR log is attached. Pay attention to the messages received in the pipeline, the number of running workers, and the number of completed tasks.
rr.log
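
To see at a glance how many workers are in each state, the rr_jobs_worker_state lines from the metrics output can be tallied. The following is a minimal Python sketch, not part of RR. The helper name count_worker_states is hypothetical, and it assumes the label order pid, state exactly as it appears in the metrics output in this report.

```python
import re
from collections import Counter

# Matches lines such as: rr_jobs_worker_state{pid="7846",state="working"} 0
# Assumption: labels appear in the order pid, state, as in this report.
STATE_LINE = re.compile(r'^rr_jobs_worker_state\{pid="(\d+)",state="([^"]+)"\}\s')


def count_worker_states(metrics_text: str) -> Counter:
    """Return how many workers report each state label."""
    states = Counter()
    for line in metrics_text.splitlines():
        match = STATE_LINE.match(line.strip())
        if match:
            states[match.group(2)] += 1
    return states


if __name__ == "__main__":
    # Small sample in the same shape as the metrics plugin output.
    sample = "\n".join([
        "# TYPE rr_jobs_worker_state gauge",
        'rr_jobs_worker_state{pid="7845",state="ready"} 0',
        'rr_jobs_worker_state{pid="7846",state="working"} 0',
        'rr_jobs_worker_state{pid="7847",state="ready"} 0',
    ])
    for state, count in count_worker_states(sample).items():
        print(state, count)
```

Run against the full metrics output, the per-state counts make the imbalance between "ready" and "working" workers easy to compare across restarts.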

Version (rr --version)

rr2 version 2023.1.3 (build time: 2023-05-11T12:32:12+0000, go1.20.4), OS: linux, arch: amd64

How to reproduce the issue?

Create a dummy worker with one handler per queue, where the handlers sleep for 7, 5, and 10 respectively. Add many tasks to the queues test-1, test-2 and test-3.
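
As a rough sanity check on what full utilization would look like, the expected number of completed jobs can be estimated from the worker count, the uptime, and the handler sleeps. This is a back-of-the-envelope Python sketch under stated assumptions: the sleeps are taken to be in seconds (the report does not state the unit), tasks are assumed to be spread evenly over the three queues, and the uptime of 131 is taken from rr_http_uptime_seconds in the metrics output of this report. The function names are hypothetical, and startup and dispatch overhead are ignored.

```python
from typing import Sequence


def expected_jobs(workers: int, uptime_s: float, sleeps_s: Sequence[float]) -> float:
    """Jobs finished in uptime_s if every worker is busy the whole time."""
    mean_sleep = sum(sleeps_s) / len(sleeps_s)
    return workers * uptime_s / mean_sleep


def average_busy_workers(jobs_done: int, uptime_s: float, sleeps_s: Sequence[float]) -> float:
    """Average number of workers that must have been busy to finish jobs_done."""
    mean_sleep = sum(sleeps_s) / len(sleeps_s)
    return jobs_done * mean_sleep / uptime_s


if __name__ == "__main__":
    sleeps = [7, 5, 10]  # assumed to be seconds, one value per queue
    # 256 workers and 131 s uptime come from the config and metrics in this report.
    print(round(expected_jobs(256, 131, sleeps)))
    # 27 is the rr_jobs_jobs_ok value from the metrics in this report.
    print(round(average_busy_workers(27, 131, sleeps), 2))
```

Under these assumptions, 256 always-busy workers would finish on the order of 4,500 jobs in 131 seconds, while the 27 jobs reported by rr_jobs_jobs_ok correspond to roughly one or two workers being busy on average.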

Use config like this:

version: '3'
rpc:
    listen: 'tcp://127.0.0.1:6001'
metrics:
    address: '0.0.0.0:9254'
server:
    command: '/usr/local/bin/php public/worker.php'
amqp:
    addr: 'amqp://guest:guest@rabbitmq:5672'

endure:
    grace_period: 600s

logs:
    level: debug
    mode: development
    output: stdout
jobs:
    num_pollers: 512
    pool:
        command: '/usr/local/bin/php public/worker.php'
        num_workers: 256
        max_jobs: 1000
        destroy_timeout: 300s
    consume:
        - test-1
        - test-2
        - test-3
    pipelines:
        test-1:
            driver: amqp
            config:
                queue: test-1
                prefetch: 256
                priority: 10
                exchange: default
                redial_timeout: 60
                routing_key: test-1
                durable: true
                exchange_durable: true
                consume_all: true
        test-2:
            driver: amqp
            config:
                queue: test-2
                prefetch: 256
                priority: 10
                exchange: default
                redial_timeout: 30
                routing_key: test-2
                durable: true
                exchange_durable: true
                consume_all: true
        test-3:
            driver: amqp
            config:
                queue: test-3
                prefetch: 256
                priority: 10
                exchange: default
                redial_timeout: 30
                routing_key: test-3
                durable: true
                exchange_durable: true
                consume_all: true

Relevant log output

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.0929e-05
go_gc_duration_seconds{quantile="0.25"} 7.721e-05
go_gc_duration_seconds{quantile="0.5"} 0.000257334
go_gc_duration_seconds{quantile="0.75"} 0.001486797
go_gc_duration_seconds{quantile="1"} 0.002248016
go_gc_duration_seconds_sum 0.004268292
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 554
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.20.4"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.2721888e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.79442e+07
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.46069e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 191500
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 9.737296e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.2721888e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 1.5179776e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 4.6882816e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 81140
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 1.851392e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.2062592e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6845995197687607e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 272640
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 24000
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 31200
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 577440
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 767040
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 7.0220144e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 4.732118e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.343488e+07
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.343488e+07
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 9.2225816e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 271
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.49
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 782
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.016832e+08
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.68459944851e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 8.50771968e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP rr_http_requests_queue Total number of queued requests.
# TYPE rr_http_requests_queue gauge
rr_http_requests_queue 0
# HELP rr_http_uptime_seconds Uptime in seconds
# TYPE rr_http_uptime_seconds counter
rr_http_uptime_seconds 131
# HELP rr_jobs_jobs_err Number of jobs error while processing in the worker
# TYPE rr_jobs_jobs_err gauge
rr_jobs_jobs_err 0
# HELP rr_jobs_jobs_ok Number of successfully processed jobs
# TYPE rr_jobs_jobs_ok gauge
rr_jobs_jobs_ok 27
# HELP rr_jobs_push_err Number of jobs push which was failed
# TYPE rr_jobs_push_err gauge
rr_jobs_push_err 0
# HELP rr_jobs_push_ok Number of job push
# TYPE rr_jobs_push_ok gauge
rr_jobs_push_ok 0
# HELP rr_jobs_total_workers Total number of workers used by the plugin
# TYPE rr_jobs_total_workers gauge
rr_jobs_total_workers 256
# HELP rr_jobs_worker_memory_bytes Worker current memory usage
# TYPE rr_jobs_worker_memory_bytes gauge
rr_jobs_worker_memory_bytes{pid="7770"} 5.3936128e+07
rr_jobs_worker_memory_bytes{pid="7782"} 5.3919744e+07
rr_jobs_worker_memory_bytes{pid="7783"} 5.3805056e+07
rr_jobs_worker_memory_bytes{pid="7784"} 5.4079488e+07
rr_jobs_worker_memory_bytes{pid="7785"} 5.4214656e+07
rr_jobs_worker_memory_bytes{pid="7786"} 5.394432e+07
rr_jobs_worker_memory_bytes{pid="7787"} 5.3850112e+07
rr_jobs_worker_memory_bytes{pid="7788"} 5.5431168e+07
rr_jobs_worker_memory_bytes{pid="7789"} 5.402624e+07
rr_jobs_worker_memory_bytes{pid="7790"} 5.3977088e+07
rr_jobs_worker_memory_bytes{pid="7791"} 5.3792768e+07
rr_jobs_worker_memory_bytes{pid="7792"} 5.5513088e+07
rr_jobs_worker_memory_bytes{pid="7793"} 5.3968896e+07
rr_jobs_worker_memory_bytes{pid="7794"} 5.5427072e+07
rr_jobs_worker_memory_bytes{pid="7795"} 5.400576e+07
rr_jobs_worker_memory_bytes{pid="7796"} 5.4259712e+07
rr_jobs_worker_memory_bytes{pid="7797"} 5.3878784e+07
rr_jobs_worker_memory_bytes{pid="7798"} 5.521408e+07
rr_jobs_worker_memory_bytes{pid="7799"} 5.3641216e+07
rr_jobs_worker_memory_bytes{pid="7800"} 5.5332864e+07
rr_jobs_worker_memory_bytes{pid="7801"} 5.3907456e+07
rr_jobs_worker_memory_bytes{pid="7802"} 5.4165504e+07
rr_jobs_worker_memory_bytes{pid="7803"} 5.4079488e+07
rr_jobs_worker_memory_bytes{pid="7804"} 5.3866496e+07
rr_jobs_worker_memory_bytes{pid="7805"} 5.3919744e+07
rr_jobs_worker_memory_bytes{pid="7806"} 5.4329344e+07
rr_jobs_worker_memory_bytes{pid="7807"} 5.412864e+07
rr_jobs_worker_memory_bytes{pid="7808"} 5.5484416e+07
rr_jobs_worker_memory_bytes{pid="7809"} 5.4161408e+07
rr_jobs_worker_memory_bytes{pid="7810"} 5.5054336e+07
rr_jobs_worker_memory_bytes{pid="7811"} 5.3997568e+07
rr_jobs_worker_memory_bytes{pid="7812"} 5.4960128e+07
rr_jobs_worker_memory_bytes{pid="7813"} 5.376e+07
rr_jobs_worker_memory_bytes{pid="7814"} 5.4145024e+07
rr_jobs_worker_memory_bytes{pid="7815"} 5.5554048e+07
rr_jobs_worker_memory_bytes{pid="7816"} 5.3927936e+07
rr_jobs_worker_memory_bytes{pid="7817"} 5.5570432e+07
rr_jobs_worker_memory_bytes{pid="7818"} 5.3977088e+07
rr_jobs_worker_memory_bytes{pid="7819"} 5.3690368e+07
rr_jobs_worker_memory_bytes{pid="7820"} 5.4161408e+07
rr_jobs_worker_memory_bytes{pid="7821"} 5.5291904e+07
rr_jobs_worker_memory_bytes{pid="7822"} 5.3723136e+07
rr_jobs_worker_memory_bytes{pid="7823"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7824"} 5.3911552e+07
rr_jobs_worker_memory_bytes{pid="7825"} 5.5533568e+07
rr_jobs_worker_memory_bytes{pid="7826"} 5.3927936e+07
rr_jobs_worker_memory_bytes{pid="7827"} 5.3592064e+07
rr_jobs_worker_memory_bytes{pid="7828"} 5.3899264e+07
rr_jobs_worker_memory_bytes{pid="7829"} 5.3948416e+07
rr_jobs_worker_memory_bytes{pid="7830"} 5.404672e+07
rr_jobs_worker_memory_bytes{pid="7831"} 5.4140928e+07
rr_jobs_worker_memory_bytes{pid="7832"} 5.3874688e+07
rr_jobs_worker_memory_bytes{pid="7833"} 5.3866496e+07
rr_jobs_worker_memory_bytes{pid="7834"} 5.4263808e+07
rr_jobs_worker_memory_bytes{pid="7835"} 5.359616e+07
rr_jobs_worker_memory_bytes{pid="7836"} 5.4120448e+07
rr_jobs_worker_memory_bytes{pid="7837"} 5.412864e+07
rr_jobs_worker_memory_bytes{pid="7838"} 5.4321152e+07
rr_jobs_worker_memory_bytes{pid="7839"} 5.400576e+07
rr_jobs_worker_memory_bytes{pid="7840"} 5.36576e+07
rr_jobs_worker_memory_bytes{pid="7841"} 5.3936128e+07
rr_jobs_worker_memory_bytes{pid="7842"} 5.5341056e+07
rr_jobs_worker_memory_bytes{pid="7843"} 5.4153216e+07
rr_jobs_worker_memory_bytes{pid="7844"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7845"} 5.4206464e+07
rr_jobs_worker_memory_bytes{pid="7846"} 5.4837248e+07
rr_jobs_worker_memory_bytes{pid="7847"} 5.4321152e+07
rr_jobs_worker_memory_bytes{pid="7848"} 5.3592064e+07
rr_jobs_worker_memory_bytes{pid="7849"} 5.4132736e+07
rr_jobs_worker_memory_bytes{pid="7850"} 5.3850112e+07
rr_jobs_worker_memory_bytes{pid="7851"} 5.4001664e+07
rr_jobs_worker_memory_bytes{pid="7852"} 5.3653504e+07
rr_jobs_worker_memory_bytes{pid="7853"} 5.38624e+07
rr_jobs_worker_memory_bytes{pid="7854"} 5.4091776e+07
rr_jobs_worker_memory_bytes{pid="7855"} 5.4112256e+07
rr_jobs_worker_memory_bytes{pid="7856"} 5.4153216e+07
rr_jobs_worker_memory_bytes{pid="7857"} 5.3772288e+07
rr_jobs_worker_memory_bytes{pid="7858"} 5.392384e+07
rr_jobs_worker_memory_bytes{pid="7859"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7860"} 5.394432e+07
rr_jobs_worker_memory_bytes{pid="7861"} 5.3723136e+07
rr_jobs_worker_memory_bytes{pid="7862"} 5.4198272e+07
rr_jobs_worker_memory_bytes{pid="7863"} 5.4079488e+07
rr_jobs_worker_memory_bytes{pid="7864"} 5.4050816e+07
rr_jobs_worker_memory_bytes{pid="7865"} 5.3653504e+07
rr_jobs_worker_memory_bytes{pid="7866"} 5.4095872e+07
rr_jobs_worker_memory_bytes{pid="7867"} 5.4308864e+07
rr_jobs_worker_memory_bytes{pid="7868"} 5.4112256e+07
rr_jobs_worker_memory_bytes{pid="7869"} 5.3653504e+07
rr_jobs_worker_memory_bytes{pid="7870"} 5.4239232e+07
rr_jobs_worker_memory_bytes{pid="7871"} 5.3899264e+07
rr_jobs_worker_memory_bytes{pid="7872"} 5.4235136e+07
rr_jobs_worker_memory_bytes{pid="7873"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7874"} 5.3837824e+07
rr_jobs_worker_memory_bytes{pid="7875"} 5.394432e+07
rr_jobs_worker_memory_bytes{pid="7876"} 5.3714944e+07
rr_jobs_worker_memory_bytes{pid="7877"} 5.388288e+07
rr_jobs_worker_memory_bytes{pid="7878"} 5.3633024e+07
rr_jobs_worker_memory_bytes{pid="7879"} 5.3604352e+07
rr_jobs_worker_memory_bytes{pid="7880"} 5.421056e+07
rr_jobs_worker_memory_bytes{pid="7881"} 5.3772288e+07
rr_jobs_worker_memory_bytes{pid="7882"} 5.3542912e+07
rr_jobs_worker_memory_bytes{pid="7883"} 5.40672e+07
rr_jobs_worker_memory_bytes{pid="7884"} 5.4095872e+07
rr_jobs_worker_memory_bytes{pid="7885"} 5.361664e+07
rr_jobs_worker_memory_bytes{pid="7886"} 5.5345152e+07
rr_jobs_worker_memory_bytes{pid="7887"} 5.3710848e+07
rr_jobs_worker_memory_bytes{pid="7888"} 5.4063104e+07
rr_jobs_worker_memory_bytes{pid="7889"} 5.5382016e+07
rr_jobs_worker_memory_bytes{pid="7890"} 5.404672e+07
rr_jobs_worker_memory_bytes{pid="7891"} 5.3600256e+07
rr_jobs_worker_memory_bytes{pid="7892"} 5.4054912e+07
rr_jobs_worker_memory_bytes{pid="7893"} 5.431296e+07
rr_jobs_worker_memory_bytes{pid="7894"} 5.4099968e+07
rr_jobs_worker_memory_bytes{pid="7895"} 5.4042624e+07
rr_jobs_worker_memory_bytes{pid="7896"} 5.3936128e+07
rr_jobs_worker_memory_bytes{pid="7897"} 5.38624e+07
rr_jobs_worker_memory_bytes{pid="7898"} 5.3948416e+07
rr_jobs_worker_memory_bytes{pid="7899"} 5.4001664e+07
rr_jobs_worker_memory_bytes{pid="7900"} 5.4050816e+07
rr_jobs_worker_memory_bytes{pid="7901"} 5.3927936e+07
rr_jobs_worker_memory_bytes{pid="7902"} 5.4177792e+07
rr_jobs_worker_memory_bytes{pid="7903"} 5.3895168e+07
rr_jobs_worker_memory_bytes{pid="7904"} 5.3981184e+07
rr_jobs_worker_memory_bytes{pid="7905"} 5.41696e+07
rr_jobs_worker_memory_bytes{pid="7906"} 5.4145024e+07
rr_jobs_worker_memory_bytes{pid="7907"} 5.3551104e+07
rr_jobs_worker_memory_bytes{pid="7908"} 5.5422976e+07
rr_jobs_worker_memory_bytes{pid="7909"} 5.4018048e+07
rr_jobs_worker_memory_bytes{pid="7910"} 5.3661696e+07
rr_jobs_worker_memory_bytes{pid="7911"} 5.4124544e+07
rr_jobs_worker_memory_bytes{pid="7912"} 5.4235136e+07
rr_jobs_worker_memory_bytes{pid="7913"} 5.3837824e+07
rr_jobs_worker_memory_bytes{pid="7914"} 5.4263808e+07
rr_jobs_worker_memory_bytes{pid="7915"} 5.3764096e+07
rr_jobs_worker_memory_bytes{pid="7916"} 5.4239232e+07
rr_jobs_worker_memory_bytes{pid="7917"} 5.4091776e+07
rr_jobs_worker_memory_bytes{pid="7918"} 5.3899264e+07
rr_jobs_worker_memory_bytes{pid="7919"} 5.4030336e+07
rr_jobs_worker_memory_bytes{pid="7920"} 5.4267904e+07
rr_jobs_worker_memory_bytes{pid="7921"} 5.4018048e+07
rr_jobs_worker_memory_bytes{pid="7922"} 5.5046144e+07
rr_jobs_worker_memory_bytes{pid="7923"} 5.4222848e+07
rr_jobs_worker_memory_bytes{pid="7924"} 5.4083584e+07
rr_jobs_worker_memory_bytes{pid="7925"} 5.3997568e+07
rr_jobs_worker_memory_bytes{pid="7926"} 5.4018048e+07
rr_jobs_worker_memory_bytes{pid="7927"} 5.4267904e+07
rr_jobs_worker_memory_bytes{pid="7928"} 5.5406592e+07
rr_jobs_worker_memory_bytes{pid="7929"} 5.4308864e+07
rr_jobs_worker_memory_bytes{pid="7930"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7931"} 5.3743616e+07
rr_jobs_worker_memory_bytes{pid="7932"} 5.371904e+07
rr_jobs_worker_memory_bytes{pid="7933"} 5.3723136e+07
rr_jobs_worker_memory_bytes{pid="7934"} 5.3895168e+07
rr_jobs_worker_memory_bytes{pid="7935"} 5.3653504e+07
rr_jobs_worker_memory_bytes{pid="7936"} 5.4145024e+07
rr_jobs_worker_memory_bytes{pid="7937"} 5.3649408e+07
rr_jobs_worker_memory_bytes{pid="7938"} 5.400576e+07
rr_jobs_worker_memory_bytes{pid="7939"} 5.392384e+07
rr_jobs_worker_memory_bytes{pid="7940"} 5.4308864e+07
rr_jobs_worker_memory_bytes{pid="7941"} 5.3977088e+07
rr_jobs_worker_memory_bytes{pid="7942"} 5.5291904e+07
rr_jobs_worker_memory_bytes{pid="7943"} 5.3960704e+07
rr_jobs_worker_memory_bytes{pid="7944"} 5.5246848e+07
rr_jobs_worker_memory_bytes{pid="7945"} 5.3895168e+07
rr_jobs_worker_memory_bytes{pid="7946"} 5.4161408e+07
rr_jobs_worker_memory_bytes{pid="7947"} 5.4038528e+07
rr_jobs_worker_memory_bytes{pid="7948"} 5.36576e+07
rr_jobs_worker_memory_bytes{pid="7949"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="7950"} 5.419008e+07
rr_jobs_worker_memory_bytes{pid="7951"} 5.3891072e+07
rr_jobs_worker_memory_bytes{pid="7952"} 5.3874688e+07
rr_jobs_worker_memory_bytes{pid="7953"} 5.3620736e+07
rr_jobs_worker_memory_bytes{pid="7954"} 5.5402496e+07
rr_jobs_worker_memory_bytes{pid="7955"} 5.4259712e+07
rr_jobs_worker_memory_bytes{pid="7956"} 5.4214656e+07
rr_jobs_worker_memory_bytes{pid="7957"} 5.4116352e+07
rr_jobs_worker_memory_bytes{pid="7958"} 5.3723136e+07
rr_jobs_worker_memory_bytes{pid="7959"} 5.380096e+07
rr_jobs_worker_memory_bytes{pid="7960"} 5.3891072e+07
rr_jobs_worker_memory_bytes{pid="7961"} 5.4247424e+07
rr_jobs_worker_memory_bytes{pid="7962"} 5.5447552e+07
rr_jobs_worker_memory_bytes{pid="7963"} 5.4050816e+07
rr_jobs_worker_memory_bytes{pid="7964"} 5.3686272e+07
rr_jobs_worker_memory_bytes{pid="7965"} 5.4042624e+07
rr_jobs_worker_memory_bytes{pid="7966"} 5.3981184e+07
rr_jobs_worker_memory_bytes{pid="7967"} 5.4104064e+07
rr_jobs_worker_memory_bytes{pid="7968"} 5.3604352e+07
rr_jobs_worker_memory_bytes{pid="7969"} 5.3706752e+07
rr_jobs_worker_memory_bytes{pid="7970"} 5.384192e+07
rr_jobs_worker_memory_bytes{pid="7971"} 5.4140928e+07
rr_jobs_worker_memory_bytes{pid="7972"} 5.4235136e+07
rr_jobs_worker_memory_bytes{pid="7973"} 5.4095872e+07
rr_jobs_worker_memory_bytes{pid="7974"} 5.410816e+07
rr_jobs_worker_memory_bytes{pid="7975"} 5.388288e+07
rr_jobs_worker_memory_bytes{pid="7976"} 5.523456e+07
rr_jobs_worker_memory_bytes{pid="7977"} 5.3809152e+07
rr_jobs_worker_memory_bytes{pid="7978"} 5.3997568e+07
rr_jobs_worker_memory_bytes{pid="7979"} 5.3948416e+07
rr_jobs_worker_memory_bytes{pid="7980"} 5.4112256e+07
rr_jobs_worker_memory_bytes{pid="7981"} 5.5431168e+07
rr_jobs_worker_memory_bytes{pid="7982"} 5.41696e+07
rr_jobs_worker_memory_bytes{pid="7983"} 5.4247424e+07
rr_jobs_worker_memory_bytes{pid="7984"} 5.3551104e+07
rr_jobs_worker_memory_bytes{pid="7985"} 5.4779904e+07
rr_jobs_worker_memory_bytes{pid="7986"} 5.3612544e+07
rr_jobs_worker_memory_bytes{pid="7987"} 5.3813248e+07
rr_jobs_worker_memory_bytes{pid="7988"} 5.4267904e+07
rr_jobs_worker_memory_bytes{pid="7989"} 5.4116352e+07
rr_jobs_worker_memory_bytes{pid="7990"} 5.392384e+07
rr_jobs_worker_memory_bytes{pid="7991"} 5.4083584e+07
rr_jobs_worker_memory_bytes{pid="7992"} 5.4095872e+07
rr_jobs_worker_memory_bytes{pid="7993"} 5.3936128e+07
rr_jobs_worker_memory_bytes{pid="7994"} 5.3837824e+07
rr_jobs_worker_memory_bytes{pid="7995"} 5.410816e+07
rr_jobs_worker_memory_bytes{pid="7996"} 5.3940224e+07
rr_jobs_worker_memory_bytes{pid="7997"} 5.4071296e+07
rr_jobs_worker_memory_bytes{pid="7998"} 5.5341056e+07
rr_jobs_worker_memory_bytes{pid="7999"} 5.4206464e+07
rr_jobs_worker_memory_bytes{pid="8000"} 5.3723136e+07
rr_jobs_worker_memory_bytes{pid="8001"} 5.412864e+07
rr_jobs_worker_memory_bytes{pid="8002"} 5.4136832e+07
rr_jobs_worker_memory_bytes{pid="8003"} 5.5386112e+07
rr_jobs_worker_memory_bytes{pid="8004"} 5.4145024e+07
rr_jobs_worker_memory_bytes{pid="8005"} 5.3809152e+07
rr_jobs_worker_memory_bytes{pid="8006"} 5.4157312e+07
rr_jobs_worker_memory_bytes{pid="8007"} 5.392384e+07
rr_jobs_worker_memory_bytes{pid="8008"} 5.4054912e+07
rr_jobs_worker_memory_bytes{pid="8009"} 5.4099968e+07
rr_jobs_worker_memory_bytes{pid="8010"} 5.4132736e+07
rr_jobs_worker_memory_bytes{pid="8011"} 5.3993472e+07
rr_jobs_worker_memory_bytes{pid="8012"} 5.4104064e+07
rr_jobs_worker_memory_bytes{pid="8013"} 5.4132736e+07
rr_jobs_worker_memory_bytes{pid="8014"} 5.4296576e+07
rr_jobs_worker_memory_bytes{pid="8015"} 5.3870592e+07
rr_jobs_worker_memory_bytes{pid="8016"} 5.4329344e+07
rr_jobs_worker_memory_bytes{pid="8017"} 5.359616e+07
rr_jobs_worker_memory_bytes{pid="8018"} 5.3747712e+07
rr_jobs_worker_memory_bytes{pid="8019"} 5.4198272e+07
rr_jobs_worker_memory_bytes{pid="8020"} 5.4013952e+07
rr_jobs_worker_memory_bytes{pid="8021"} 5.4075392e+07
rr_jobs_worker_memory_bytes{pid="8022"} 5.3600256e+07
rr_jobs_worker_memory_bytes{pid="8023"} 5.4181888e+07
rr_jobs_worker_memory_bytes{pid="8024"} 5.4177792e+07
rr_jobs_worker_memory_bytes{pid="8025"} 5.5345152e+07
rr_jobs_worker_memory_bytes{pid="8026"} 5.3948416e+07
rr_jobs_worker_memory_bytes{pid="8027"} 5.4034432e+07
rr_jobs_worker_memory_bytes{pid="8028"} 5.3948416e+07
rr_jobs_worker_memory_bytes{pid="8029"} 5.400576e+07
rr_jobs_worker_memory_bytes{pid="8030"} 5.3972992e+07
rr_jobs_worker_memory_bytes{pid="8031"} 5.4030336e+07
rr_jobs_worker_memory_bytes{pid="8032"} 5.4116352e+07
rr_jobs_worker_memory_bytes{pid="8033"} 5.4075392e+07
rr_jobs_worker_memory_bytes{pid="8034"} 5.4255616e+07
rr_jobs_worker_memory_bytes{pid="8035"} 5.3952512e+07
rr_jobs_worker_memory_bytes{pid="8036"} 5.3682176e+07
# HELP rr_jobs_worker_state Worker current state
# TYPE rr_jobs_worker_state gauge
rr_jobs_worker_state{pid="7770",state="ready"} 0
rr_jobs_worker_state{pid="7782",state="ready"} 0
rr_jobs_worker_state{pid="7783",state="ready"} 0
rr_jobs_worker_state{pid="7784",state="ready"} 0
rr_jobs_worker_state{pid="7785",state="ready"} 0
rr_jobs_worker_state{pid="7786",state="ready"} 0
rr_jobs_worker_state{pid="7787",state="ready"} 0
rr_jobs_worker_state{pid="7788",state="ready"} 0
rr_jobs_worker_state{pid="7789",state="ready"} 0
rr_jobs_worker_state{pid="7790",state="ready"} 0
rr_jobs_worker_state{pid="7791",state="ready"} 0
rr_jobs_worker_state{pid="7792",state="ready"} 0
rr_jobs_worker_state{pid="7793",state="ready"} 0
rr_jobs_worker_state{pid="7794",state="ready"} 0
rr_jobs_worker_state{pid="7795",state="ready"} 0
rr_jobs_worker_state{pid="7796",state="ready"} 0
rr_jobs_worker_state{pid="7797",state="ready"} 0
rr_jobs_worker_state{pid="7798",state="ready"} 0
rr_jobs_worker_state{pid="7799",state="ready"} 0
rr_jobs_worker_state{pid="7800",state="ready"} 0
rr_jobs_worker_state{pid="7801",state="ready"} 0
rr_jobs_worker_state{pid="7802",state="ready"} 0
rr_jobs_worker_state{pid="7803",state="ready"} 0
rr_jobs_worker_state{pid="7804",state="ready"} 0
rr_jobs_worker_state{pid="7805",state="ready"} 0
rr_jobs_worker_state{pid="7806",state="ready"} 0
rr_jobs_worker_state{pid="7807",state="ready"} 0
rr_jobs_worker_state{pid="7808",state="ready"} 0
rr_jobs_worker_state{pid="7809",state="ready"} 0
rr_jobs_worker_state{pid="7810",state="ready"} 0
rr_jobs_worker_state{pid="7811",state="ready"} 0
rr_jobs_worker_state{pid="7812",state="ready"} 0
rr_jobs_worker_state{pid="7813",state="ready"} 0
rr_jobs_worker_state{pid="7814",state="ready"} 0
rr_jobs_worker_state{pid="7815",state="ready"} 0
rr_jobs_worker_state{pid="7816",state="ready"} 0
rr_jobs_worker_state{pid="7817",state="ready"} 0
rr_jobs_worker_state{pid="7818",state="ready"} 0
rr_jobs_worker_state{pid="7819",state="ready"} 0
rr_jobs_worker_state{pid="7820",state="ready"} 0
rr_jobs_worker_state{pid="7821",state="ready"} 0
rr_jobs_worker_state{pid="7822",state="ready"} 0
rr_jobs_worker_state{pid="7823",state="ready"} 0
rr_jobs_worker_state{pid="7824",state="ready"} 0
rr_jobs_worker_state{pid="7825",state="ready"} 0
rr_jobs_worker_state{pid="7826",state="ready"} 0
rr_jobs_worker_state{pid="7827",state="ready"} 0
rr_jobs_worker_state{pid="7828",state="ready"} 0
rr_jobs_worker_state{pid="7829",state="ready"} 0
rr_jobs_worker_state{pid="7830",state="ready"} 0
rr_jobs_worker_state{pid="7831",state="ready"} 0
rr_jobs_worker_state{pid="7832",state="ready"} 0
rr_jobs_worker_state{pid="7833",state="ready"} 0
rr_jobs_worker_state{pid="7834",state="ready"} 0
rr_jobs_worker_state{pid="7835",state="ready"} 0
rr_jobs_worker_state{pid="7836",state="ready"} 0
rr_jobs_worker_state{pid="7837",state="ready"} 0
rr_jobs_worker_state{pid="7838",state="ready"} 0
rr_jobs_worker_state{pid="7839",state="ready"} 0
rr_jobs_worker_state{pid="7840",state="ready"} 0
rr_jobs_worker_state{pid="7841",state="ready"} 0
rr_jobs_worker_state{pid="7842",state="ready"} 0
rr_jobs_worker_state{pid="7843",state="ready"} 0
rr_jobs_worker_state{pid="7844",state="ready"} 0
rr_jobs_worker_state{pid="7845",state="ready"} 0
rr_jobs_worker_state{pid="7846",state="working"} 0
rr_jobs_worker_state{pid="7847",state="ready"} 0
rr_jobs_worker_state{pid="7848",state="ready"} 0
rr_jobs_worker_state{pid="7849",state="ready"} 0
rr_jobs_worker_state{pid="7850",state="ready"} 0
rr_jobs_worker_state{pid="7851",state="ready"} 0
rr_jobs_worker_state{pid="7852",state="ready"} 0
rr_jobs_worker_state{pid="7853",state="ready"} 0
rr_jobs_worker_state{pid="7854",state="ready"} 0
rr_jobs_worker_state{pid="7855",state="ready"} 0
rr_jobs_worker_state{pid="7856",state="ready"} 0
rr_jobs_worker_state{pid="7857",state="ready"} 0
rr_jobs_worker_state{pid="7858",state="ready"} 0
rr_jobs_worker_state{pid="7859",state="ready"} 0
rr_jobs_worker_state{pid="7860",state="ready"} 0
rr_jobs_worker_state{pid="7861",state="ready"} 0
rr_jobs_worker_state{pid="7862",state="ready"} 0
rr_jobs_worker_state{pid="7863",state="ready"} 0
rr_jobs_worker_state{pid="7864",state="ready"} 0
rr_jobs_worker_state{pid="7865",state="ready"} 0
rr_jobs_worker_state{pid="7866",state="ready"} 0
rr_jobs_worker_state{pid="7867",state="ready"} 0
rr_jobs_worker_state{pid="7868",state="ready"} 0
rr_jobs_worker_state{pid="7869",state="ready"} 0
rr_jobs_worker_state{pid="7870",state="ready"} 0
rr_jobs_worker_state{pid="7871",state="ready"} 0
rr_jobs_worker_state{pid="7872",state="ready"} 0
rr_jobs_worker_state{pid="7873",state="ready"} 0
rr_jobs_worker_state{pid="7874",state="ready"} 0
rr_jobs_worker_state{pid="7875",state="ready"} 0
rr_jobs_worker_state{pid="7876",state="ready"} 0
rr_jobs_worker_state{pid="7877",state="ready"} 0
rr_jobs_worker_state{pid="7878",state="ready"} 0
rr_jobs_worker_state{pid="7879",state="ready"} 0
rr_jobs_worker_state{pid="7880",state="ready"} 0
rr_jobs_worker_state{pid="7881",state="ready"} 0
rr_jobs_worker_state{pid="7882",state="ready"} 0
rr_jobs_worker_state{pid="7883",state="ready"} 0
rr_jobs_worker_state{pid="7884",state="ready"} 0
rr_jobs_worker_state{pid="7885",state="ready"} 0
rr_jobs_worker_state{pid="7886",state="ready"} 0
rr_jobs_worker_state{pid="7887",state="ready"} 0
rr_jobs_worker_state{pid="7888",state="ready"} 0
rr_jobs_worker_state{pid="7889",state="ready"} 0
rr_jobs_worker_state{pid="7890",state="ready"} 0
rr_jobs_worker_state{pid="7891",state="ready"} 0
rr_jobs_worker_state{pid="7892",state="ready"} 0
rr_jobs_worker_state{pid="7893",state="ready"} 0
rr_jobs_worker_state{pid="7894",state="ready"} 0
rr_jobs_worker_state{pid="7895",state="ready"} 0
rr_jobs_worker_state{pid="7896",state="ready"} 0
rr_jobs_worker_state{pid="7897",state="ready"} 0
rr_jobs_worker_state{pid="7898",state="ready"} 0
rr_jobs_worker_state{pid="7899",state="ready"} 0
rr_jobs_worker_state{pid="7900",state="ready"} 0
rr_jobs_worker_state{pid="7901",state="ready"} 0
rr_jobs_worker_state{pid="7902",state="ready"} 0
rr_jobs_worker_state{pid="7903",state="ready"} 0
rr_jobs_worker_state{pid="7904",state="ready"} 0
rr_jobs_worker_state{pid="7905",state="ready"} 0
rr_jobs_worker_state{pid="7906",state="ready"} 0
rr_jobs_worker_state{pid="7907",state="ready"} 0
rr_jobs_worker_state{pid="7908",state="ready"} 0
rr_jobs_worker_state{pid="7909",state="ready"} 0
rr_jobs_worker_state{pid="7910",state="ready"} 0
rr_jobs_worker_state{pid="7911",state="ready"} 0
rr_jobs_worker_state{pid="7912",state="ready"} 0
rr_jobs_worker_state{pid="7913",state="ready"} 0
rr_jobs_worker_state{pid="7914",state="ready"} 0
rr_jobs_worker_state{pid="7915",state="ready"} 0
rr_jobs_worker_state{pid="7916",state="ready"} 0
rr_jobs_worker_state{pid="7917",state="ready"} 0
rr_jobs_worker_state{pid="7918",state="ready"} 0
rr_jobs_worker_state{pid="7919",state="ready"} 0
rr_jobs_worker_state{pid="7920",state="ready"} 0
rr_jobs_worker_state{pid="7921",state="ready"} 0
rr_jobs_worker_state{pid="7922",state="ready"} 0
rr_jobs_worker_state{pid="7923",state="ready"} 0
rr_jobs_worker_state{pid="7924",state="ready"} 0
rr_jobs_worker_state{pid="7925",state="ready"} 0
rr_jobs_worker_state{pid="7926",state="ready"} 0
rr_jobs_worker_state{pid="7927",state="ready"} 0
rr_jobs_worker_state{pid="7928",state="ready"} 0
rr_jobs_worker_state{pid="7929",state="ready"} 0
rr_jobs_worker_state{pid="7930",state="ready"} 0
rr_jobs_worker_state{pid="7931",state="ready"} 0
rr_jobs_worker_state{pid="7932",state="ready"} 0
rr_jobs_worker_state{pid="7933",state="ready"} 0
rr_jobs_worker_state{pid="7934",state="ready"} 0
rr_jobs_worker_state{pid="7935",state="ready"} 0
rr_jobs_worker_state{pid="7936",state="ready"} 0
rr_jobs_worker_state{pid="7937",state="ready"} 0
rr_jobs_worker_state{pid="7938",state="ready"} 0
rr_jobs_worker_state{pid="7939",state="ready"} 0
rr_jobs_worker_state{pid="7940",state="ready"} 0
rr_jobs_worker_state{pid="7941",state="ready"} 0
rr_jobs_worker_state{pid="7942",state="ready"} 0
rr_jobs_worker_state{pid="7943",state="ready"} 0
rr_jobs_worker_state{pid="7944",state="ready"} 0
rr_jobs_worker_state{pid="7945",state="ready"} 0
rr_jobs_worker_state{pid="7946",state="ready"} 0
rr_jobs_worker_state{pid="7947",state="ready"} 0
rr_jobs_worker_state{pid="7948",state="ready"} 0
rr_jobs_worker_state{pid="7949",state="ready"} 0
rr_jobs_worker_state{pid="7950",state="ready"} 0
rr_jobs_worker_state{pid="7951",state="ready"} 0
rr_jobs_worker_state{pid="7952",state="ready"} 0
rr_jobs_worker_state{pid="7953",state="ready"} 0
rr_jobs_worker_state{pid="7954",state="ready"} 0
rr_jobs_worker_state{pid="7955",state="ready"} 0
rr_jobs_worker_state{pid="7956",state="ready"} 0
rr_jobs_worker_state{pid="7957",state="ready"} 0
rr_jobs_worker_state{pid="7958",state="ready"} 0
rr_jobs_worker_state{pid="7959",state="ready"} 0
rr_jobs_worker_state{pid="7960",state="ready"} 0
rr_jobs_worker_state{pid="7961",state="ready"} 0
rr_jobs_worker_state{pid="7962",state="ready"} 0
rr_jobs_worker_state{pid="7963",state="ready"} 0
rr_jobs_worker_state{pid="7964",state="ready"} 0
rr_jobs_worker_state{pid="7965",state="ready"} 0
rr_jobs_worker_state{pid="7966",state="ready"} 0
rr_jobs_worker_state{pid="7967",state="ready"} 0
rr_jobs_worker_state{pid="7968",state="ready"} 0
rr_jobs_worker_state{pid="7969",state="ready"} 0
rr_jobs_worker_state{pid="7970",state="ready"} 0
rr_jobs_worker_state{pid="7971",state="ready"} 0
rr_jobs_worker_state{pid="7972",state="ready"} 0
rr_jobs_worker_state{pid="7973",state="ready"} 0
rr_jobs_worker_state{pid="7974",state="ready"} 0
rr_jobs_worker_state{pid="7975",state="ready"} 0
rr_jobs_worker_state{pid="7976",state="ready"} 0
rr_jobs_worker_state{pid="7977",state="ready"} 0
rr_jobs_worker_state{pid="7978",state="ready"} 0
rr_jobs_worker_state{pid="7979",state="ready"} 0
rr_jobs_worker_state{pid="7980",state="ready"} 0
rr_jobs_worker_state{pid="7981",state="ready"} 0
rr_jobs_worker_state{pid="7982",state="ready"} 0
rr_jobs_worker_state{pid="7983",state="ready"} 0
rr_jobs_worker_state{pid="7984",state="ready"} 0
rr_jobs_worker_state{pid="7985",state="working"} 0
rr_jobs_worker_state{pid="7986",state="ready"} 0
rr_jobs_worker_state{pid="7987",state="ready"} 0
rr_jobs_worker_state{pid="7988",state="ready"} 0
rr_jobs_worker_state{pid="7989",state="ready"} 0
rr_jobs_worker_state{pid="7990",state="ready"} 0
rr_jobs_worker_state{pid="7991",state="ready"} 0
rr_jobs_worker_state{pid="7992",state="ready"} 0
rr_jobs_worker_state{pid="7993",state="ready"} 0
rr_jobs_worker_state{pid="7994",state="ready"} 0
rr_jobs_worker_state{pid="7995",state="ready"} 0
rr_jobs_worker_state{pid="7996",state="ready"} 0
rr_jobs_worker_state{pid="7997",state="ready"} 0
rr_jobs_worker_state{pid="7998",state="ready"} 0
rr_jobs_worker_state{pid="7999",state="ready"} 0
rr_jobs_worker_state{pid="8000",state="ready"} 0
rr_jobs_worker_state{pid="8001",state="ready"} 0
rr_jobs_worker_state{pid="8002",state="ready"} 0
rr_jobs_worker_state{pid="8003",state="ready"} 0
rr_jobs_worker_state{pid="8004",state="ready"} 0
rr_jobs_worker_state{pid="8005",state="ready"} 0
rr_jobs_worker_state{pid="8006",state="ready"} 0
rr_jobs_worker_state{pid="8007",state="ready"} 0
rr_jobs_worker_state{pid="8008",state="ready"} 0
rr_jobs_worker_state{pid="8009",state="ready"} 0
rr_jobs_worker_state{pid="8010",state="ready"} 0
rr_jobs_worker_state{pid="8011",state="ready"} 0
rr_jobs_worker_state{pid="8012",state="ready"} 0
rr_jobs_worker_state{pid="8013",state="ready"} 0
rr_jobs_worker_state{pid="8014",state="ready"} 0
rr_jobs_worker_state{pid="8015",state="ready"} 0
rr_jobs_worker_state{pid="8016",state="ready"} 0
rr_jobs_worker_state{pid="8017",state="ready"} 0
rr_jobs_worker_state{pid="8018",state="ready"} 0
rr_jobs_worker_state{pid="8019",state="ready"} 0
rr_jobs_worker_state{pid="8020",state="ready"} 0
rr_jobs_worker_state{pid="8021",state="ready"} 0
rr_jobs_worker_state{pid="8022",state="ready"} 0
rr_jobs_worker_state{pid="8023",state="ready"} 0
rr_jobs_worker_state{pid="8024",state="ready"} 0
rr_jobs_worker_state{pid="8025",state="ready"} 0
rr_jobs_worker_state{pid="8026",state="ready"} 0
rr_jobs_worker_state{pid="8027",state="ready"} 0
rr_jobs_worker_state{pid="8028",state="ready"} 0
rr_jobs_worker_state{pid="8029",state="ready"} 0
rr_jobs_worker_state{pid="8030",state="ready"} 0
rr_jobs_worker_state{pid="8031",state="ready"} 0
rr_jobs_worker_state{pid="8032",state="ready"} 0
rr_jobs_worker_state{pid="8033",state="ready"} 0
rr_jobs_worker_state{pid="8034",state="ready"} 0
rr_jobs_worker_state{pid="8035",state="ready"} 0
rr_jobs_worker_state{pid="8036",state="ready"} 0
# HELP rr_jobs_workers_invalid Workers currently in invalid,killing,destroyed,errored,inactive states
# TYPE rr_jobs_workers_invalid gauge
rr_jobs_workers_invalid 0
# HELP rr_jobs_workers_memory_bytes Memory usage by workers.
# TYPE rr_jobs_workers_memory_bytes gauge
rr_jobs_workers_memory_bytes 1.3857636352e+10
# HELP rr_jobs_workers_ready Workers currently in ready state
# TYPE rr_jobs_workers_ready gauge
rr_jobs_workers_ready 254
# HELP rr_jobs_workers_working Workers currently in working state
# TYPE rr_jobs_workers_working gauge
rr_jobs_workers_working 2
@rustatian
Member

Hey @embargo2710 👋🏻
I'm not quite sure what this means. Could you please rephrase:

Next, I send messages to several queues and wait for the maximum work.

I tested this case by sending more than 256 messages, and all workers were busy.

Please attach your worker to the report.

@Kaspiman
Sponsor Author

I send a lot of tasks to the queues; RR receives 256x3 of them and stores them in the priority queue.

wait for the maximum work

I expect all 256 workers to be working. In fact, only two are working. This can be seen in the logs (the "Finished task" message appears 31 times in two and a half minutes) and in the metrics (rr_jobs_workers_working 2).

Initially, I posted a log from the product code of our framework. To keep the experiment clean, I made a very simple PHP file with no logic at all. The result is the same: the workers do not work.
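For anyone who wants to reproduce the count from the metrics output, here is a minimal Go sketch that tallies the per-worker `rr_jobs_worker_state` lines by state. The function name `countStates` and the regular expression are assumptions made for illustration; the sample lines are taken from the dump above.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// stateRe matches the per-worker lines of the metrics dump and captures the state label.
// The pattern is an assumption based on the line format shown in this issue.
var stateRe = regexp.MustCompile(`^rr_jobs_worker_state\{pid="\d+",state="([a-z]+)"\}`)

// countStates (hypothetical helper) returns how many workers are reported in each state.
func countStates(metrics string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(metrics))
	for sc.Scan() {
		if m := stateRe.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	sample := `rr_jobs_worker_state{pid="7984",state="ready"} 0
rr_jobs_worker_state{pid="7985",state="working"} 0
rr_jobs_worker_state{pid="7986",state="ready"} 0
rr_jobs_workers_working 2`
	// The aggregate gauge on the last line is ignored; only per-worker lines are counted.
	fmt.Println(countStates(sample))
}
```

Feeding the full metrics text to `countStates` gives a quick cross-check against the aggregate `rr_jobs_workers_ready` and `rr_jobs_workers_working` gauges.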

My worker:

<?php

use Spiral\RoadRunner\Jobs\Consumer;
use Spiral\RoadRunner\Jobs\Serializer\JsonSerializer;

require_once __DIR__ . '/../vendor/autoload.php';

$consumer = new Consumer(serializer: new JsonSerializer());

while (($task = $consumer->waitTask())) {
    try {
        sleep(10);

        $task->complete();
    } catch (\Throwable $e) {
        $task->fail($e, requeue: false);
    }
}

@Kaspiman
Sponsor Author

It is possible that this behavior occurs on certain computer hardware. This definitely happens on my powerful laptop, as well as on a cloud production server with "32 CPU, 128GB RAM x4".

@rustatian
Member

Oh, so the problem is that I used a u8 for the number of listeners. I didn't expect that someone would allocate 512 pollers with 256 workers...
So, this is a classic integer overflow. Could you please try to use 200 workers with 210 pollers?
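To illustrate the overflow, here is a minimal Go sketch of what happens when a configured integer is stored in an 8-bit unsigned field. The helper `wrapToUint8` is hypothetical and is not taken from the RoadRunner source; it only shows the language-level conversion.

```go
package main

import "fmt"

// wrapToUint8 (hypothetical helper) mimics storing a configured int in a uint8 field.
// Go's integer conversion keeps only the low 8 bits, which equals value mod 256
// for non-negative inputs.
func wrapToUint8(n int) uint8 {
	return uint8(n)
}

func main() {
	for _, n := range []int{210, 255, 256, 512} {
		fmt.Printf("configured=%d stored=%d\n", n, wrapToUint8(n))
	}
}
```

Values up to 255 are stored unchanged, while 256 and 512 both wrap to 0.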

@Kaspiman
Sponsor Author

Works well.

Could you please try to use 200 workers with 210 pollers?

The maximum working values are: num_pollers: 512 and num_workers: 252

@rustatian
Member

For num_workers you can use basically any int64 value.
For num_pollers, 512 overflows the u8 type; the maximum allowed value is 255. I'll fix that, and the fix will be released this Thursday (2023.1.4).
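One way to surface this limit early is to check the configured value before it is narrowed to 8 bits. The sketch below is only an illustration of that idea; `validatePollers` is a hypothetical function and does not describe how the actual fix was implemented.

```go
package main

import (
	"fmt"
	"math"
)

// validatePollers is a hypothetical guard, not RoadRunner's actual fix.
// It rejects values that cannot be represented in a uint8 (0..255).
func validatePollers(n int) error {
	if n < 0 || n > math.MaxUint8 {
		return fmt.Errorf("num_pollers=%d is outside the range 0..%d", n, math.MaxUint8)
	}
	return nil
}

func main() {
	for _, n := range []int{210, 255, 512} {
		if err := validatePollers(n); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", n)
	}
}
```

Widening the field to a larger integer type would be an alternative to rejecting the value; which approach the release takes is not stated here.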

@Kaspiman
Sponsor Author

Kaspiman commented May 20, 2023

I just checked num_pollers: 512 and num_workers: 252 and it works; my metric shows rr_jobs_workers_working 252.

Something breaks when num_workers is greater than or equal to 253.

@rustatian
Member

Any value above 255 overflows the u8 and wraps around; 512 becomes 0.
The magic happens around the number 253 because RR uses the number of workers + 3 if num_pollers is equal to 0 (overflow -> 0). 253 + 3 = 256 -> overflows again = magic 😃
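The arithmetic above can be modeled in a few lines of Go. This is a sketch under assumptions: `effectivePollers` is a hypothetical function that follows the description in this comment (the configured value is stored in a uint8, and a stored 0 falls back to workers + 3, which is also stored in a uint8); it is not the plugin's actual code.

```go
package main

import "fmt"

// effectivePollers is a hypothetical model of the behavior described above,
// not RoadRunner's actual implementation.
func effectivePollers(numPollers, numWorkers int) uint8 {
	p := uint8(numPollers) // 512 wraps to 0
	if p == 0 {
		// Fallback described in the comment: workers + 3, again narrowed to 8 bits.
		p = uint8(numWorkers + 3)
	}
	return p
}

func main() {
	cases := []struct{ pollers, workers int }{
		{210, 200},
		{512, 252},
		{512, 253},
	}
	for _, c := range cases {
		fmt.Printf("num_pollers=%d num_workers=%d -> effective pollers=%d\n",
			c.pollers, c.workers, effectivePollers(c.pollers, c.workers))
	}
}
```

Under this model, 512 pollers with 252 workers yields 255 effective pollers, while 253 workers yields 253 + 3 = 256, which wraps to 0. That matches the threshold reported in the thread.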

@rustatian
Member

Thanks @embargo2710 👍🏻 Fixed 🎉
