OOM killed when using JetStream WorkQueue [v2.10.18] #5739
Comments
Perhaps related to #5673.
How much memory are you giving each NATS node?
I asked specifically because sometimes the reporting doesn't always match cgroup restrictions. Do you OOM as quickly if you set the […]? If you can capture some memory profiles while the memory usage is up, but before the OOM, that would be useful, either […]
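For reference, here is a minimal sketch of one way to grab a heap profile before the OOM hits. It assumes the server was started with profiling enabled (for example `prof_port: 65432` in the server config); the port and endpoint here are assumptions, not something stated in this issue.

```go
// Minimal sketch: fetch a heap profile from a nats-server that has its
// pprof/profiling port enabled (assumed to be 65432 here) and save it to disk.
package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	// Grab a heap profile while memory usage is high but before the OOM kill.
	resp, err := http.Get("http://localhost:65432/debug/pprof/heap")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := os.Create("heap.pprof")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	// Save the profile so it can be inspected later, e.g. with `go tool pprof heap.pprof`.
	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
}
```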
What are you giving the process from your OS perspective?
I'm running the cluster on K8s, with each JS server as a pod given a memory limit of […]
Huge payload messages will consume more RAM. In my experience I wouldn't run NATS with less than 3GB of memory when using JetStream, and your big messages will make matters worse.
But on the flip side, how would we estimate how much memory to allocate in such a scenario? Is there some guideline?
It varies a lot by use case, message rates, etc., and changes from version to version; usually the needs go down. A workload that today uses about 1GB of memory for me used to use 6GB some time ago. You should set up monitoring and use […]
@ripienaar I agree with trying to simulate the scenarios, but under high load this memory was spiking up a lot. For high workloads giving unbounded memory isn't possible; I want to understand where and why the server would be using the excess memory, and thus how to decide the upper bound in that case.
A good practice when limiting memory via cgroups and containers is to set the env var GOMEMLIMIT to ~75% of the actual limit. Sometimes the Go runtime only sees the host's memory and doesn't feel pressure to GC and clean things up, but the container and the Linux kernel will OOM it.
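As a concrete illustration (not from this thread): in Kubernetes you would normally export GOMEMLIMIT in the pod spec, but the same soft limit can also be set from inside a Go process. A minimal sketch, assuming a 4GiB container limit purely as an example value:

```go
// Minimal sketch of the "~75% of the container limit" rule, expressed as the
// in-process equivalent of exporting GOMEMLIMIT. The 4GiB figure is assumed.
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	containerLimit := int64(4 << 30) // assumed 4GiB cgroup limit on the pod

	// Give the GC headroom: target ~75% of the hard limit so the runtime
	// starts collecting before the kernel OOM killer steps in.
	soft := containerLimit * 75 / 100
	debug.SetMemoryLimit(soft)

	fmt.Printf("soft memory limit set to %d bytes\n", soft)
}
```

For nats-server itself the environment variable is the right mechanism; the snippet only shows what the limit does from the Go runtime's point of view.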
That could help in terms of the Go GC, but in terms of NATS, what would a good upper limit be in the first place? How should we calculate that?
For any program written in Go that is run in a container, it's best practice these days to set GOMEMLIMIT. We do plan on introducing an upper bound on how much buffer memory all filestore-backed streams in a server can use. This will probably come in 2.12.
I agree on the GOMEMLIMIT; will definitely go ahead and change that.
Memory usage is quite dynamic: for core NATS it depends on connections and the number of subscriptions, and for JetStream on the number of streams and consumers, access patterns, etc. The best way is to model the upper bounds of what you expect on a larger system and monitor RSS peaks.
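A minimal sketch of the "monitor RSS peaks" suggestion, assuming the server's HTTP monitoring port is enabled (e.g. `http: 8222` in the config) and polling the `mem` field of the `/varz` endpoint; the hostname and interval are placeholders:

```go
// Minimal sketch: poll the NATS monitoring endpoint and log process memory,
// so peaks can be observed over the course of a load test.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

type varz struct {
	Mem int64 `json:"mem"` // process memory in bytes, as reported by /varz
}

func main() {
	for {
		resp, err := http.Get("http://localhost:8222/varz")
		if err != nil {
			fmt.Println("varz fetch failed:", err)
			time.Sleep(10 * time.Second)
			continue
		}

		var v varz
		if err := json.NewDecoder(resp.Body).Decode(&v); err == nil {
			fmt.Printf("server memory: %d MiB\n", v.Mem/(1<<20))
		}
		resp.Body.Close()

		time.Sleep(10 * time.Second)
	}
}
```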
On a recent exploration with […]
In this case I have two streams with a size of 6.2GiB running in filestore mode, but the memory usage on the servers was considerably high; more than it would take to hold both streams completely in memory, even though we use filestore.
Here are the memory profiles for the same:
This looks like a build-up of Raft append entries. How are you publishing messages? Are you using core NATS publishes or JetStream publishes?
@neilalexander
https://github.com/nats-io/nats.go/blob/main/js.go#L45
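For context, a minimal sketch of the distinction being asked about: a core NATS publish is fire-and-forget, while a JetStream publish waits for the stream's ack and so applies backpressure. The subject and URL are placeholders, not taken from the issue.

```go
// Minimal sketch contrasting core NATS publishes with JetStream publishes.
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	payload := make([]byte, 5<<20) // ~5MB payload, as in the reported test

	// Core publish: no ack, the publisher never slows down, so the server can
	// accumulate work (e.g. Raft append entries) faster than it can persist it.
	if err := nc.Publish("orders.new", payload); err != nil {
		log.Fatal(err)
	}

	// JetStream publish: blocks until the stream acknowledges the message,
	// which naturally limits how much data is in flight at once.
	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := js.Publish("orders.new", payload); err != nil {
		log.Fatal(err)
	}
}
```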
I see a buildup during writeMsgRecord as well.
Observed behavior
Running 2 streams on a JetStream cluster with 3 nodes.
The server config is as follows:
Stream configs are as follows:
I am running a test that publishes messages with a payload size of 5MB to the streams, but I saw that the servers got terminated due to being OOM killed.
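For illustration, a minimal sketch of such a publisher: roughly 5MB payloads at 50 msgs/sec via synchronous JetStream publishes. The subject, URL, and connection options are assumptions, not the actual test code.

```go
// Minimal sketch of the described load test: ~5MB messages at 50 msgs/sec.
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	payload := make([]byte, 5<<20) // ~5MB message body

	// 50 messages per second => one publish every 20ms.
	ticker := time.NewTicker(20 * time.Millisecond)
	defer ticker.Stop()

	for range ticker.C {
		// Synchronous JetStream publish; the ack keeps the publisher from
		// outrunning the stream's ability to persist and replicate.
		if _, err := js.Publish("stream.one.msgs", payload); err != nil {
			log.Println("publish failed:", err)
		}
	}
}
```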
Expected behavior
I am running a test with a payload size of 5MB at 50 msgs/sec.
There are 2 streams with a replication factor of 3.
I see this OOM-killed error on 2 servers. I want to ask why I would be seeing this error when I'm using filestore as the storage mode. Will the streams be pulled into memory at runtime?
Is this high memory requirement expected? If yes, what would be a good way to determine the max resource allocation for the JetStream servers?
Server and client version
Version: 2.10.18
Host environment
No response
Steps to reproduce
No response