High RAM Consumption when activating the WAF on multiple sites #76

skixmix · 2023-06-20T16:15:28Z

Hello,

I'm encountering a problem with this module, specifically, the RAM usage keeps rising consistently whenever I incorporate a new Caddyfile configuration (or when I activate the WAF on an existing config).

My main configuration looks like this one:

{ Caddy global options... }
import *.caddy

In each Caddyfile, I manage a distinct site. I have approximately 200 Caddyfiles in total. Prior to installing the module, the loaded configuration utilized merely 1 GB of memory. However, once the module is installed and the WAF directive is enabled like this:

        coraza_waf {
			include /waf/coraza/coraza.conf-recommended
			include /waf/coreruleset/crs-setup.conf.example
			include /waf/coreruleset/rules/*.conf
			include /var/sites_waf/specific_site_configuration.conf
        }

I observe an increment of nearly 200 MB of RAM per site. Upon restarting the Caddy server, the RAM usage amplifies further and appears to remain unreleased. This does not occur when I deactivate the module and the WAF directive.

Do you have any idea on what is happening?

Thank you,
Simone


jptosso · 2023-06-20T16:26:30Z

Hey! Could you provide other metrics, like concurrent users and traffic information? Each WAF instance is just a few KBs of memory, it shouldn't create that much overhead.

It could be related to the garbage collector.

skixmix · 2023-06-20T16:47:07Z

Hello,

Thank you. For now I'm testing it on a VM running Ubuntu 20.04.6 LTS that is not exposed to the internet, so there are no clients and no traffic. On the other VMs in production, I have Caddy without the WAF module installed and the same configurations occupy roughly 1.3 GB of RAM. Whenever I perform a Caddy reload or restart, it takes up some additional memory, goes up to 2 GB, and then releases it back to 1.3 GB.

This behaviour changes in the test VM with the coraza module installed, and what happens is what I described before.

jcchavezs · 2023-06-20T20:07:51Z

I've been checking this issue but could not get my head around it.

Locally

When I build the binary and run it locally with the example Caddyfile:

# run httpbin in terminal 1
go run github.com/mccutchen/go-httpbin/v2/cmd/[email protected] -port 8081

# run caddy in terminal 2, make sure you change the host in caddyfile from httpbin to localhost
mage buildCaddy
./build/caddy run --config example/Caddyfile --adapter caddyfile

# run ab in terminal 3
ab -n 100000 -c 100 http://localhost:8080/

I get a caddy server going from 40MB to 93MB (checked with Activity Monitor)

Docker

If I run the same ab command against the example running in a container, memory can climb from 40MB to 5GB and then slowly drop to 2.87GB (checked with docker stats). In fact, every curl localhost:8080/ increases memory usage by 1.1MB. Both tests are similar; however, the local one uses the macOS build and the Docker one uses the Linux build (GOOS=linux).

# run example
go run mage.go buildExample runExample

# run ab
ab -n 100000 -c 100 http://localhost:8080/

Any clue on why this could happen cc @anuraaga @mholt?

mholt · 2023-06-20T23:14:38Z

To know for sure, you'll need to capture a memory profile.

You can get one from :2019/debug/pprof on your server and view the allocations.

You can use go tool pprof to view them interactively, though sometimes it can be obvious just looking at the raw dump. I like generating an SVG to see what call stacks are allocating the most memory.

Here's an example tutorial (skip the top part that talks about adding the pprof handlers, Caddy already does this for you, hence the pprof endpoint you can load): https://www.freecodecamp.org/news/how-i-investigated-memory-leaks-in-go-using-pprof-on-a-large-codebase-4bec4325e192/

jcchavezs · 2023-06-20T23:17:45Z

Thanks @mholt, yeah I was doing profiling, just curious about the difference between local and docker in case something came from the top of your head.


mholt · 2023-06-20T23:22:21Z

Not really a fair comparison since the two are completely different platforms/OSes. Will need a profile to be sure.

skixmix · 2023-06-21T08:45:01Z

Hello,

Thank you for your support. If it might be useful, what I'm observing is that each time I execute a reboot of the machine or a systemctl restart of the Caddy service, the RAM usage reverts to approximately 1 GB. After this operation, even if I execute multiple 'curl' requests on various sites, it remains stable and the usage does not increase by 1MB as it does in your situation. Additionally, as mentioned previously, I'm observing a rise in RAM utilization whenever I add a new site (or enable the WAF on a site where the directive was previously absent) and subsequently execute the rebuild/reload of Caddy configurations.

Caddy version: v2.6.4 h1:2hwYqiRwk1tf3VruhMpLcYTg+11fCdr8S3jhNAdnPy8=
Coraza Caddy version: 1.2.2

The VM runs on VMware, not Docker, and on the same host there are two additional VMs without the Coraza module installed. All VMs share the same Caddy configurations (excluding the WAF directive, which is absent in the production VMs), and the same system:

OS: Ubuntu 20.04
RAM: 4 GB (1.4-1.6 GB used in production, same on the test machine, but increasing as described above)
CPU: 4 vCores (0.15 - 0.20 load average on production, 0.01 on the test machine)

jcchavezs · 2023-06-21T08:46:46Z

Could you try only these directives and check the memory?

```
coraza_waf {
	include /waf/coraza/coraza.conf-recommended
	include /waf/coreruleset/crs-setup.conf.example
	include /waf/coreruleset/rules/*.conf
}
```


skixmix · 2023-06-21T08:53:41Z

You mean without the site-specific configuration?

Currently, I have this configuration on every site (I'm not truly customizing it, it was simply to have something where I could place a custom configuration if necessary):

#   ==============================  WAF STATUS  ====================================

# Detection Only Mode
SecRuleEngine DetectionOnly

#   ==============================  LOGGING  =====================================

# Enable Audit Engine - use RelevantOnly to avoid logging everything
SecAuditEngine RelevantOnly
# Log file
SecAuditLogStorageDir /waf/coraza/audit/
SecAuditLogFormat JSON
SecAuditLog /waf/coraza/audit/202.log

# Log transaction
SecAuditLogParts ABIFHZ
# Use a single file for logging.
SecAuditLogType Serial

#   ======================== DISABLED OWASP RULES =============================
SecRuleRemoveByID 920430
SecRuleRemoveByID 932236
SecRuleRemoveByID 942421
SecRuleRemoveByID 920272
SecRuleRemoveByID 901340

skixmix · 2023-06-21T09:11:15Z

I tried replacing every site coraza_waf config with this one

        coraza_waf {
			include /waf/coraza/coraza.conf-recommended
			include /waf/coreruleset/crs-setup.conf.example
			include /waf/coreruleset/rules/*.conf
        }

without the site-specific configuration, and the problem persists even after reloading. The initial RAM usage was 1.64 GB, which increased to 2.33 GB. After a few minutes, it released some memory, bringing usage down to 1.88 GB. However, I'm unsure why it continues to retain that extra 200 MB of RAM.

skixmix · 2023-06-22T07:31:55Z

Hello, is there any update on this issue?

jcchavezs · 2023-06-22T08:45:08Z

Got the profiling (thanks @mholt for the heads up on the endpoint)

jcchavezs · 2023-06-22T08:54:59Z

I think one issue here is that we are initializing one WAF instance per site, meaning that regexes and dictionaries (used by aho-corasick) are initialized many times even though the content is identical and immutable. One idea to overcome this could be to use a regex map and a dictionary map; however, we need to be careful, as tinygo won't support synchronization (hence a non-concurrent map for tinygo). This way, if you spawn 100 WAFs with CRS, they all share the same regexes in memory. cc @anuraaga
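
A minimal sketch of the regex map idea described above, assuming a plain `sync.Map` keyed by the pattern string. The names `regexCache` and `compileMemoized` are hypothetical and are not taken from the Coraza code base; this only illustrates how identical patterns from many sites could share one compiled regex.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// regexCache maps a pattern string to its compiled *regexp.Regexp.
// Hypothetical name, used only for this illustration.
var regexCache sync.Map

// compileMemoized returns a shared compiled regex for pattern and compiles
// it only the first time the pattern is seen.
func compileMemoized(pattern string) (*regexp.Regexp, error) {
	if v, ok := regexCache.Load(pattern); ok {
		return v.(*regexp.Regexp), nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	// If two goroutines race, LoadOrStore keeps whichever value landed first.
	v, _ := regexCache.LoadOrStore(pattern, re)
	return v.(*regexp.Regexp), nil
}

func main() {
	// Two "sites" loading the same rule end up with the same instance.
	a, _ := compileMemoized(`(?i)union\s+select`)
	b, _ := compileMemoized(`(?i)union\s+select`)
	fmt.Println(a == b) // true
}
```

Sharing is safe here because a compiled `*regexp.Regexp` can be used concurrently by multiple goroutines. Note that entries in such a cache are never evicted, so patterns from rules that are later removed stay in memory.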

anuraaga · 2023-06-22T10:14:14Z

I think a map for caching by pattern can work ok. Since TinyGo only runs on one thread, no worries about concurrency. One issue is all regex then are memory leaks - a pattern of replacing the waf instance with a different one with different rules becomes questionable. It's not common so maybe it doesn't need to be supported, but I guess making cache opt-in with a build tag is safer

skixmix · 2023-06-26T10:51:35Z

Hello,

If I got it right, there are two problems here:

1. When using the WAF module with multiple sites, it consumes a lot of memory because of regex allocation (duplicate rules saved in separate memory blocks for each site).
2. Every time we reload the Caddy configuration, the module appears to use up some RAM and then doesn't release all of it, even if nothing changed in the config.

Concerning the first problem, I can consider allocating additional RAM to the VM as a temporary solution. However, I'm unsure how to address the second issue. In the long run, it will take up all the RAM...
Do you have any updates or suggestions?

Thank you,
Simone

jptosso · 2023-06-26T10:54:43Z

Hello and thank you for the detailed analysis. Indeed we are working on a solution, as described by JC, we will implement cache mechanisms to avoid duplication on dictionaries for pm and regexes for rx. It will take some time to design this as it is a sensitive change. We will use this issue as reference so we will keep you posted.

Multiple hosts in a single web server is an important use case for coraza.

jcchavezs · 2023-06-26T10:54:55Z

@anuraaga how about we introduce a Close method on WAF where we delete all rules (and regexes) that can be used on reload. Is there any way to trigger a reload in a plugin @mholt ?

anuraaga · 2023-06-26T10:56:11Z

Having a Close is important, SGTM

jcchavezs · 2023-06-26T10:56:54Z

Do you consider it as a breaking change @anuraaga @jptosso?

anuraaga · 2023-06-26T10:58:17Z

No, perhaps we could document it better (similar to what wazero does) but our interfaces are not for outside implementations. It's fine to keep that up and we can make it more explicit if needed.

mholt · 2023-06-28T17:04:30Z

I don't know the exact situation in the code, of course, but there are two simple rules I try to abide by:

1. Store all state in the module's struct itself; this will get garbage collected after a config reload. If it doesn't, then not "all state" is being stored in the module's struct.
2. If you do need global state, or state that persists across reloads, but is based on configuration that may or may not be needed after a reload, use a UsagePool: https://pkg.go.dev/github.com/caddyserver/caddy/v2#UsagePool -- often in conjunction with a module implementing the CleanerUpper: https://pkg.go.dev/github.com/caddyserver/caddy/v2#CleanerUpper -- this allows modules to say when they're done with values as configs are cycled through, and Caddy & Go's GC will take care of the rest.
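
To make the second rule more concrete, here is a simplified, dependency-free sketch of the reference-counting idea behind a usage pool. It is not Caddy's `UsagePool` implementation or API; `refPool`, `acquire`, and `release` are hypothetical names. In a real module, the assumption is that the acquire step would happen while the module is provisioned and the release step in its `Cleanup` method.

```go
package main

import (
	"fmt"
	"sync"
)

// entry holds a shared value and the number of configs currently using it.
type entry struct {
	value any
	refs  int
}

// refPool is a hypothetical, simplified reference-counted pool.
type refPool struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newRefPool() *refPool {
	return &refPool{entries: make(map[string]*entry)}
}

// acquire returns the value stored under key, building it with construct on
// first use, and increments its reference count.
func (p *refPool) acquire(key string, construct func() any) any {
	p.mu.Lock()
	defer p.mu.Unlock()
	e, ok := p.entries[key]
	if !ok {
		e = &entry{value: construct()}
		p.entries[key] = e
	}
	e.refs++
	return e.value
}

// release decrements the reference count for key and removes the entry once
// nobody uses it, so the garbage collector can reclaim the value.
// It reports whether the entry was removed.
func (p *refPool) release(key string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	e, ok := p.entries[key]
	if !ok {
		return false
	}
	e.refs--
	if e.refs > 0 {
		return false
	}
	delete(p.entries, key)
	return true
}

func main() {
	pool := newRefPool()
	build := func() any { return "compiled ruleset" }

	pool.acquire("crs", build) // old config starts using the value
	pool.acquire("crs", build) // new config after a reload reuses it

	fmt.Println(pool.release("crs")) // false: the new config still holds it
	fmt.Println(pool.release("crs")) // true: last user gone, entry removed
}
```

The point of the counter is that a reload briefly has two configs alive: the shared value survives while either one needs it and is dropped only when the last user is cleaned up.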

skixmix · 2023-07-04T13:44:20Z

Hello everyone,

thanks for the prompt and diligent responses. Will the upcoming 2.0 module version include a resolution for this issue? If yes, do you have any timeline in mind?

Currently, I have Caddy deployed in production for various sites, with the WAF temporarily disabled. As a workaround, I've implemented a server-side (Node.js Express) WAF. However, I'm eager to integrate Coraza on Caddy as soon as possible.

Thank you,
Simone

jcchavezs · 2023-07-04T13:57:06Z

Version 2 does indeed have to address this. I would first try a singleton for regexes before reusing a WAF, because it only takes a single change in directives to duplicate the memory used by regexes. Given that coraza-caddy has no dedicated maintainer, I would say the best timeline would be defined by the community (otherwise it goes into the queue of things that we maintainers have to work on). This specific change is easy and we can put it under a build tag before releasing it; we just need to add a singleton for when a regex is being compiled. Would you be up for such a change? Otherwise we will try to prioritize it for the next two weeks, but can't guarantee that.


skixmix · 2023-07-04T14:08:43Z

Hi,

If you could give this fix higher priority, it would be fantastic 👍
Alternatively, I would be willing to test a custom build tag. If we choose that route, will I need to compile Caddy in a specific manner?

jcchavezs · 2023-07-05T10:46:12Z

I believe you can do:

XCADDY_GO_BUILD_FLAGS="-ldflags '-w -s' -trimpath -tags my_custom_tag" xcaddy build --with github.com/corazawaf/coraza-caddy/v2

cc @mholt

mholt · 2023-07-05T16:20:13Z

Let me know how I can help :)

Currently we create and allocate memory for every regex we compile, however there are cases where you compile the same regex over and over e.g. corazawaf/coraza-caddy#76. Here we implement the memoize pattern to be able to reuse the regex and reduce the memory consumption.

jcchavezs · 2023-07-05T20:34:33Z

> Store all state in the module's struct itself; this will get garbage collected after a config reload. If it doesn't, then not "all state" is being stored in the module's struct.

Is there any callback for when a module struct is going to be dismissed?

mholt · 2023-07-05T20:53:04Z

Good question: yes there is! :)

> a module implementing the CleanerUpper: https://pkg.go.dev/github.com/caddyserver/caddy/v2#CleanerUpper -- this allows modules to say when they're done with values as configs are cycled through, and Caddy & Go's GC will take care of the rest.

jcchavezs · 2023-07-05T21:04:27Z

Awesome! Thanks @mholt

jcchavezs · 2023-07-05T21:05:17Z

@skixmix I opened this PR: corazawaf/coraza#836. Please use that commit to give it a try and we'll see. You don't need to use a build tag.

Use something like:

xcaddy build --with github.com/corazawaf/coraza-caddy/v2@24ab5d92e9d71ebde31a34e0e86c595b568e79d3

skixmix · 2023-07-06T07:28:54Z

Great! Thank you so much. I'll give it a try as soon as possible and provide you with an update.

Have a nice day,
Simone

skixmix · 2023-07-06T11:00:27Z

Hello,

I have an update. What I did was to start with a clean Caddy setup, no site configured and no traffic. Then, I added 23 sites one by one, each one with its own Caddyfile and WAF configuration + OWASP CRS.

RAM consumption before starting: 340MB
RAM consumption after 10 sites: 1.20 GB
RAM consumption after 23 sites: 2.06 GB

Throughout the process, I noticed that RAM usage peaked at around 3 GB and then gradually dropped, to 1.8 GB (with 10 sites) and to 2.8 GB (with 23 sites), before reaching a stable level.

This is what happens when I add a new site or when I perform a reload of a site configuration (in this specific case, I added a new site).

[Screenshots omitted: initial RAM consumption; during configuration reload; after configuration reload; after more or less 3 minutes; after 30 minutes.]

Production comparison

This is the current state of one of the VMs that I have running in production in one of my DCs, with more than 100 site configurations, same as the test ones, but without the WAF module:

[Screenshot omitted.]

Recap

The problem persists: each configuration uses over 100 MB of RAM instead of the 8-9 MB used in production without the WAF, which is more than 10 times higher. Nevertheless, I believe that one of the issues has been resolved, as the memory usage now decreases rather than leaking indefinitely. 😄

mholt · 2023-07-06T16:05:32Z

You can find out what's using memory by capturing a profile. Caddy exposes HTTP endpoints for this at :2019/debug/pprof -- this is a simple HTML document with links to collect various profiles, either instantaneously or over a time period if you add query params. Capture the heap profile then use go tool pprof ... to open the profile; type top to see the top allocations, or svg to generate a vector image you can view in your browser.
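
As a rough illustration of the capture step described above, the following sketch downloads a heap profile over HTTP and saves it to a file. It assumes the admin endpoint is reachable at `localhost:2019`; the helper names `heapProfileURL` and `saveHeapProfile` and the output file name `heap.pprof` are made up for this example. The `gc=1` query parameter is the standard `net/http/pprof` option that runs a garbage collection before the heap sample is taken.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

// heapProfileURL builds the heap profile URL for a pprof-enabled server.
// With gcFirst set, "gc=1" asks the handler to run a GC before sampling.
func heapProfileURL(adminAddr string, gcFirst bool) string {
	u := url.URL{Scheme: "http", Host: adminAddr, Path: "/debug/pprof/heap"}
	if gcFirst {
		u.RawQuery = "gc=1"
	}
	return u.String()
}

// saveHeapProfile downloads the profile at profileURL and writes it to path.
func saveHeapProfile(profileURL, path string) error {
	resp, err := http.Get(profileURL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// Assumed address of the admin endpoint; adjust to your setup.
	u := heapProfileURL("localhost:2019", true)
	if err := saveHeapProfile(u, "heap.pprof"); err != nil {
		fmt.Fprintln(os.Stderr, "capture failed:", err)
		return
	}
	fmt.Println("wrote heap.pprof; inspect it with: go tool pprof heap.pprof")
}
```

Capturing one profile before a reload and one after it, then comparing the top allocations in each, is one way to see which call stacks keep memory alive across reloads.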

jcchavezs · 2023-07-07T08:25:16Z

According to #76 (comment) we might need to memoize the aho-corasick dictionaries too.

* chore: adds memoize implementation for regexes.

  Currently we create and allocate memory for every regex we compile, however there are cases where you compile the same regex over and over e.g. corazawaf/coraza-caddy#76. Here we implement the memoize pattern to be able to reuse the regex and reduce the memory consumption.
* docs: adds comments to code.
* chore: simplify the memoize package by using sync.Map.
* feat: extends memoize to ahocorasick and allow impl for tinygo but not synced as no concurrency.
* tests: covers memoize_builders in tinygo.
* chore: fixes nosync for tinygo.
* docs: updates docs.

---------

Co-authored-by: Juan Pablo Tosso <[email protected]>

anamba · 2023-09-13T00:56:00Z

I believe I am running into this issue... on startup, Caddy consumes about 1.5g ram, but each successive caddy reload (when there are config changes to apply) appears to add on about another 1.5g. Since the machine in question has only 4g physical ram, this becomes a problem quickly.

The memory usage does appear to come back down somewhat about 3-5 minutes later, but unlike the graphs above, I am not seeing memory usage level off. It just seems to settle at a higher level each time (eventually getting either OOMed, or restarted by monit when :2019 becomes slow to respond).

The high memory consumption in general is also a problem... this server has 90 sites, and I would like to enable coraza on all of them eventually, but that is currently not possible with only 4G ram. (It is a dev server, I am trying to work out the optimal coraza config here before enabling coraza on the production equivalents.)

The behavior I have described here is based on a build using the PR branch mentioned above in #76 (comment).

skixmix · 2023-09-20T10:16:57Z

Hi, do you have any update about this issue? Also, for v2, there is this one still pending #88

Thank you,
Simone

jptosso · 2023-09-20T10:26:41Z

Hey, we have a new feature that should improve this, if you can build your own Caddy that would help testing:

XCADDY_GO_BUILD_FLAGS='-tags=memoize_builders' xcaddy build --with github.com/corazawaf/coraza-caddy/v2

If you are using Docker you can build it using

FROM caddy:2.7-builder
RUN XCADDY_GO_BUILD_FLAGS='-tags=memoize_builders' xcaddy build --with github.com/corazawaf/coraza-caddy/v2

This implements a new patch that reuses the memory from PM and RX operators.

anamba · 2023-09-21T02:18:46Z

More testing needed, but initial tests show a huge decrease in memory usage. Promising!

skixmix · 2023-09-25T15:20:07Z

Hello,

I've tested the new feature with memoize_builders and performed the same experiment as in #76 (comment). It now seems to go from 340 MB of RAM consumption to 540 MB with 10 sites, so it looks really promising!

Also, with this version #88 is no longer an issue.

Thank you for the awesome work!
Simone

jptosso · 2023-09-25T16:20:34Z

Great news! @jcchavezs do you think we should enable this feature by default?

jcchavezs · 2023-09-26T20:39:53Z

@skixmix more memory improvements are coming. I am working on another PR to lazily load regexes.

@jptosso maybe we can enable it but we need to make sure risks are well documented, although minimal.
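
For readers unfamiliar with the lazy loading idea mentioned above, here is a minimal sketch of one common way to defer regex compilation until first use, based on `sync.Once`. It is an assumption about the general technique, not a description of the planned PR; `lazyRegex` and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// lazyRegex stores only the pattern at load time and compiles it the first
// time it is needed. Hypothetical type, for illustration only.
type lazyRegex struct {
	pattern string
	once    sync.Once
	re      *regexp.Regexp
	err     error
}

func newLazyRegex(pattern string) *lazyRegex {
	return &lazyRegex{pattern: pattern}
}

// get compiles the pattern exactly once, even under concurrent callers.
func (l *lazyRegex) get() (*regexp.Regexp, error) {
	l.once.Do(func() {
		l.re, l.err = regexp.Compile(l.pattern)
	})
	return l.re, l.err
}

// MatchString reports whether s matches, compiling the pattern on demand.
func (l *lazyRegex) MatchString(s string) (bool, error) {
	re, err := l.get()
	if err != nil {
		return false, err
	}
	return re.MatchString(s), nil
}

func main() {
	rule := newLazyRegex(`(?i)<script\b`) // nothing is compiled yet
	ok, err := rule.MatchString("<SCRIPT>alert(1)</SCRIPT>")
	fmt.Println(ok, err) // true <nil>
}
```

With this approach, rules that never run cost only their pattern string. The trade-off is that an invalid pattern is reported at first use rather than when the configuration is loaded.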

jcchavezs added the v2.0 label Jun 27, 2023

jcchavezs mentioned this issue Jul 5, 2023

chore: adds memoize implementation for regexes and ahocorasick corazawaf/coraza#836

Merged

1 task

jcchavezs mentioned this issue Jul 6, 2023

chore: uses coraza version that memoize regex generation. #84

Closed

jcchavezs mentioned this issue Aug 30, 2023

Monthly meeting agenda (August 2023) corazawaf/coraza#868

Closed

jcchavezs mentioned this issue Sep 26, 2023

feat: uses memoization to decrease memory consumption. corazawaf/coraza-proxy-wasm#220

Merged

jptosso mentioned this issue Oct 28, 2024

Regression tests for full feature matrix corazawaf/coraza#1182

Open
