Missing event metadata #3246

Open
imreczegledi-form3 opened this issue Jun 12, 2024 · 3 comments

imreczegledi-form3 commented Jun 12, 2024

Hi 👋

We are seeing some false-positive alerts on empty events, similar to #3234 and #2700 (I hope I can help with these cases as well).

Missing event metadata

  • almost every field is null, -1, or 4294967295
{"hostname":"minikube","output":"12:15:24.058348969: Warning Account Manipulation in SSH detected 
...
{"container.id":"host","container.image.repository":null,"container.image.tag":null,"container.name":"host","evt.res":"SUCCESS","evt.time":1718108124058348969,"evt.type":"openat","fd.name":"my_sshd_config","group.gid":4294967295,"group.name":"","k8s.ns.name":null,"k8s.pod.name":null,"proc.cmdline":"bash","proc.cwd":"","proc.exepath":"","proc.pcmdline":null,"proc.pid":11453,"proc.ppid":0,"proc.sid":-1,"user.loginname":"","user.loginuid":-1,"user.name":"","user.uid":4294967295}
...
}

The Falco rule is Account Manipulation in SSH, but the issue is not rule-specific.

Based on my local tests, the root cause is a too-small bufSizePreset parameter. This buffer is crucial when Falco has to handle a "process flood" (e.g. a process spawning hundreds of child processes).

To simulate a "process flood" I created a small Go script which triggers the rule 1000 times in separate child processes (on the host).

```go
...
cmd = exec.CommandContext(ctx, "timeout", "5s", "tail", "-f", "/home/ubuntu/my_sshd_config")
...
```

This is how you can reproduce the issue; a fuller sketch of such a reproducer is shown below.
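A hedged sketch of what such a reproducer could look like (this is not the original script; the file path, process count, and timeouts are assumptions based on the snippet above):

```go
// Hypothetical "process flood" reproducer: spawn many short-lived child
// processes that each open the watched file. The path, count, and timeouts
// are assumptions based on the snippet above, not the original script.
package main

import (
	"context"
	"os/exec"
	"sync"
	"time"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each invocation is a new child process opening the file the rule watches.
			cmd := exec.CommandContext(ctx, "timeout", "5s", "tail", "-f", "/home/ubuntu/my_sshd_config")
			_ = cmd.Run()
		}()
	}
	wg.Wait()
}
```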

Test env

  • EC2 (Ubuntu, t3.small) with minikube
  • Deployed Falco chart version 4.3.0

Results

| bufSizePreset | Logged events | Events with missing metadata | Ratio |
|---|---|---|---|
| 1 (1 MB) | 447 | 77 | 0.172 |
| 2 (2 MB) | 582 | 69 | 0.118 |
| 3 (3 MB) | 570 | 48 | 0.084 |
| 4 (4 MB) - Falco default | 738 | 41 | 0.055 |
| 5 (16 MB) | 998 | 0 | - |

As the table above shows, increasing the buffer decreases the number of events without metadata. With an appropriately sized buffer the issue disappears and the Falco logs contain only fully enriched events.

bufSizePreset can be set to values between 1 and 10; a sketch of how this might be configured is shown below.
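For reference, a minimal sketch of raising the preset through the Helm chart values, assuming the chart exposes it under driver.modernEbpf.bufSizePreset (the exact key may differ between chart versions, so verify against the chart's values.yaml):

```yaml
# Hypothetical values.yaml override for the Falco Helm chart
# (key names are an assumption; verify against your chart version).
driver:
  modernEbpf:
    bufSizePreset: 5   # 1-10; higher presets allocate a larger per-CPU ring buffer
```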

Ideas

  • Probably, the event enrichment logic uses some space from the bufSizePreset buffer
  • Due to the heavy load (caused by the new processes), event enrichment doesn't have enough space to work
  • Maybe dropping these "empty" events would be better
  • A bufSizePreset-specific debug message (with logic that can measure buffer utilisation) would be very useful

Looking forward to your answers and ideas (I might have missed something).

@incertum
Contributor

> As the table above shows, increasing the buffer decreases the number of events without metadata.

This is expected, as Falco builds up internal state to serve you all of the information (see the source code: https://github.com/falcosecurity/libs/blob/master/userspace/libsinsp/parsers.cpp). If too many events are dropped on the kernel side, the state engine cannot work properly. Perhaps the adaptive syscalls blog post (https://falco.org/blog/adaptive-syscalls-selection/) can provide more insights, and the base_syscalls feature may be of interest to you in general.
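For context, a minimal sketch of what the base_syscalls setting might look like in falco.yaml; the field names here are assumptions to verify against the falco.yaml shipped with your Falco version:

```yaml
# Hypothetical falco.yaml snippet: restrict collection to the syscalls Falco
# needs for state building plus whatever the loaded rules require.
# Field names are assumptions; check the falco.yaml for your version.
base_syscalls:
  custom_set: []   # no additional user-defined syscalls
  repair: true     # let Falco compute the minimal state-building syscall set
```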

> A bufSizePreset-specific debug message (with logic that can measure buffer utilisation) would be very useful

Have you explored the internal automatic drop alerts or Falco metrics (https://falco.org/docs/metrics/falco-metrics/) as an alternative? Both expose drop counters from which you can infer how the buffer is holding up.
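As an illustration, enabling periodic metrics in falco.yaml might look roughly like this; the field names are assumptions based on the linked metrics docs and should be verified against your Falco version:

```yaml
# Hypothetical falco.yaml snippet enabling periodic metrics, including
# kernel-side event/drop counters. Field names are assumptions; verify
# against the falco.yaml reference for your version.
metrics:
  enabled: true
  interval: 15m
  output_rule: true                     # emit metrics as a Falco internal event
  kernel_event_counters_enabled: true   # include event and drop counters
```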


Some more general info:

Btw, your example log shows "container.name":"host", so all container fields are expected to be null; see https://falco.org/docs/reference/rules/supported-fields/#field-class-container etc.

Regarding user names and group names: is the host /etc directory mounted and available? We have had issues with minikube support in the past, since some mounts or setup differ from an actual Kubernetes cluster; perhaps some of this is caused by that.
How do you use minikube? Which driver? See also https://falco.org/docs/install-operate/third-party/learning/

@imreczegledi-form3
Author

Thanks, I will check the blog post regarding adaptive syscalls.


driver: modern-bpf

I don't think it's a minikube compatibility issue because, as you can see in the table above, the majority of events are perfectly enriched, like:

{"hostname":"minikube","output":"13:21:03.940424837: Warning Account Manipulation in SSH detected ...
 "output_fields": {"container.id":"host","container.image.repository":null,"container.image.tag":null,"container.name":"host","evt.res":"SUCCESS","evt.time":1718198463940424837,"evt.type":"openat","fd.name":"/home/ubuntu/my_sshd_config","group.gid":1001,"group.name":"<NA>","k8s.ns.name":null,"k8s.pod.name":null,"proc.cmdline":"tail -f /home/ubuntu/my_sshd_config","proc.cwd":"","proc.exepath":"/usr/bin/tail","proc.pcmdline":"timeout 5s tail -f /home/ubuntu/my_sshd_config","proc.pid":12194,"proc.ppid":12192,"proc.sid":-1,"user.loginname":"docker","user.loginuid":1000,"user.name":"docker","user.uid":1000}}

So the root cause still seems to be around the state engine / dropped events.

@poiana
Contributor

poiana commented Sep 15, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale
