File Integrity Monitoring | User Information - Linux #7401

Closed
narph opened this issue Aug 16, 2023 · 6 comments

Comments

@narph
Contributor

narph commented Aug 16, 2023

Similar to Auditbeat's FIM module, our new FIM integration can monitor for file changes, but does not include the user information needed to capture who modified/accessed the file. This is a significant visibility gap for security analysts and a frequently requested enhancement.

Research needs to be done to determine how we can capture user information within our FIM integration and any underlying changes required. Can the OS components we rely on today be leveraged or is an underlying change to how we gather FIM data needed?

Meta issue #3310

@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@chemamartinez
Contributor

I did some initial research into the alternatives for getting user data on Linux.

Fsnotify

https://github.com/fsnotify/fsnotify

Our FIM module currently uses it for recursive watching. Unfortunately, it doesn't support getting user information for now, as explained in this recent issue.

They are trying to add support for fanotify but it is not ready yet, and seems to be stalled: fsnotify/fsnotify#542

It does not seem to be a valid option for now.

Fanotify

https://github.com/torvalds/linux/tree/master/fs/notify/fanotify

Fanotify is a file access notification system built into the Linux kernel. It's designed to supersede inotify in some use cases, especially those related to FIM, by providing more detailed information and allowing for responses (like blocking actions).

In 2017, fanotify was very limited:

Compared with inotify, fanotify's assortment of events might feel limited. At present, creating, deleting, and removing events are not supported: You can watch files and directories being opened, accessed, and closed, and that's it. Moreover, mmap() generates no events. Fanotify isn't an inotify replacement; instead, it focuses on cases such as malware scanning and hierarchical storage management.

Source: https://www.linux-magazine.com/Issues/2017/194/Core-Technologies

Support for the missing events was added in Linux 5.1, which was released in 2019.
https://man7.org/linux/man-pages/man7/fanotify.7.html
https://kernelnewbies.org/Linux_5.1#Improved_fanotify_for_better_file_system_monitorization

Fanotify seems to be a valid solution for getting the user information: it allows enabling the FAN_REPORT_PIDFD flag to identify the process that made the change, along with other metadata.

We already have some previous work to include fanotify in FIM, and also there are some external projects that could be interesting to explore such as https://github.com/opcoder0/fanotify. However, it seems that fanotify can lead to reliability issues based on past experiences.

Auditd

https://man7.org/linux/man-pages/man8/auditd.8.html

The Audit daemon is another option since it provides the user information about any change in the monitored file systems. It is also used by other FIM solutions like Wazuh or LogRhythm.

The main problem with Auditd is that it is a dependency that must be installed on the monitored host, so we would be responsible for the Auditd status and for loading the rules, which may sometimes conflict with rules already configured.

eBPF

https://ebpf.io

eBPF is a more recent and very powerful technology in the Linux kernel that can be used for a variety of monitoring, tracing, and networking tasks, including file integrity monitoring. eBPF allows users to run custom programs in the kernel space safely, without modifying the kernel source code or loading additional modules. When it comes to file integrity monitoring, eBPF can be employed to trace specific syscalls related to file operations and collect relevant data.

eBPF provides some considerable benefits:

  • Performance: lightweight and efficient, minimal overhead.
  • Flexibility: programs can be attached, modified, or detached from the kernel at runtime without restarting the kernel.
  • Granularity: you can collect just what you need (specific syscalls, user IDs, paths, etc.)

It also involves more challenges compared to other solutions:

  • Learning curve
  • While there are some libraries that make it easier to handle eBPF programs from Go, the actual eBPF code is typically still written in a restricted subset of C. There isn't a way to write eBPF programs directly in Go. The Go runtime has features and behaviors that are not compatible with the constraints of eBPF programs.

I think the usual approach is to write the eBPF program in C, compile it, and then use one of the available Go libraries (or a new one created by us) to load, attach, and manage the eBPF program in the kernel from a Go application.

Kprobes

https://docs.kernel.org/trace/kprobes.html#

At a lower level than the other solutions, kprobes is a kernel feature that allows instrumenting the kernel by setting breakpoints and catching system calls, so it could be useful for FIM by catching file-oriented calls such as open(), write() or unlink().

Some considerations about implementing kprobes for FIM:

  • Mainly, I see similar advantages to eBPF regarding efficiency and flexibility.
  • On the other hand, working directly in kernel space could be dangerous, since any failure may affect the whole system in terms of both stability and security. Therefore, this solution would involve more complexity in every aspect.

A deeper analysis would be needed to determine if this is a feasible path.

@norrietaylor
Member

norrietaylor commented Sep 7, 2023

Hello @chemamartinez 👋

The Linux Platform team discussed this issue in a recent team meeting. Our discussion produced some additional information that you may find interesting and valuable.

First, the crux of this issue is that we need to associate a file operation with a calling process. In Linux, the process is the entity that is operating on a file. As such, it is also the entity that is associated with an effective user. If you can determine the pid of the process, it becomes straightforward to get other helpful metadata, including that of the user.

In our discussion, we focussed on three of the options you listed above for retrieving process information: fanotify, kprobes, and eBPF.

Fanotify

We do not recommend leveraging fanotify for this metadata. The reasons for this are primarily related to reliability.

Fanotify can operate in notification or permission mode. In notification mode, the application would receive the pid but would need to scrape the /proc filesystem after an event for user metadata. This approach would introduce a race condition for short-lived processes, in which the required metadata would be absent. To solve this problem, we could receive an event in permission mode. In this situation, Auditbeat would hold a lock on the file while we scraped the /proc filesystem for the metadata. This technique is inherently a system stability risk: if, for some reason, Auditbeat is killed while it holds this lock, the entire filesystem can be blocked. Locking a production Linux host's file system can be a source of severe SDH issues.
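As a rough illustration of the /proc scraping step described above, here is a minimal Go sketch that resolves a pid to its effective user. The function names (parseEffectiveUID, userForPID) are hypothetical and not taken from Auditbeat. The lookup simply fails if the process has exited before /proc is read, which is the race condition in question.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"os/user"
	"strings"
)

// parseEffectiveUID extracts the effective UID from the text of a
// /proc/<pid>/status file. The "Uid:" line lists the real, effective,
// saved-set, and filesystem UIDs, in that order.
func parseEffectiveUID(status string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 3 && fields[0] == "Uid:" {
			return fields[2], nil
		}
	}
	return "", fmt.Errorf("no Uid line found")
}

// userForPID reads /proc/<pid>/status and resolves the effective user.
// It returns an error if the process has already exited.
func userForPID(pid int) (uid, name string, err error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/status", pid))
	if err != nil {
		return "", "", err
	}
	uid, err = parseEffectiveUID(string(data))
	if err != nil {
		return "", "", err
	}
	// The user name is best effort: the UID may be missing from the user database.
	if u, lookupErr := user.LookupId(uid); lookupErr == nil {
		name = u.Username
	}
	return uid, name, nil
}

func main() {
	uid, name, err := userForPID(os.Getpid())
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("uid=%s name=%s\n", uid, name)
}
```

The parsing is kept separate from the file read so it can be exercised without a live process.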

eBPF

eBPF is a great candidate as it can instrument a kernel event in real-time and transmit a safe event to userspace containing much of the information we need without worrying about scraping /proc.

While eBPF requires some C knowledge to write programs, this should be manageable and not involve a significant learning curve. Most eBPF tracing programs are relatively simple and easy to understand. In fact, many of the existing probes in https://github.com/elastic/ebpf could be used as they are.

There are some limitations to eBPF that we should highlight. First, there is an instruction limit for eBPF programs, which means complex logic can be challenging to implement. Algorithms that involve elaborate filtering or string parsing are often better handled in userspace. Second, eBPF can be problematic to work with on older Linux kernels. Our team prefers BTF and bpf ring buffer support, which were introduced in 5.4 and 5.8, respectively. Practically, this means 5.10 kernels and newer are suited for eBPF tracing and telemetry.

Kprobes

Kprobes are also a great candidate as they can transmit a safe event to userspace in real-time. Kprobes can be set up to use tracefs, meaning you are not operating directly in kernel space. This mitigates any stability concerns you have raised.

When using tracefs you are bound to predefined format structures, which limits the data points you can instrument. They are also not a formal API, which means they can change as the kernel is updated. This limitation means kprobes and tracefs are less flexible than eBPF.
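To make the tracefs format concrete, here is a small Go sketch that builds probe definitions in the "p:GROUP/EVENT SYMBOL [FETCHARGS]" form accepted by the kprobe_events file. The group name "fim", the event names, and the helper kprobeDefinition are hypothetical. The probed symbols are only examples and, as noted above, may change between kernel versions. The sketch formats the lines and does not write them to tracefs.

```go
package main

import (
	"fmt"
	"strings"
)

// kprobeDefinition builds one line in the format accepted by the tracefs
// kprobe_events file: "p:GROUP/EVENT SYMBOL [FETCHARGS]".
func kprobeDefinition(group, event, symbol string, fetchArgs ...string) string {
	def := fmt.Sprintf("p:%s/%s %s", group, event, symbol)
	if len(fetchArgs) > 0 {
		def += " " + strings.Join(fetchArgs, " ")
	}
	return def
}

func main() {
	// "fim" and the event names are hypothetical. The symbols are kernel
	// functions, which are not a stable API and may differ between kernels.
	for _, sym := range []string{"vfs_unlink", "vfs_rename"} {
		fmt.Println(kprobeDefinition("fim", "fim_"+sym, sym))
	}
}
```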

The advantage of this approach is that much older kernels are supported.

Recommendation

Our general recommendation would be to enable Auditbeat to make a runtime decision based on the kernel version to use a kprobe or eBPF implementation. Older kernels would use kprobes, and newer kernels would use eBPF. These two implementations could be developed serially, prioritizing kernel support that would satisfy the most customers.
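The runtime decision could look roughly like the following Go sketch. It assumes the 5.10 threshold mentioned earlier in this comment and takes a release string in the format printed by `uname -r`. The function names and the fallback to kprobes on a parse failure are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// leadingInt parses the decimal digits at the start of s, so that a
// component such as "10-rc1" yields 10.
func leadingInt(s string) (int, error) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return strconv.Atoi(s[:end])
}

// parseKernelRelease extracts the major and minor version from a release
// string such as "5.10.0-23-amd64".
func parseKernelRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	if major, err = leadingInt(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = leadingInt(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// chooseBackend returns "ebpf" for kernels 5.10 and newer, else "kprobes".
func chooseBackend(release string) string {
	major, minor, err := parseKernelRelease(release)
	if err != nil {
		// Assumption: fall back to the backend that supports older kernels.
		return "kprobes"
	}
	if major > 5 || (major == 5 && minor >= 10) {
		return "ebpf"
	}
	return "kprobes"
}

func main() {
	for _, rel := range []string{"4.18.0-477.el8.x86_64", "5.4.0-150-generic", "5.10.0-23-amd64", "6.1.0"} {
		fmt.Printf("%s -> %s\n", rel, chooseBackend(rel))
	}
}
```

A version check alone is a simplification: distribution kernels sometimes backport features, so a real implementation might probe for the features directly.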

Another request to be aware of is that of powering Session View with Auditbeat events. We are also discussing enriching Auditbeat events with process-oriented metadata for this task. Ideally, the same backend could serve both goals and be reusable for future requests.

@andrewkroh
Member

andrewkroh commented Oct 11, 2023

We have not decided on technology at this point. In order to have a better understanding of what data is required I'm listing the current event triggers and the data that is being reported. We need to support amd64 and arm64 with this.

Event Triggers

  • File or dir created (e.g. open, mkdir, symlink, link)
  • File or dir renamed
  • File or dir deleted
  • File modified (written, truncated)
  • Attributes modified (e.g. chmod, chown, timestamps, and extended attributes)
  • (fyi) Also at startup Auditbeat can scan the filesystem to look for deltas since it last ran.

Data

Apart from the inotify trigger, all of the data about the files is collected from userspace. For example, when it gets an inotify IN_ATTRIB event it will stat and getxattr to see what changed. We don't have to modify how this part works, but if there is an opportunity for the events to include specific information about what changed, then that could make FIM more efficient.
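For readers unfamiliar with the "see what changed" step, a minimal Go sketch of the comparison is shown here. The Snapshot type and changedAttributes function are hypothetical and only cover a few of the attributes. Gathering the snapshots (the stat and getxattr calls) is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot is a hypothetical record of the attributes collected for one file.
type Snapshot struct {
	UID, GID uint32
	Mode     uint32
	Xattrs   map[string]string
}

// changedAttributes compares two snapshots and returns the names of the
// attributes that differ, in a stable order.
func changedAttributes(prev, cur Snapshot) []string {
	var changed []string
	if prev.UID != cur.UID {
		changed = append(changed, "uid")
	}
	if prev.GID != cur.GID {
		changed = append(changed, "gid")
	}
	if prev.Mode != cur.Mode {
		changed = append(changed, "mode")
	}
	// Collect xattr names that were added, removed, or modified.
	seen := map[string]bool{}
	var xattrs []string
	for name, val := range prev.Xattrs {
		seen[name] = true
		if curVal, ok := cur.Xattrs[name]; !ok || curVal != val {
			xattrs = append(xattrs, "xattr:"+name)
		}
	}
	for name := range cur.Xattrs {
		if !seen[name] {
			xattrs = append(xattrs, "xattr:"+name)
		}
	}
	sort.Strings(xattrs)
	return append(changed, xattrs...)
}

func main() {
	before := Snapshot{UID: 1000, GID: 1000, Mode: 0o644}
	after := Snapshot{UID: 1000, GID: 1000, Mode: 0o600,
		Xattrs: map[string]string{"security.selinux": "system_u:object_r:etc_t:s0"}}
	fmt.Println(changedAttributes(before, after))
}
```

If the kernel-side event already stated which attribute changed, this comparison and the syscalls feeding it could be skipped for that event.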

Auditbeat is reporting this file metadata with events:

  • path
  • target_path (for symlinks)
  • inode
  • owner uid
  • owner gid
  • size
  • mtime
  • ctime
  • type (dir, file, symlink, etc)
  • mode
  • xattrs (specifically security.selinux and system.posix_acl_access)
  • file content hashes

We want to add information about the process and user that triggered the file change event. I was thinking of adding some minimal metadata that gives you some context and enables you to pivot to process events via entity_id when you need richer process data (this assumes that the user has enabled process events in Auditbeat).

  • process.name
  • process.pid
  • process.entity_id
  • user.id
  • user.name
  • container.id

@pkoutsovasilis
Contributor

Hello also from my end 👋 Thank you for your messages and initial investigation @chemamartinez , @norrietaylor , and @andrewkroh.

Having your comments in mind, I have been investigating, in theory so far, what a kprobe-based solution could look like to match the existing inotify-based Event Triggers mentioned above by @andrewkroh . Yesterday, I had a meeting with @nicholasberlin , @stanek-michal and @norrietaylor where we discussed my early thinking and we all deemed that this could be feasible. In a few words, I am thinking of initially scanning the directories that need to be monitored, extracting the inode number of each one, and then using the appropriate kprobe-based events to determine which of them affect files/folders that we want to monitor.

Of course there are certain open questions, such as how many of these kprobe-based events we can process and associate in a timely manner, the portability of a kprobe-based solution, etc. For that reason, I will start coding a PoC that builds on top of the theory, so that we will have more quantitative answers to such questions.

The thinking so far is to make a separate backend for the FIM module which is kprobe-based and is able to produce events that also include the pid that caused each change. Then a processor can use the pid to enrich the events accordingly before sending them out.

@norrietaylor
Member

@jamiehynds, perhaps we should discuss acceptance criteria and GA plan for this ticket?
