[iptables] [journald] Errors when testing with Elastic Agent wolfi images #10998

Closed
Tracked by #37086
mrodm opened this issue Sep 4, 2024 · 9 comments · Fixed by #11007
Labels: Integration:iptables (Iptables), Team:Elastic-Agent (Label for the Agent team)

@mrodm
Contributor

mrodm commented Sep 4, 2024

Running the system tests with Elastic Agent Docker images based on Wolfi (#10933) showed that the system tests fail for two packages: iptables and journald.

The errors from the Buildkite build show that elastic-package could not find hits in the data stream:

test case failed: could not find hits in logs-iptables.log-68254 data stream
test case failed: could not find hits in logs-journald.logs-31807 data stream

Reviewing Elastic Agent logs, it looks like the agent uses journalctl:

{"log.level":"info","@timestamp":"2024-08-30T09:27:50.576Z","message":"Journalctl command: journalctl --utc --output=json --follow --file /run/service_logs/test.journal --no-tail","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"journald-default","type":"journald"},"log":{"source":"journald-default"},"log.logger":"input.journald","service.name":"filebeat","id":"journald-journald.logs-e11b66f5-06a3-421b-b4cb-2562c40b18ba","input_source":"/run/service_logs/test.journal","ecs.version":"1.6.0","path":"/run/service_logs/test.journal","log.origin":{"file.line":158,"file.name":"journalctl/reader.go","function":"github.com/elastic/beats/v7/filebeat/input/journald/pkg/journalctl.New"},"ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2024-08-30T09:27:50.576Z","message":"cannot read from journalctl stderr: read |0: file already closed","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"journald-default","type":"journald"},"log":{"source":"journald-default"},"path":"/run/service_logs/test.journal","ecs.version":"1.6.0","log.logger":"input.journald","log.origin":{"file.line":193,"file.name":"journalctl/reader.go","function":"github.com/elastic/beats/v7/filebeat/input/journald/pkg/journalctl.New.func1"},"input_source":"/run/service_logs/test.journal","service.name":"filebeat","id":"journald-journald.logs-e11b66f5-06a3-421b-b4cb-2562c40b18ba","ecs.version":"1.6.0"}

but this command does not exist in Docker images based on Wolfi. For example:

 $ docker exec -it elastic-package-agent-journald-26598-elastic-agent-1 /bin/bash
bash-5.2# journalctl --utc --output=json --follow --file /run/service_logs/test.journal --no-tail
bash: journalctl: command not found
bash-5.2# 
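
The failure above comes down to `journalctl` not being on the container's `PATH`. A minimal sketch of a pre-flight check follows, assuming a test harness written in Python; the function names `find_journalctl` and `require_journalctl` are hypothetical and not part of elastic-package or Filebeat.

```python
import shutil


def find_journalctl(path=None):
    """Return the absolute path of journalctl, or None if it is not found.

    `path` is an optional PATH-style string; by default the process PATH is used.
    """
    return shutil.which("journalctl", path=path)


def require_journalctl(path=None):
    """Fail early with a readable error when journalctl is missing."""
    exe = find_journalctl(path)
    if exe is None:
        raise FileNotFoundError(
            "journalctl not found in PATH; the journald input cannot start in this image"
        )
    return exe


if __name__ == "__main__":
    # Prints a path on a systemd host, or None in an image without journalctl.
    print(find_journalctl())
```

Running a check like this before the system test starts would surface a clearer message than "could not find hits" in the data stream.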

Should these packages use the Ubuntu-based Elastic Agent image (e.g. docker.elastic.co/elastic-agent/elastic-agent) for these tests?

cc @elastic/elastic-agent-control-plane

@mrodm
Contributor Author

mrodm commented Sep 4, 2024

It looks like these packages would need the complete Elastic Agent image.

This is the error found when using the docker.elastic.co/elastic-agent/elastic-agent Docker image:

{"log.level":"error","@timestamp":"2024-09-04T14:10:03.192Z","message":"Input 'journald' failed with: input journald-iptables.log-0f1d6d66-cc22-450d-ba54-2db7dd8eb020 failed: could not start journal reader: cannot start journalctl: exec: \"journalctl\": executable file not found in $PATH","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"journald-default","type":"journald"},"log":{"source":"journald-default"},"log.logger":"input.journald","log.origin":{"file.line":139,"file.name":"compat/compat.go","function":"github.com/elastic/beats/v7/filebeat/input/v2/compat.(*runner).Start.func1"},"service.name":"filebeat","id":"journald-iptables.log-0f1d6d66-cc22-450d-ba54-2db7dd8eb020","ecs.version":"1.6.0","ecs.version":"1.6.0"}
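
Error records like the ones quoted in this thread can be filtered out of the agent's JSON logs mechanically. The sketch below assumes one JSON object per line and relies only on the `log.level`, `log.logger`, `@timestamp`, and `message` keys visible in the quoted records; the function name is hypothetical, and the sample lines are shortened copies of the records above.

```python
import json


def journald_input_errors(lines):
    """Yield (timestamp, message) for error-level records from the journald input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON log record.
            continue
        if (
            record.get("log.level") == "error"
            and record.get("log.logger") == "input.journald"
        ):
            yield record.get("@timestamp"), record.get("message")


# Shortened versions of the records quoted in this thread.
sample = [
    r'{"log.level":"info","@timestamp":"2024-08-30T09:27:50.576Z","message":"Journalctl command: journalctl --utc --output=json --follow --file /run/service_logs/test.journal --no-tail","log.logger":"input.journald","ecs.version":"1.6.0","ecs.version":"1.6.0"}',
    r'{"log.level":"error","@timestamp":"2024-08-30T09:27:50.576Z","message":"cannot read from journalctl stderr: read |0: file already closed","log.logger":"input.journald","ecs.version":"1.6.0","ecs.version":"1.6.0"}',
]

for timestamp, message in journald_input_errors(sample):
    print(timestamp, message)
```

The quoted records repeat the `ecs.version` key; Python's `json.loads` keeps the last value for a duplicated key, so they still parse.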

And here is the result of trying to run journalctl in the current Ubuntu-based Docker images:

 $ docker run --entrypoint /bin/bash  --rm -it docker.elastic.co/elastic-agent/elastic-agent:8.16.0-SNAPSHOT
elastic-agent@742398c6aeb2:~$ journalctl 
bash: journalctl: command not found

 $ docker run --entrypoint /bin/bash  --rm -it docker.elastic.co/elastic-agent/elastic-agent-complete:8.16.0-SNAPSHOT
elastic-agent@c759806fd4b9:~$ journalctl --version
systemd 245 (245.4-4ubuntu3.23)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

@blakerouse

When would it make sense to run journalctl from inside of a container? Is the journald socket bind mounted into the container?

@cmacknz
Member

cmacknz commented Sep 4, 2024

I commented on this in Slack, but what is happening is that these tests assume the Ubuntu-based image is approximately equal to a native Linux environment, and Wolfi breaks that assumption on purpose. journalctl being in our container shouldn't be an expectation.

For now we can keep testing on the Elastic Agent Ubuntu image in 8.x. For 9.x we'll need a different setup.

@andrewkroh
Member

When would it make sense to run journalctl from inside of a container? Is the journald socket bind mounted into the container?

fwiw I deploy Filebeat containers to all my nodes with the journal files (/run/log/journal, /var/log/journal/) mounted, and then run a few instances of the journald input with selectors for the core host-based services that I'm interested in. I'm not objecting to anything (I don't mind customizing an image to make this work), I just wanted to mention that there are some uses.
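
For illustration, the kind of invocation that reading a mounted journal directory with per-service selectors implies can be sketched as an argument list. This is an assumption about how such a command could be assembled, not how Filebeat builds it; the helper name and the unit names (`kubelet.service`, `containerd.service`) are illustrative. `--directory`, `--utc`, `--output=json`, `--follow`, and `FIELD=VALUE` matches are standard journalctl options.

```python
def journalctl_argv(directory, matches=(), follow=True):
    """Build a journalctl argument list for a mounted journal directory.

    `matches` are FIELD=VALUE selectors such as "_SYSTEMD_UNIT=kubelet.service".
    journalctl treats several matches on the same field as alternatives.
    """
    for match in matches:
        field, sep, _value = match.partition("=")
        if not sep or not field:
            raise ValueError(f"match must look like FIELD=VALUE, got {match!r}")
    argv = ["journalctl", "--utc", "--output=json", f"--directory={directory}"]
    if follow:
        argv.append("--follow")
    argv.extend(matches)
    return argv


if __name__ == "__main__":
    # Illustrative selectors for two host services.
    print(journalctl_argv(
        "/var/log/journal",
        ["_SYSTEMD_UNIT=kubelet.service", "_SYSTEMD_UNIT=containerd.service"],
    ))
```

The command only works if the journalctl binary exists in the image, which is exactly what the Wolfi-based image lacks.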

@cmacknz
Member

cmacknz commented Sep 4, 2024

fwiw I deploy Filebeat containers to all my nodes with the journal files (/run/log/journal, /var/log/journal/) mounted, and then run a few instances of the journald input with selectors for the core host-based services that I'm interested in

Thanks, reading journal files from k8s nodes is definitely a valid use case. Given that we are changing the journald input implementation to invoke journalctl, doing this for containers would require us to install journalctl, which pulls in much of systemd itself as a dependency. I would greatly prefer to keep systemd out of our base Wolfi container; perhaps we could put it in the complete variant of the Wolfi image, though.

The move to just invoking journalctl aligns us with the OTel journald receiver, but also makes the use case of reading journal files from k8s nodes when running from a container painful or just not possible by default. The OTel collector itself also has this problem, see open-telemetry/opentelemetry-collector-releases#462 for some discussion. I don't love this, but we currently don't have a GA way to read journald logs in a container and we will inherit the OTel collector's problems here regardless. As mentioned earlier, we at least have the elastic-agent-complete image as a potential escape hatch for this problem.
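
To make the trade-off concrete, here is a rough sketch of the "invoke journalctl and read its JSON output" pattern. It is written in Python for brevity and is not the Filebeat or OTel implementation; `stream_entries` is a hypothetical name. It assumes the child process prints one JSON object per line, as `journalctl --output=json` does.

```python
import json
import subprocess
import sys


def stream_entries(argv):
    """Run a command and yield one parsed JSON object per line of its stdout.

    Raises FileNotFoundError on first iteration if the executable is missing,
    which is the situation reported for the Wolfi-based image.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            line = line.strip()
            if line:
                yield json.loads(line)
        code = proc.wait()
        if code != 0:
            raise RuntimeError(f"{argv[0]} exited with status {code}")
    finally:
        # Stop the child if the consumer stops reading early.
        if proc.poll() is None:
            proc.kill()
            proc.wait()
        proc.stdout.close()


if __name__ == "__main__":
    # Stand-in child process so the sketch runs without systemd.
    # With journalctl available, the command from the agent log would be:
    # ["journalctl", "--utc", "--output=json", "--follow",
    #  "--file", "/run/service_logs/test.journal", "--no-tail"]
    code = "import json; print(json.dumps({'MESSAGE': 'hello'}))"
    for entry in stream_entries([sys.executable, "-c", code]):
        print(entry["MESSAGE"])
```

The approach avoids linking against libsystemd, at the cost of requiring the journalctl binary in every image that runs the input.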

CC @belimawr

@blakerouse

blakerouse commented Sep 5, 2024

I think we should look at removing the need to execute journalctl at all, and get the information we need by connecting to the socket of journald.

That removes the need for journalctl to be in either the Ubuntu container or the Wolfi container.

@belimawr
Contributor

belimawr commented Sep 5, 2024

I think we should look at removing the need to execute journalctl at all

We used to use a library instead of calling journalctl, but it caused Filebeat to crash in an unrecoverable way, and the crash depended on the version of libsystemd on the host. Here are some links with more details:

I haven't found a way to read journal files from a socket; as far as I could find, the socket is only for writing data to journald.
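
As an illustration of the write-only direction, the sketch below encodes an entry in journald's native protocol and sends it as a datagram. It assumes the usual socket path `/run/systemd/journal/socket` and the documented framing (`FIELD=value` per line, with a length-prefixed form for values that contain newlines); the function names are hypothetical. Nothing in this protocol returns stored entries to the sender.

```python
import socket
import struct

# Assumed default location of journald's native-protocol datagram socket.
JOURNAL_SOCKET = "/run/systemd/journal/socket"


def encode_entry(fields):
    """Serialize a dict of journal fields into native-protocol bytes."""
    out = bytearray()
    for key, value in fields.items():
        name = key.encode("ascii")
        data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
        if b"\n" in data:
            # Values with newlines: name, newline, 64-bit little-endian length, data, newline.
            out += name + b"\n" + struct.pack("<Q", len(data)) + data + b"\n"
        else:
            out += name + b"=" + data + b"\n"
    return bytes(out)


def send_entry(fields, path=JOURNAL_SOCKET):
    """Send one entry to journald; requires a host where the socket exists."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_entry(fields), path)


if __name__ == "__main__":
    # Only shows the encoded bytes; nothing is sent.
    print(encode_entry({"MESSAGE": "hello from a sketch", "PRIORITY": "6"}))
```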

@pkoutsovasilis
Contributor

Based on this comment, CGO-based code is avoided, and using coreos/go-systemd falls into this category. So a pure Go solution would have to be written to support what @blakerouse mentioned above.

@cmacknz
Member

cmacknz commented Sep 5, 2024

Yes, upstream OTel has open-telemetry/opentelemetry-collector-contrib#32711 tracking a pure Go implementation, but it has stalled. There definitely seems to be a wide need for this; I'd be in favour of us making one as a general open source alternative to journalctl or go-systemd. I don't think this has to block the initial GA of the journald input unless we think we'd need breaking changes to do it.

With reference to the socket or just receiving journal events without journalctl, https://www.freedesktop.org/software/systemd/man/latest/systemd-journal-remote.service.html might be the reference for how to do this. It is unclear if this can stream previously written journal files or only receives the latest events though.

There is also the ForwardToSocket option in journald.conf, but I don't know whether it is enabled by default on most hosts: https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html.
