-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Libbeat] ECS Logging #15544
Comments
Pinging @elastic/ingest (Project:fleet) |
For me, its the matter of effort we want to put upfront and the benefits we can get: Option 1: is indeed a simple formatter but offer less control to the author, with the growing addition of modules and the move to the agent this doesn't give us a lot of finer control on what and on what form the data is send to Fleet. Is that OK? Maybe yes maybe no. Option 2: This indeed gives us more work in the code we have to maintain, but this gives us freedom while enforcing type-safety. In bonus we get better contextual error reporting when implemented. And the machine vs human-readable format is a plus, look at the following example. log.Errorf("Failed to access '%{file}'", ecs.File.Path("test.txt")) Produces {
...
"file": {
"path": "test.txt"
},
"log": {
...
},
"message": "Failed to access 'test.txt'"
} I strongly suggest you look at the examples in the readme. Both options 1 and 2 can be gracefully added, keeping old unstructured |
Note, I might be a bit biased, @michalpristas and myself did play with parts of the ECSlogs framework, we did sadly not use it to the full potential because we wanted to go back to what libbeat was using. We did find a few issues which are now fixed. |
Well, I'm biased towards option 2, but it definitely needs some more testing. It also adds serilog like format strings. About file rotation: We might consider to use stdout/stderr more by default. On linux it is kind of the standard (systemd and containers prefer this). Well, not sure windows event log would be good target or if we should continue to prefer file rotation. The rotation in ecslog always adds a timestamp to rotated files. This is important if you want to collect the logs using filebeat. Without adding timestamps (default in Beats) filebeat can block rotation to happen, because Windows doesn't allow log files to be rotated/deleted as long as filebeat is still reading them. Currently in Beats one needs to configure an rotation interval in order to have timestamps added to the rotated files. The zap project does not provide file rotation support. The file rotation we have is a custom implementation within libbeat. I think the file rotation implementation in ecslog also better separates triggers, rotation, layout and also support gzipping old files. Ecslog also has better support for logging errors + supports wrapping/returning errors with an optional ECS based context. Finally ECS fields from error messages can be added to the log event. Option 3 was what I originally experimented with. Have a meta-logger that uses other loggers (zap/zerolog/logrus/...) as backend. But I finally opted against this solution. Besides being able to format a log message, the core functionality that differentiates structured logging libraries is the context and how fields can be added. This can even impact encoding. For example zap just appends fields (more like streaming new fields as events into the context). Nesting support is limited to: have a field with a custom object or create a nesting level at the end of the stream (one can not remove the nesting level anymore). Fields with same names do appear multiple times in a document, which is not supported by all JSON parsers. At worst events might not be indexible in ES, which was one of the main goals of ecslog. Elasticsearch fails to parse the JSON document if we try this: Logrus on the otherhand uses a map for it's context that is copied and updated whenever fields are added (rather expensive in comparison). ecslog uses a hybrid of zap and logrus. One adds fields like zap, but when logging we sort the fields and remove duplicates. The sorting can be cached for future log operations. ecslog contexts can be combined (they are a tree internall), without copying and iterating contexts over and over again. ecslog also differentiates between standardized fields and custom fields, allowing the layout to handle adhoc custom fields by developers to be handled differently. For example always encode custom fields to an embedded string to not break mappings in Elasticsearch. Finally having a custom context type requires a custom logger. Still using other loggers as backend would require context to be streamed to the backend logger, that needs to build a custom context to be thrown away right after each time we want to log. |
The goal of the ECS logging libraries is twofold. As mentioned in Option 1, the MVP is to at least support 10-15 fields that can be populated transparently by a logging formatter. This MVP level lowers the bar to adoption of the ECS logging library in question. You just enable it, and the basics are done for you with very little effort. But there is also a desire to let developers to log any arbitrary ECS field as well, with these libraries. So my understanding is that the mix of option 1 and 2 could actually yield a reuseable Golang logging library that hits both targets, and can be added to the list of ECS Logging libraries we offer customers here https://github.com/elastic/ecs-logging 😀🤞🏻? cc @philkra |
This issue is not about the ecs-logging project, multiple library support or multiple language support. It is about ECS based logging in Beats in particular. With this in mind let's talk about what we want to expect from the logging support in Beats in the future. Some features/requirements coming to mind:
|
Log level and selectors changed dynamically for Agent. That would be useful for remote debugging. Imagine an inspect button in the UI that displays live log. |
This functionality is not provided by zap. Only Fields + JSON encoder are provided by Zap. Almost everything else is custom in Beats. |
Pinging @elastic/ingest-management (Team:Ingest Management) |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@ruflin Is this something we can still look into before 8.0? Switching default logging formats post 8.0 becomes a trickier process. |
Pinging @elastic/agent (Team:Elastic-Agent) |
I added the Elastic Agent team label and ping @jlind23 on timing questions ;-) |
Which option was the chosen one finally? |
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
For more context. Both Elasticsearch and Kibana will emit ECS logs OOTB with 8.0, APM server already does for a while. It would be a missed opportunity if we don't make the same move observability wide. |
@Mpdreamz can we include this change in 8.0? Doing it for 7.16 is impossible as we are almost running out of time. |
@Mpdreamz What is apm-server using for the ecs logs? If it works well we should follow the same path in Beats. |
I believe APM-server uses zap but is not yet on https://github.com/elastic/ecs-logging-go-zap and uses a custom formatter. We ultimately want to hop on cc @simitt let me know if the above is not true :) |
I added basic ECS logging support to the custom This can be enabled by setting So, basically it should be as easy as turning on |
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
Also just to confirm |
Elastic Agent itself is not a beat and not directly based on libbeat. @michalpristas could you comment the logging behaviour? |
Associated PRs: #28573 - remove code for 8.0 |
Motivation
We want to move to ECS log format and get benefits of a uniform logging schema on all the products. This allows us to develop ECS aware visualization of behavior like correlating metrics or events between products. The change in format is a breaking change and we should target it for 8.0.
History of logging in beats
Initially, beats were using a custom logging library and after we partially moved to uber's zap library. I believe the initial reason to move to this library was to reduce the scope of things we have to maintain, JSON output, better namespace support and also get a speed increase benefit when we are actively writing a lot of log events.
Note, I did use the word partially because we never completed the initial intended complete migration. We wanted to remove all global loggers instantiation and correctly create namespace per concerns, sadly we still have a lot of global loggers and not all logger instances have the right context attached to it which complicate debugging and support.
How we will migrate to ECS?
I think we have a few options to migrate to the ECS log format.
Option 1
Keep using the uber Zap library and add a custom formatter on top of it to add 10-15 ECS fields to each logs statement.
Option 2
A few month ago @urso, @ruflin and I talked about how we could better target ECS log format. With that @urso created a POC called ECSlog.
Option 3
Could it be possible to mix Option 1 and Option 2, meaning that we delegate file rotation and exporter to Zap but we enforce everything on top of it. This actually mean that Zab mostly deal with raw bytes or string? I don't know if its possibe, WDYT @urso?
Next steps
In all the option above we need to write new codes, either we write a new formatter or bring ECSLog to stable, or a mix. We have to keep in mind, moving to a structured format mean that we will have to deal with backward compatibility.
I surely miss pros and cons and I would like your opinion @urso @graphaelli @ruflin @andresrc @webmat
related to #15129, elastic/apm-server#3155
The text was updated successfully, but these errors were encountered: