[Libbeat] ECS Logging #15544

ph · 2020-01-14T17:55:06Z

Motivation

We want to move to ECS log format and get benefits of a uniform logging schema on all the products. This allows us to develop ECS aware visualization of behavior like correlating metrics or events between products. The change in format is a breaking change and we should target it for 8.0.

History of logging in beats

Initially, beats were using a custom logging library and after we partially moved to uber's zap library. I believe the initial reason to move to this library was to reduce the scope of things we have to maintain, JSON output, better namespace support and also get a speed increase benefit when we are actively writing a lot of log events.

Note, I did use the word partially because we never completed the initial intended complete migration. We wanted to remove all global loggers instantiation and correctly create namespace per concerns, sadly we still have a lot of global loggers and not all logger instances have the right context attached to it which complicate debugging and support.

How we will migrate to ECS?

I think we have a few options to migrate to the ECS log format.

Option 1

Keep using the uber Zap library and add a custom formatter on top of it to add 10-15 ECS fields to each logs statement.

Pros	Cons
Low impact on current code	Hard to extend when more typed fields are added (IP Address, Source)
Rely on external library reduced changes	Do not improve error reporting, the logs are still flat and we don't have an easy way to add contextual data.
	When using the Structured export format it doesn't necessarily mean that we produce valid JSON.

Option 2

A few month ago @urso, @ruflin and I talked about how we could better target ECS log format. With that @urso created a POC called ECSlog.

Pros	Cons
Can use any fields From ECS	It's a complete library to maintain (exported, file rotation)
Typesafety (generated from ECS definition)	Not stable yet.
Allow both machine readable error message and human readable error message
Strong emphasis on context embedded/traceability
Implement an ECS friendly error type
We can make the old logging statement works as is and gracefully upgrade
Enforce valid content with types

Option 3

Could it be possible to mix Option 1 and Option 2, meaning that we delegate file rotation and exporter to Zap but we enforce everything on top of it. This actually mean that Zab mostly deal with raw bytes or string? I don't know if its possibe, WDYT @urso?

Next steps

In all the option above we need to write new codes, either we write a new formatter or bring ECSLog to stable, or a mix. We have to keep in mind, moving to a structured format mean that we will have to deal with backward compatibility.

I surely miss pros and cons and I would like your opinion @urso @graphaelli @ruflin @andresrc @webmat

related to #15129, elastic/apm-server#3155

elasticmachine · 2020-01-14T17:55:09Z

Pinging @elastic/ingest (Project:fleet)

ph · 2020-01-14T18:02:07Z

For me, its the matter of effort we want to put upfront and the benefits we can get:

Option 1: is indeed a simple formatter but offer less control to the author, with the growing addition of modules and the move to the agent this doesn't give us a lot of finer control on what and on what form the data is send to Fleet. Is that OK? Maybe yes maybe no.

Option 2: This indeed gives us more work in the code we have to maintain, but this gives us freedom while enforcing type-safety. In bonus we get better contextual error reporting when implemented. And the machine vs human-readable format is a plus, look at the following example.

log.Errorf("Failed to access '%{file}'", ecs.File.Path("test.txt"))

Produces

{
  ...
  "file": {
    "path": "test.txt"
  },
  "log": {
    ...
  },
  "message": "Failed to access 'test.txt'"
}

I strongly suggest you look at the examples in the readme.

Both options 1 and 2 can be gracefully added, keeping old unstructured

ph · 2020-01-14T18:05:59Z

Note, I might be a bit biased, @michalpristas and myself did play with parts of the ECSlogs framework, we did sadly not use it to the full potential because we wanted to go back to what libbeat was using. We did find a few issues which are now fixed.

urso · 2020-01-15T17:41:39Z

Well, I'm biased towards option 2, but it definitely needs some more testing. It also adds serilog like format strings.

About file rotation: We might consider to use stdout/stderr more by default. On linux it is kind of the standard (systemd and containers prefer this). Well, not sure windows event log would be good target or if we should continue to prefer file rotation.

The rotation in ecslog always adds a timestamp to rotated files. This is important if you want to collect the logs using filebeat. Without adding timestamps (default in Beats) filebeat can block rotation to happen, because Windows doesn't allow log files to be rotated/deleted as long as filebeat is still reading them. Currently in Beats one needs to configure an rotation interval in order to have timestamps added to the rotated files. The zap project does not provide file rotation support. The file rotation we have is a custom implementation within libbeat. I think the file rotation implementation in ecslog also better separates triggers, rotation, layout and also support gzipping old files.

Ecslog also has better support for logging errors + supports wrapping/returning errors with an optional ECS based context. Finally ECS fields from error messages can be added to the log event.

Option 3 was what I originally experimented with. Have a meta-logger that uses other loggers (zap/zerolog/logrus/...) as backend. But I finally opted against this solution. Besides being able to format a log message, the core functionality that differentiates structured logging libraries is the context and how fields can be added. This can even impact encoding.

For example zap just appends fields (more like streaming new fields as events into the context). Nesting support is limited to: have a field with a custom object or create a nesting level at the end of the stream (one can not remove the nesting level anymore). Fields with same names do appear multiple times in a document, which is not supported by all JSON parsers. At worst events might not be indexible in ES, which was one of the main goals of ecslog. Elasticsearch fails to parse the JSON document if we try this: PUT /testdoc {"field": 1, "field": 2}. Although it is an artificial example, events like this are easily producible with zap.

Logrus on the otherhand uses a map for it's context that is copied and updated whenever fields are added (rather expensive in comparison).

ecslog uses a hybrid of zap and logrus. One adds fields like zap, but when logging we sort the fields and remove duplicates. The sorting can be cached for future log operations. ecslog contexts can be combined (they are a tree internall), without copying and iterating contexts over and over again.

ecslog also differentiates between standardized fields and custom fields, allowing the layout to handle adhoc custom fields by developers to be handled differently. For example always encode custom fields to an embedded string to not break mappings in Elasticsearch.

Finally having a custom context type requires a custom logger. Still using other loggers as backend would require context to be streamed to the backend logger, that needs to build a custom context to be thrown away right after each time we want to log.

webmat · 2020-01-17T14:47:08Z

The goal of the ECS logging libraries is twofold. As mentioned in Option 1, the MVP is to at least support 10-15 fields that can be populated transparently by a logging formatter. This MVP level lowers the bar to adoption of the ECS logging library in question. You just enable it, and the basics are done for you with very little effort. But there is also a desire to let developers to log any arbitrary ECS field as well, with these libraries.

So my understanding is that the mix of option 1 and 2 could actually yield a reuseable Golang logging library that hits both targets, and can be added to the list of ECS Logging libraries we offer customers here https://github.com/elastic/ecs-logging 😀🤞🏻?

cc @philkra

urso · 2020-01-21T13:31:07Z

This issue is not about the ecs-logging project, multiple library support or multiple language support. It is about ECS based logging in Beats in particular. With this in mind let's talk about what we want to expect from the logging support in Beats in the future. Some features/requirements coming to mind:

Structured logging is a must
Logs must follow ECS and must be indexible in ECS
a. Logger should never create events that can not be indexed or not parsed by Elasticsearch
b. Logger must fill in default ECS fields seamlessly
c. Logger or Library must guarantee type safety when logging ECS fields (ensure events are always indexible)
d. Besides standardized fields, logger must still support custom fields
- custom fields could be standardized as well (ECS + Beats logging schema)
- fully privte custom fields must be handled differently (e.g. embedd json object as string) to guanratee 'always indexible'
"Reasonable" performance
Multiple outputs support:

standard out/error (depending on level) for containers, systemd, journald
windows event logs
macOS OSLog
file rotation: (configure triggers, rotation with gzip, ...)
log event routing:
- e.g. send Error/Critical to stderr, other logs to stdout
- e.g. send all logs to rotated log file, but Error/Critical to windows events logs

Selector based log-level

all selectors start with default log level
log level can be changed in Beats config file
log level can be changed via HTTP API (Problem: how to define and discover existing selectors?)

Good support for testing:

create logger instance writing it's logs to testing.T
create logger instance writing logs to a byte buffer or an array of map[string]interface{}
=> no global logging

Good error support:

errors can have context
errors can wrap other errors.

ph · 2020-01-21T13:37:17Z

Log level and selectors changed dynamically for Agent. That would be useful for remote debugging. Imagine an inspect button in the UI that displays live log.

urso · 2020-03-02T18:38:30Z

Option 3

Could it be possible to mix Option 1 and Option 2, meaning that we delegate file rotation and exporter to Zap but we enforce everything on top of it. This actually mean that Zab mostly deal with raw bytes or string? I don't know if its possibe, WDYT @urso?

This functionality is not provided by zap. Only Fields + JSON encoder are provided by Zap. Almost everything else is custom in Beats.

elasticmachine · 2020-05-19T13:33:19Z

Pinging @elastic/ingest-management (Team:Ingest Management)

botelastic · 2021-09-14T15:58:14Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Mpdreamz · 2021-10-11T13:58:04Z

@ruflin Is this something we can still look into before 8.0?

Switching default logging formats post 8.0 becomes a trickier process.

elasticmachine · 2021-10-12T08:51:36Z

Pinging @elastic/agent (Team:Elastic-Agent)

ruflin · 2021-10-12T08:52:08Z

I added the Elastic Agent team label and ping @jlind23 on timing questions ;-)

jlind23 · 2021-10-12T11:39:48Z

Which option was the chosen one finally?

elasticmachine · 2021-10-12T11:40:05Z

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

Mpdreamz · 2021-10-12T14:34:28Z

For more context. Both Elasticsearch and Kibana will emit ECS logs OOTB with 8.0, APM server already does for a while. It would be a missed opportunity if we don't make the same move observability wide.

jlind23 · 2021-10-12T16:15:55Z

@Mpdreamz can we include this change in 8.0? Doing it for 7.16 is impossible as we are almost running out of time.

ruflin · 2021-10-13T07:57:11Z

@Mpdreamz What is apm-server using for the ecs logs? If it works well we should follow the same path in Beats.

Mpdreamz · 2021-10-14T09:36:17Z

I believe APM-server uses zap but is not yet on https://github.com/elastic/ecs-logging-go-zap and uses a custom formatter. We ultimately want to hop on ecs-logging-go-zap as well.

cc @simitt let me know if the above is not true :)

simitt · 2021-10-14T09:51:40Z

I added basic ECS logging support to the custom logp handling in libbeat some time ago #17974, and it is using ecs-logging-go-zap under the hood.

This can be enabled by setting logging.ecs:true, which apm-server already defaults to since a couple minor versions. It is up to every beat to do the same.
When enabled, It adds the ECS log information according to the MVP spec, so only minimal ECS information for now. But every beat can add additional information when adding a log line. For example, apm-server sets additional log information in the middleware. Every beat is still responsible to not add non-ecs compatible fields in its logs.

So, basically it should be as easy as turning on logging.ecs:true by default in every beat to get minimal ECS information added to the logs.

Mpdreamz · 2021-10-14T11:15:23Z

Thanks for the correction @simitt 👍

@jlind23 if this can go into 8.0 that would be a great outcome 👍

elasticmachine · 2021-10-14T11:23:02Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

Mpdreamz · 2021-10-14T11:23:12Z

Also just to confirm Elastic Agent uses whatever the default is in libbeat correct? So this would apply to Elastic Agent too?

ruflin · 2021-10-18T11:56:31Z

Elastic Agent itself is not a beat and not directly based on libbeat. @michalpristas could you comment the logging behaviour?

michel-laterman · 2021-11-04T20:31:11Z

Associated PRs:

#28573 - remove code for 8.0
elastic/observability-docs#1208 - remove setting from elastic-agent docs for 8.0
#28772 - deprecate settings for 7.16
elastic/observability-docs#1206 - mark settings as deprecated in elastic-agent docs for 7.16

ph added breaking change Project:fleet labels Jan 14, 2020

ph mentioned this issue Jan 14, 2020

[Agent] Refactor: Export our log using libbeat logging library.; #15129

Closed

andresrc added Team:Integrations Label for the Integrations team and removed Team:Beats labels Mar 6, 2020

simitt mentioned this issue Apr 23, 2020

Use ecszap formatter to add MVP support for ECS logging #17942

Closed

ph added the Team:Ingest Management label May 19, 2020

ph removed the Project:fleet label May 19, 2020

ph removed the Team:Ingest Management label Oct 14, 2020

botelastic bot added the Stalled label Sep 14, 2021

botelastic bot removed the Stalled label Oct 11, 2021

ruflin added the Team:Elastic-Agent Label for the Agent team label Oct 12, 2021

jlind23 added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Oct 12, 2021

jlind23 added the 7.16-candidate label Oct 12, 2021

jlind23 added 8.0-candidate and removed 7.16-candidate labels Oct 12, 2021

Mpdreamz mentioned this issue Oct 14, 2021

[Breaking change] Switch logging to ECS-JSON layout by default elastic/kibana#114968

Closed

andresrc added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Oct 14, 2021

Mpdreamz mentioned this issue Oct 14, 2021

Change Fleet Server over to use ECS logger to ensure logging format and ECS compliance elastic/fleet-server#371

Closed

michel-laterman self-assigned this Oct 20, 2021

michel-laterman mentioned this issue Oct 20, 2021

Force ECS and JSON logging for libbeat/logp #28573

Merged

4 tasks

jlind23 linked a pull request Oct 25, 2021 that will close this issue

Force ECS and JSON logging for libbeat/logp #28573

Merged

4 tasks

jlind23 added v8.0.0 and removed 8.0-candidate labels Nov 2, 2021

michel-laterman mentioned this issue Nov 2, 2021

Mark logging.ecs and json as deprecated #28772

Merged

2 tasks

michel-laterman closed this as completed in #28573 Nov 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Libbeat] ECS Logging #15544

[Libbeat] ECS Logging #15544

ph commented Jan 14, 2020 •

edited

Loading

elasticmachine commented Jan 14, 2020

ph commented Jan 14, 2020 •

edited

Loading

ph commented Jan 14, 2020

urso commented Jan 15, 2020 •

edited

Loading

webmat commented Jan 17, 2020

urso commented Jan 21, 2020 •

edited

Loading

ph commented Jan 21, 2020

urso commented Mar 2, 2020

Option 3

elasticmachine commented May 19, 2020

botelastic bot commented Sep 14, 2021

Mpdreamz commented Oct 11, 2021

elasticmachine commented Oct 12, 2021

ruflin commented Oct 12, 2021

jlind23 commented Oct 12, 2021

elasticmachine commented Oct 12, 2021

Mpdreamz commented Oct 12, 2021

jlind23 commented Oct 12, 2021

ruflin commented Oct 13, 2021

Mpdreamz commented Oct 14, 2021

simitt commented Oct 14, 2021

Mpdreamz commented Oct 14, 2021

elasticmachine commented Oct 14, 2021

Mpdreamz commented Oct 14, 2021

ruflin commented Oct 18, 2021

michel-laterman commented Nov 4, 2021

[Libbeat] ECS Logging #15544

[Libbeat] ECS Logging #15544

Comments

ph commented Jan 14, 2020 • edited Loading

Motivation

History of logging in beats

How we will migrate to ECS?

Option 1

Option 2

Option 3

Next steps

elasticmachine commented Jan 14, 2020

ph commented Jan 14, 2020 • edited Loading

ph commented Jan 14, 2020

urso commented Jan 15, 2020 • edited Loading

webmat commented Jan 17, 2020

urso commented Jan 21, 2020 • edited Loading

ph commented Jan 21, 2020

urso commented Mar 2, 2020

Option 3

elasticmachine commented May 19, 2020

botelastic bot commented Sep 14, 2021

Mpdreamz commented Oct 11, 2021

elasticmachine commented Oct 12, 2021

ruflin commented Oct 12, 2021

jlind23 commented Oct 12, 2021

elasticmachine commented Oct 12, 2021

Mpdreamz commented Oct 12, 2021

jlind23 commented Oct 12, 2021

ruflin commented Oct 13, 2021

Mpdreamz commented Oct 14, 2021

simitt commented Oct 14, 2021

Mpdreamz commented Oct 14, 2021

elasticmachine commented Oct 14, 2021

Mpdreamz commented Oct 14, 2021

ruflin commented Oct 18, 2021

michel-laterman commented Nov 4, 2021

ph commented Jan 14, 2020 •

edited

Loading

ph commented Jan 14, 2020 •

edited

Loading

urso commented Jan 15, 2020 •

edited

Loading

urso commented Jan 21, 2020 •

edited

Loading