[ES output] Correctly log event fields in events log file #40512

Merged
belimawr merged 1 commit into elastic:main from belimawr:decode-event-when-loggin-on-es on Aug 14, 2024

Conversation

@belimawr (Contributor) commented Aug 13, 2024

Proposed commit message

The Elasticsearch output, when faced with ingestion errors, was logging the raw publisher.Event that had already been encoded, hence no event fields were present in the logs.

This commit fixes it by adding a String method to the encodedEvent type and using the encodedEvent in the logs instead of the publisher.Event.

Closes: #40509
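
For context, a minimal, hypothetical sketch of the idea; the encodedEvent field name below is an assumption for illustration, not the actual client.go code:

package main

import "fmt"

// Sketch only: this encodedEvent and its field are assumptions for
// illustration, not the actual type in libbeat/outputs/elasticsearch.
type encodedEvent struct {
	encoding []byte // JSON-serialized event body, already encoded for the bulk request
}

// String lets log statements print the event fields from the already-encoded
// body instead of the raw (already encoded and drained) publisher.Event.
func (e *encodedEvent) String() string {
	if e == nil {
		return "<nil>"
	}
	return string(e.encoding)
}

func main() {
	ev := &encodedEvent{encoding: []byte(`{"message":"index failure","int":"not a number"}`)}
	// %s picks up the String method via fmt.Stringer.
	fmt.Printf("Cannot index event '%s', dropping event!\n", ev)
}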

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

None, it fixes a bug.


How to test this PR locally

Start Filebeat with the following configuration:

filebeat.inputs:
  - type: filestream
    id: filestream-input-id
    enabled: true
    parsers:
      - ndjson:
          target: ""
          overwrite_keys: true
          expand_keys: true
          add_error_key: true
          ignore_decoding_error: false
    paths:
      - /tmp/flog.log

output:
  elasticsearch:
    hosts:
      - localhost:9200
    protocol: https
    username: elastic
    password: changeme
    allow_older_versions: true
    ssl.verification_mode: none

logging:
  level: debug
  event_data:
    files:
      name: filebeat-events-data # that's the default, change it if you want another name.

Create the log file /tmp/flog.log with the following content:

{"message":"foo bar","int":10,"string":"str"}
{"message":"another message","int":20,"string":"str2"}
{"message":"index failure","int":"not a number","string":10}
{"message":"second index failure","int":"not a number","string":10}
A broken JSON

Look for the event log file (logs/filebeat-events-data*.ndjson) and ensure the logged events contain their fields. A small helper sketch for checking this programmatically follows the example below.

Here is an example of what the content of the event log file should look like:

{
  "log.level": "warn",
  "@timestamp": "2024-08-13T16:42:09.008-0400",
  "log.logger": "elasticsearch",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/libbeat/outputs/elasticsearch.(*Client).applyItemStatus",
    "file.name": "elasticsearch/client.go",
    "file.line": 490
  },
  "message": "Cannot index event '{\"@timestamp\":\"2024-08-13T20:42:05.928Z\",\"host\":{\"name\":\"millennium-falcon\"},\"agent\":{\"version\":\"8.16.0\",\"ephemeral_id\":\"6d195bff-27a4-40c4-9b3e-c3ecb068f06e\",\"id\":\"6a760df8-a3e6-4369-886a-3f499c792302\",\"name\":\"millennium-falcon\",\"type\":\"filebeat\"},\"log\":{\"file\":{\"device_id\":\"40\",\"inode\":\"51817\",\"path\":\"/tmp/flog.log\"},\"offset\":101},\"string\":10,\"message\":\"index failure\",\"int\":\"not a number\",\"input\":{\"type\":\"filestream\"},\"ecs\":{\"version\":\"8.0.0\"}}\n' (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:380] failed to parse field [int] of type [long] in document with id 'iVl6TZEBA82tHj8dCPpP'. Preview of field's value: 'not a number'\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"For input string: \\\"not a number\\\"\"}}, dropping event!",
  "service.name": "filebeat",
  "log.type": "event",
  "ecs.version": "1.6.0"
}
{
  "log.level": "warn",
  "@timestamp": "2024-08-13T16:42:09.009-0400",
  "log.logger": "elasticsearch",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/libbeat/outputs/elasticsearch.(*Client).applyItemStatus",
    "file.name": "elasticsearch/client.go",
    "file.line": 490
  },
  "message": "Cannot index event '{\"@timestamp\":\"2024-08-13T20:42:05.928Z\",\"host\":{\"name\":\"millennium-falcon\"},\"agent\":{\"name\":\"millennium-falcon\",\"type\":\"filebeat\",\"version\":\"8.16.0\",\"ephemeral_id\":\"6d195bff-27a4-40c4-9b3e-c3ecb068f06e\",\"id\":\"6a760df8-a3e6-4369-886a-3f499c792302\"},\"ecs\":{\"version\":\"8.0.0\"},\"log\":{\"offset\":162,\"file\":{\"path\":\"/tmp/flog.log\",\"device_id\":\"40\",\"inode\":\"51817\"}},\"message\":\"second index failure\",\"int\":\"not a number\",\"string\":10,\"input\":{\"type\":\"filestream\"}}\n' (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:401] failed to parse field [int] of type [long] in document with id 'ill6TZEBA82tHj8dCPpP'. Preview of field's value: 'not a number'\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"For input string: \\\"not a number\\\"\"}}, dropping event!",
  "service.name": "filebeat",
  "log.type": "event",
  "ecs.version": "1.6.0"
}
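
If you prefer to inspect the event log programmatically, here is a small, hypothetical Go helper that scans the ndjson file and prints each entry's message field; the path pattern assumes the configuration above and is not part of this PR:

package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log"
	"os"
	"path/filepath"
)

func main() {
	// The pattern matches the logging.event_data config above; adjust it
	// if you changed the file name.
	matches, err := filepath.Glob("logs/filebeat-events-data*.ndjson")
	if err != nil || len(matches) == 0 {
		log.Fatalf("no event log file found: %v", err)
	}

	f, err := os.Open(matches[0])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // logged events can be long
	for scanner.Scan() {
		var entry map[string]any
		if err := json.Unmarshal(scanner.Bytes(), &entry); err != nil {
			continue // skip lines that are not valid JSON
		}
		// With this fix, each "message" should contain the full JSON of the rejected event.
		fmt.Println(entry["message"])
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}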


@belimawr belimawr requested a review from a team as a code owner August 13, 2024 20:43
@botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) on Aug 13, 2024
@belimawr added the Team:Elastic-Agent-Data-Plane label (Label for the Agent Data Plane team) on Aug 13, 2024
@elasticmachine (Collaborator) commented:

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) on Aug 13, 2024
mergify bot (Contributor) commented Aug 13, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @belimawr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@belimawr force-pushed the decode-event-when-loggin-on-es branch from befcdd4 to 3992b2c on August 13, 2024 20:45
@cmacknz added the backport-8.15 label (Automated backport to the 8.15 branch with mergify) on Aug 13, 2024
@belimawr belimawr changed the title [ES output] Log event fields in events log file [ES output] Correctly log event fields in events log file Aug 13, 2024
@AndersonQ AndersonQ self-requested a review August 14, 2024 07:31
mergify bot (Contributor) commented Aug 14, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b decode-event-when-loggin-on-es upstream/decode-event-when-loggin-on-es
git merge upstream/main
git push upstream decode-event-when-loggin-on-es

The Elasticsearch output, when faced with ingestion errors was logging
the raw publisher.Event that had already been encoded, hence no event
fields were present in the logs.

This commit fixes it by adding a String method to the encodedEvent
type and using the encodedEvent in the logs instead of the
publisher.Event.
@belimawr force-pushed the decode-event-when-loggin-on-es branch from 4f44670 to ff5c4f8 on August 14, 2024 16:49
@cmacknz (Member) commented Aug 14, 2024

Do we know why the existing event log tests failed to catch this, since they were intended to?

@belimawr (Contributor, Author) commented:
> Do we know why the existing event log tests failed to catch this, since they were intended to?

As you'd suggested, the tests were looking for a substring that is also present in the ES response; it was just the value of the field. The new tests look for a piece of the JSON encoding of the fields, which at the time of testing was not part of the ES response, so it has to come from the event being logged.
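
Roughly, the difference between the old and the new assertion can be sketched like this (hypothetical code, not the actual test in the repository):

package main

import (
	"fmt"
	"strings"
)

// message stands in for the decoded "message" field of one entry from
// logs/filebeat-events-data*.ndjson.
func eventFieldsWereLogged(message string) bool {
	// Old check (roughly): look for the bare field value, e.g. "not a number".
	// The ES error response echoes that value back, so the check could pass
	// even when the event body itself was missing from the log line.
	//
	// New check (roughly): look for a fragment of the JSON encoding of the
	// event fields; at test time that fragment is not part of the ES
	// response, so it can only come from the logged event.
	return strings.Contains(message, `"int":"not a number"`)
}

func main() {
	withEvent := `Cannot index event '{"message":"index failure","int":"not a number"}' (status=400): ... Preview of field's value: 'not a number' ..., dropping event!`
	withoutEvent := `Cannot index event (status=400): ... Preview of field's value: 'not a number' ..., dropping event!`
	fmt.Println(eventFieldsWereLogged(withEvent))    // true
	fmt.Println(eventFieldsWereLogged(withoutEvent)) // false
}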

@belimawr belimawr merged commit e7732c6 into elastic:main Aug 14, 2024
123 checks passed
mergify bot pushed a commit that referenced this pull request Aug 14, 2024
The Elasticsearch output, when faced with ingestion errors was logging
the raw publisher.Event that had already been encoded, hence no event
fields were present in the logs.

This commit fixes it by adding a String method to the encodedEvent
type and using the encodedEvent in the logs instead of the
publisher.Event.

(cherry picked from commit e7732c6)

# Conflicts:
#	libbeat/outputs/elasticsearch/client.go
pierrehilbert pushed a commit that referenced this pull request Aug 19, 2024
…nts log file (#40531)

* [ES output] Log event fields in events log file (#40512)

The Elasticsearch output, when faced with ingestion errors was logging
the raw publisher.Event that had already been encoded, hence no event
fields were present in the logs.

This commit fixes it by adding a String method to the encodedEvent
type and using the encodedEvent in the logs instead of the
publisher.Event.

(cherry picked from commit e7732c6)

# Conflicts:
#	libbeat/outputs/elasticsearch/client.go

* Fix merge conflicts

---------

Co-authored-by: Tiago Queiroz <[email protected]>
Labels: backport-8.15 (Automated backport to the 8.15 branch with mergify), Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team)
Successfully merging this pull request may close the following issue: Early event encoding obscures raw event content in the event log