Update filestream reader offset when line is skipped #23417

Merged

@kvch kvch commented Jan 11, 2021

What does this PR do?

This PR adds two previously missing offset updates to the filestream reader when a line is skipped.

Why is it important?

The offset could be incorrect if Filebeat skips a line for one of the following reasons:

  1. The line is unparsable
  2. The line should not be published because of the user configuration in exclude_lines or include_lines

If the offset is not updated in the reader, the state information of events published afterwards becomes incorrect. This might lead to duplicated events if Filebeat is restarted.
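
To make the failure mode concrete, here is a minimal Python sketch of a line reader that tracks its own offset. It is not the Filebeat implementation (which is written in Go); the function `publish`, the flag `update_on_skip`, and the in-memory `LOG` buffer are hypothetical names introduced only for illustration. With `update_on_skip=False` the offset advances only for published lines, which is the assumed shape of the bug described above.

```python
import re

# Hypothetical stand-in for exclude_lines: ['^DONOTPUBLISH']
EXCLUDE = re.compile(rb"^DONOTPUBLISH")


def publish(data: bytes, start: int, update_on_skip: bool):
    """Read lines from data[start:] and return (events, final_offset).

    Each event is (message, offset_after_line). When update_on_skip is
    False, skipped lines do not advance the offset (the assumed bug).
    """
    offset = start
    events = []
    for line in data[start:].splitlines(keepends=True):
        if EXCLUDE.match(line):
            if update_on_skip:
                offset += len(line)  # the fix: count skipped bytes too
            continue
        offset += len(line)
        events.append((line.rstrip(b"\n").decode(), offset))
    return events, offset


LOG = b"line 1\nDONOTPUBLISH line2\nline 3\nDONOTPUBLISH line4\nline 5\n"

# First run: both variants publish the same lines, but store different offsets.
_, fixed_offset = publish(LOG, 0, update_on_skip=True)
_, buggy_offset = publish(LOG, 0, update_on_skip=False)

# Simulated restart: a new line is appended and reading resumes
# from the stored offset.
grown = LOG + b"line 6\n"
resumed_fixed = [m for m, _ in publish(grown, fixed_offset, True)[0]]
resumed_buggy = [m for m, _ in publish(grown, buggy_offset, False)[0]]

print("fixed:", fixed_offset, resumed_fixed)
print("buggy:", buggy_offset, resumed_buggy)
```

In this sketch the fixed variant resumes at the end of the previously read data and publishes only the new line, while the buggy variant resumes from a stale offset in the middle of the file and publishes earlier lines again.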

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding changes to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Start Filebeat with the following configuration:
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - test.log
  exclude_lines: ['^DONOTPUBLISH']

output.elasticsearch:
  enabled: true
  hosts: ["localhost:9200"]

Use the following contents for the input file (test.log):

line 1
DONOTPUBLISH line2
line 3
DONOTPUBLISH line4
line 5

  2. Stop Filebeat
  3. Add new lines to the input file which will be published
  4. Start Filebeat

Validate that Filebeat does not send duplicate messages.
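
One way to do this validation is to collect the published events from the output and look for repeats. The sketch below assumes the events have already been fetched and reduced to (message, offset) pairs; the helper `find_duplicates` and the sample offsets are hypothetical, and how the events are retrieved from Elasticsearch is left out on purpose.

```python
from collections import Counter


def find_duplicates(events):
    """Return the sorted list of (message, offset) pairs seen more than once."""
    counts = Counter(events)
    return sorted(event for event, seen in counts.items() if seen > 1)


# Hypothetical sample: "line 3" was delivered twice after a restart.
sample = [("line 1", 0), ("line 3", 26), ("line 5", 52), ("line 3", 26)]
duplicates = find_duplicates(sample)
print(duplicates)
```

Pairing the message with an offset avoids flagging two different lines that happen to have identical text. An empty result means no duplicates were found in the collected events.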

@kvch kvch added the bug and Filebeat labels Jan 11, 2021
@kvch kvch requested a review from urso January 11, 2021 14:04
@botelastic botelastic bot added the needs_team label Jan 11, 2021
@kvch kvch added the Team:Elastic-Agent label Jan 11, 2021
@elasticmachine
Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot removed the needs_team label Jan 11, 2021
elasticmachine commented Jan 11, 2021

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #23417 updated
  • Start Time: 2021-01-21T16:31:30.902+0000
  • Duration: 52 min 42 sec
  • Commit: 7176c36

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 5135
  • Skipped: 574
  • Total: 5709

💚 Flaky test report

Tests succeeded.

@kvch kvch added the needs_backport label Jan 11, 2021
@kvch kvch force-pushed the fix-filebeat-update-filestream-offset-when-needed branch from fc8d50f to 7176c36 Compare January 21, 2021 16:30
@kvch kvch merged commit e5cd64f into elastic:master Jan 21, 2021
@kvch kvch added the v7.12.0 label and removed the needs_backport label Jan 21, 2021
kvch added a commit to kvch/beats that referenced this pull request Jan 21, 2021
This PR adds two previously missing offset updates to the `filestream` reader when a line is skipped.

The offset could be incorrect if Filebeat skips a line that should not be published because of the user configuration in `exclude_lines` or `include_lines`.

If the offset is not updated in the reader, the state information of events published afterwards becomes incorrect. This might lead to duplicated events if Filebeat is restarted.

(cherry picked from commit e5cd64f)
kvch added a commit that referenced this pull request Jan 21, 2021
Labels: bug, Filebeat, Team:Elastic-Agent, v7.12.0