filebeat stateful inputs - remove source name from context.ID #38603

Closed


andrewkroh
Member

@andrewkroh andrewkroh commented Mar 25, 2024

Proposed commit message

For stateful inputs the context.ID is modified by appending the source name (as returned by the input-cursor.Source#Name method). This creates an inconsistency in behavior between stateless and stateful inputs.

I found the behavior surprising because the ID is user-configurable, yet the user's configured value gets mutated (and only for some input types). So I think the clearest solution is to stop appending the source name.
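To make the inconsistency concrete, here is a minimal sketch of the two behaviors. The `Context` type, both helper functions, and the `::` separator are illustrative assumptions, not the actual Beats code.

```go
package main

import "fmt"

// Context is a hypothetical stand-in for the input context; only ID matters here.
type Context struct {
	ID string
}

// statefulContextID mimics the behavior described above: the source name is
// appended to the user-configured ID. The "::" separator is an assumption.
func statefulContextID(ctx Context, sourceName string) string {
	return ctx.ID + "::" + sourceName
}

// statelessContextID leaves the user-configured ID untouched.
func statelessContextID(ctx Context) string {
	return ctx.ID
}

func main() {
	ctx := Context{ID: "my-input"}
	fmt.Println(statefulContextID(ctx, "source-a")) // configured ID plus source name
	fmt.Println(statelessContextID(ctx))            // configured ID only
}
```

With this change, stateful inputs would behave like `statelessContextID` and report the ID exactly as configured.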

From analyzing each of the stateful inputs touched by this change, none of them uses the ID in a manner that will be affected. The value only appears to be used in log messages, metrics, or http tracer file names.

| Stateful Input                         | context.ID usages                   |
|----------------------------------------|-------------------------------------|
| filebeat/input/filestream              | log message                         |
| filebeat/input/journald                | not used                            |
| filebeat/input/winlog                  | not used                            |
| x-pack/filebeat/input/awss3            | metric ID                           |
| x-pack/filebeat/input/azureblobstorage | not used                            |
| x-pack/filebeat/input/cel              | metric ID and http tracer file name |
| x-pack/filebeat/input/gcs              | not used                            |
| x-pack/filebeat/input/httpjson         | metric ID and http tracer file name |
| x-pack/filebeat/input/o365audit        | not used                            |
| x-pack/filebeat/input/shipper          | not used                            |
| x-pack/filebeat/input/websocket        | metric ID                           |

The cursor ID that is written into the registry is constructed independently of the context.ID, so it should not be affected. See:

func (inp *managedInput) createSourceID(s Source) string {
    if inp.userID != "" {
        return fmt.Sprintf("%v::%v::%v", inp.manager.Type, inp.userID, s.Name())
    }
    return fmt.Sprintf("%v::%v", inp.manager.Type, s.Name())
}
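As a rough illustration of why the registry key is unaffected, the same logic can be written as a free function that never reads context.ID. The function name `sourceID` and the example values are hypothetical; `inputType` and `userID` stand in for `inp.manager.Type` and `inp.userID`.

```go
package main

import "fmt"

// sourceID mirrors the quoted createSourceID logic as a free function.
// It depends only on the input type, the configured user ID, and the
// source name, so changing context.ID cannot change its result.
func sourceID(inputType, userID, sourceName string) string {
	if userID != "" {
		return fmt.Sprintf("%v::%v::%v", inputType, userID, sourceName)
	}
	return fmt.Sprintf("%v::%v", inputType, sourceName)
}

func main() {
	fmt.Println(sourceID("httpjson", "my-input", "source-a")) // httpjson::my-input::source-a
	fmt.Println(sourceID("httpjson", "", "source-a"))         // httpjson::source-a
}
```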

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 25, 2024
@andrewkroh andrewkroh added the Filebeat Filebeat label Mar 25, 2024
Contributor

mergify bot commented Mar 25, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @andrewkroh? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Collaborator

elasticmachine commented Mar 25, 2024

💚 Build Succeeded


Build stats

  • Start Time: 2024-03-27T15:46:54.134+0000

  • Duration: 134 min 46 sec

Test stats 🧪

Test Results
Failed 0
Passed 8425
Skipped 773
Total 9198

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@andrewkroh andrewkroh force-pushed the bugfix/fb/stateful-input-context-id branch from d0e12a6 to a22037a Compare March 26, 2024 18:35
@andrewkroh andrewkroh added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Mar 26, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 26, 2024
For stateful inputs the context.ID is modified by appending the source name
(as returned by the input-cursor.Source#Name method). This creates an
inconsistency in behavior between stateless and stateful inputs.

I found the behavior surprising because the ID is user-configurable, yet
the user's configured value gets mutated (and only for some input types). So I think
the clearest solution is to stop appending the source name.

From analyzing each of the stateful inputs touched by this change,
none of them uses the ID in a manner that will be affected. The value only appears
to be used in log messages, metrics, or http tracer file names.

| Stateful Input                         | context.ID usages                   |
|----------------------------------------|-------------------------------------|
| filebeat/input/filestream              | log message                         |
| filebeat/input/journald                | not used                            |
| filebeat/input/winlog                  | not used                            |
| x-pack/filebeat/input/awss3            | metric ID                           |
| x-pack/filebeat/input/azureblobstorage | not used                            |
| x-pack/filebeat/input/cel              | metric ID and http tracer file name |
| x-pack/filebeat/input/gcs              | not used                            |
| x-pack/filebeat/input/httpjson         | metric ID and http tracer file name |
| x-pack/filebeat/input/o365audit        | not used                            |
| x-pack/filebeat/input/shipper          | not used                            |
| x-pack/filebeat/input/websocket        | metric ID                           |

The cursor ID that is written into the registry is constructed independently of the
context.ID, so it should not be affected. See
https://github.com/elastic/beats/blob/78dc6649400610e9f908af7c90f35819f061e4e0/filebeat/input/v2/input-cursor/input.go#L171-L176
Always call the context.CancelFunc function to stop the running input in the case
of a test failure. This ensures that any resources associated with the input
are freed.
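A minimal sketch of that pattern, assuming a hypothetical `fakeInput` that blocks until its context is cancelled: the cancel function is deferred, so it runs even when the scenario exits early on a failure.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fakeInput is a hypothetical stand-in for a running input. It blocks until
// its context is cancelled and then signals that it has stopped.
type fakeInput struct {
	stopped chan struct{}
}

func (in *fakeInput) Run(ctx context.Context) {
	<-ctx.Done()
	close(in.stopped)
}

// runScenario starts the input, optionally bails out early to simulate a test
// failure, and reports whether the input was stopped either way.
func runScenario(fail bool) bool {
	in := &fakeInput{stopped: make(chan struct{})}
	func() {
		ctx, cancel := context.WithCancel(context.Background())
		defer cancel() // always runs, including on the early return below
		go in.Run(ctx)
		if fail {
			return // simulated early exit after a failed assertion
		}
		// Assertions against the running input would go here.
	}()
	select {
	case <-in.stopped:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("stopped after failure:", runScenario(true))
	fmt.Println("stopped after success:", runScenario(false))
}
```

In a real test, registering the cancel function with `t.Cleanup(cancel)` achieves the same effect and also covers exits via `t.Fatal`.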
input_integration_test.go:324:16: Error return value of `writer.Write` is not checked (errcheck)
                        writer.Write(line)
                                    ^
input_integration_test.go:1036:11: Error return value of `f.Write` is not checked (errcheck)
                        f.Write([]byte(fmt.Sprintf("hello world %d\n", r*iterations+n)))
@andrewkroh andrewkroh force-pushed the bugfix/fb/stateful-input-context-id branch from 287c1d8 to ed68e73 Compare March 27, 2024 15:46
@andrewkroh andrewkroh marked this pull request as ready for review March 28, 2024 16:21
@andrewkroh andrewkroh requested a review from a team as a code owner March 28, 2024 16:21
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

Contributor

@belimawr belimawr left a comment


At least for Filestream, this will affect our ability to debug issues. There are some places where the context.ID gets logged alongside errors, and the ID lets us identify the affected files instead of just seeing a generic "Filestream".

Some examples:

if err := hg.tg.Go(startHarvester(ctx, hg, src, false, hg.metrics)); err != nil {
    ctx.Logger.Warnf(
        "tried to start harvester with task group already closed",
        ctx.ID)
}

if err != nil {
    ctx.Logger.Warnf(
        "input %s tried to Continue harvester with task group already closed",
        ctx.ID)
}

if err := hg.tg.Go(startHarvester(ctx, hg, src, true, hg.metrics)); err != nil {
    ctx.Logger.Warnf(
        "input %s tried to restart harvester with task group already closed",
        ctx.ID)
}

The inputs that use this for metric names, such as the TCP input

metrics := netmetrics.NewTCP("tcp", ctx.ID, s.config.Host, pollInterval, log)

worry me, because registering metrics with the same ID will cause Filebeat to panic.

I agree standardisation is good, but in this case I am really concerned about the side effects.

I looked at the linked issue, and just changing how input IDs are created does not seem like a robust solution to me. Therefore I'm not quite sure of the benefits of this change.
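The duplicate-ID concern can be sketched with a toy registry. Everything here is hypothetical (`registry`, `register`, `tryRegister`, the `::` separator) and is not the Beats monitoring API; it only illustrates how two sources sharing a bare ID could collide where the suffixed IDs would not. Whether a given input would actually collide depends on how it builds its metric ID.

```go
package main

import "fmt"

// registry is a hypothetical metrics registry that panics when the same ID
// is registered twice, modelling the failure mode described above.
type registry struct {
	ids map[string]struct{}
}

func newRegistry() *registry {
	return &registry{ids: map[string]struct{}{}}
}

func (r *registry) register(id string) {
	if _, dup := r.ids[id]; dup {
		panic(fmt.Sprintf("metrics ID %q already registered", id))
	}
	r.ids[id] = struct{}{}
}

// tryRegister converts the panic into an error so the demo can report it
// instead of crashing.
func tryRegister(r *registry, id string) (err error) {
	defer func() {
		if rec := recover(); rec != nil {
			err = fmt.Errorf("%v", rec)
		}
	}()
	r.register(id)
	return nil
}

func main() {
	// With the source name appended, two sources under one configured ID
	// register under distinct keys.
	suffixed := newRegistry()
	fmt.Println(tryRegister(suffixed, "my-input::source-a"))
	fmt.Println(tryRegister(suffixed, "my-input::source-b"))

	// With the bare configured ID, the second registration collides.
	bare := newRegistry()
	fmt.Println(tryRegister(bare, "my-input"))
	fmt.Println(tryRegister(bare, "my-input"))
}
```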

@rdner rdner removed their request for review April 5, 2024 20:38
@v1v v1v added the backport-8.x Automated backport to the 8.x branch with mergify label Sep 11, 2024
@andrewkroh
Member Author

The goal of this PR has now been accomplished in #40909 through a change that does not affect the existing ID value. Closing.

@andrewkroh andrewkroh closed this Sep 30, 2024
Labels
backport-8.x Automated backport to the 8.x branch with mergify Filebeat Filebeat Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

4 participants