[Filebeat] cel input - non HTTP 200 status_code causes rapid retry #33999

Closed
andrewkroh opened this issue Dec 8, 2022 · 2 comments · Fixed by #34002


andrewkroh commented Dec 8, 2022

When the CEL program returns a state.status_code other than HTTP 200, the input repeats the request rapidly. This happens for status codes that go-retryablehttp does not handle, such as those in [201, 499] except 429. If I omit status_code, the input works as expected: it makes a request, reports the error, and waits for the next interval. Perhaps when a non-200 status_code is reported, the input could auto-generate an error.message that includes the status code and then wait for the next interval?

In 6.231 seconds the input made 10893 requests. That's ~1750 requests per second. 😨

% grep -h "HTTP request" cel.trace | wc -l
   10893
% head -1 cel.trace| jq '."@timestamp"'
"2022-12-08T10:38:10.445-0500"
% tail -1 cel.trace| jq '."@timestamp"'
"2022-12-08T10:38:16.676-0500"
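
For anyone who wants to double-check the rate, here is a minimal sketch that redoes the arithmetic from the two trace timestamps and the request count above. The helper name `requestRate` and the layout constant are illustrative only and are not part of Filebeat; the layout string is an assumption based on the timestamp format shown in the trace.

```go
package main

import (
	"fmt"
	"time"
)

// traceLayout is an assumed Go time layout matching the @timestamp
// values printed by jq above (millisecond precision, numeric zone).
const traceLayout = "2006-01-02T15:04:05.000-0700"

// requestRate returns the average number of requests per second made
// between the first and last trace timestamps.
func requestRate(first, last string, requests int) (float64, error) {
	start, err := time.Parse(traceLayout, first)
	if err != nil {
		return 0, err
	}
	end, err := time.Parse(traceLayout, last)
	if err != nil {
		return 0, err
	}
	elapsed := end.Sub(start).Seconds()
	if elapsed <= 0 {
		return 0, fmt.Errorf("non-positive elapsed time: %v", end.Sub(start))
	}
	return float64(requests) / elapsed, nil
}

func main() {
	rate, err := requestRate(
		"2022-12-08T10:38:10.445-0500",
		"2022-12-08T10:38:16.676-0500",
		10893,
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.0f requests/second\n", rate)
}
```

With the values from the trace this works out to roughly 1,750 requests per second over the 6.231 second window.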

Version: filebeat version 8.7.0 (arm64), libbeat 8.7.0 [f08eed8 built 2022-12-08 15:32:00 +0000 UTC]

Reproduction

package main

import (
	"log"
	"net/http"
)

func main() {
	// Reply to every request with 403 Forbidden so the input always
	// sees a non-200 status code.
	handler := func(writer http.ResponseWriter, request *http.Request) {
		http.Error(writer, "Go away", http.StatusForbidden)
	}
	log.Fatal(http.ListenAndServe("localhost:9888", http.HandlerFunc(handler)))
}
---

filebeat.inputs:
  - type: cel
    interval: 1m

    resource:
      url: http://localhost:9888
      retry:
        min_wait: 10s
        max_wait: 120s
      tracer:
        filename: cel.trace

    program: |
      get('http://localhost:9888').as(resp, bytes(resp.Body).as(body, {
          "events": (resp.StatusCode == 200) ? try(body.decode_json()) : {"error": {"message": "unexpected status code " + string(resp.StatusCode)} },
          "status_code": resp.StatusCode,
          "want_more": false,
      }))

output.console.pretty: true
@botelastic added the needs_team label Dec 8, 2022
@andrewkroh added the Team:Security-External Integrations label and removed the needs_team label Dec 8, 2022
@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@efd6 self-assigned this Dec 8, 2022

efd6 commented Dec 8, 2022

Proposed fix.

diff --git a/x-pack/filebeat/input/cel/input.go b/x-pack/filebeat/input/cel/input.go
index c1a1f1a504..0f77a177b1 100644
--- a/x-pack/filebeat/input/cel/input.go
+++ b/x-pack/filebeat/input/cel/input.go
@@ -529,10 +529,14 @@ func handleResponse(log *logp.Logger, state map[string]interface{}, limiter *rat
                                        waitUntil = t
                                }
                        }
-                       fallthrough
-               default:
-                       delete(state, "events")
                        return false, waitUntil, nil
+               default:
+                       status := http.StatusText(statusCode)
+                       if status == "" {
+                               status = "unknown status code"
+                       }
+                       state["events"] = map[string]interface{}{"error.message": fmt.Sprintf("failed http request with %s: %d", status, statusCode)}
+                       return true, time.Time{}, nil
                }
        }
        return true, waitUntil, nil
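
To see what message the new default branch would produce, here is a standalone sketch of just that part of the diff. The function name `errorEvent` is hypothetical and only wraps the lines from the proposed change; it is not the actual `handleResponse` code, which also deals with rate limiting and the 429 case.

```go
package main

import (
	"fmt"
	"net/http"
)

// errorEvent is a hypothetical helper that isolates the default branch
// of the proposed diff: it builds the error event for a status code
// that the input does not otherwise handle.
func errorEvent(statusCode int) map[string]interface{} {
	status := http.StatusText(statusCode)
	if status == "" {
		// http.StatusText returns "" for codes it does not know.
		status = "unknown status code"
	}
	return map[string]interface{}{
		"error.message": fmt.Sprintf("failed http request with %s: %d", status, statusCode),
	}
}

func main() {
	// 403 is what the reproduction server returns; 299 is an example
	// of a code with no registered status text.
	for _, code := range []int{http.StatusForbidden, 299} {
		fmt.Println(errorEvent(code)["error.message"])
	}
}
```

For the reproduction server above, the resulting message would be "failed http request with Forbidden: 403".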
