[pkg/ottl] ParseSeverity function #35079

Open
djaglowski opened this issue Sep 9, 2024 · 7 comments
Labels
enhancement New feature or request pkg/ottl priority:p2 Medium

Comments

@djaglowski
Member

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

OTTL can directly set severity number and text, but it would be nice if there was an optimized function for interpreting values. Stanza currently has a dedicated severity interpreter which has a couple advantages over using OTTL:

  1. Configuration is simple. The user can specify a mapping of values, ranges, or pre-defined categories, along with the level to which these values should be interpreted.
  2. The performance cost is paid almost entirely at startup, by building a full mapping which can then be used for instant interpretation of any value.

Describe the solution you'd like

The primary challenge with OTTL is that its functional nature may not lend itself well to specifying mappings. I'm not sure if there is a way to do this today, but I'd like users to be able to specify a function that contains an arbitrary number of mapping options. e.g. ParseSeverity(attributes["sev"], AsError("err", "error", "NOOO"), AsInfo("info", "hey"), AsInfoRange(1, 100), ...)
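To illustrate the shape of the proposal, here is a minimal Python sketch of how option-style arguments could be folded into one lookup table when the function is constructed. This is not OTTL and not an existing API: `as_error`, `as_info`, `as_info_range`, and `parse_severity` are hypothetical names mirroring the example above, and the inclusive range bounds and "first option wins" rule are assumptions the issue does not specify.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

# One option is an iterable of (raw value, severity name) pairs.
Option = Iterable[Tuple[Hashable, str]]


def as_error(*values: Hashable) -> Option:
    """Hypothetical counterpart of AsError(...): map each value to "error"."""
    return [(value, "error") for value in values]


def as_info(*values: Hashable) -> Option:
    """Hypothetical counterpart of AsInfo(...): map each value to "info"."""
    return [(value, "info") for value in values]


def as_info_range(low: int, high: int) -> Option:
    """Hypothetical counterpart of AsInfoRange(...); bounds assumed inclusive."""
    return [(number, "info") for number in range(low, high + 1)]


def parse_severity(*options: Option) -> Callable[[Hashable], Optional[str]]:
    """Build the full table once, then return a function doing one dict lookup."""
    table: Dict[Hashable, str] = {}
    for option in options:
        for value, severity in option:
            # Assumption: if two options claim the same value, the first wins.
            table.setdefault(value, severity)

    def lookup(value: Hashable) -> Optional[str]:
        return table.get(value)

    return lookup
```

Calling `parse_severity(as_error("err", "error", "NOOO"), as_info("info", "hey"), as_info_range(1, 100))` once at startup would then give a function that answers each record with a single lookup, returning `None` for values no option covers.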

Describe alternatives you've considered

No response

Additional context

No response

@djaglowski djaglowski added enhancement New feature or request needs triage New item requiring triage labels Sep 9, 2024
Contributor

github-actions bot commented Sep 9, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth
Member

@djaglowski we support map literals now, will that help? The grammar supports:

ParseSeverity(attributes["sev"], {"err": ["error", "NOOO"], "info": ["hey"]})

As a reference for discussion, the existing solution would be multiple statements with conditions:

set(severity_number, SEVERITY_NUMBER_ERROR) where attributes["sev"] == "error" or attributes["sev"] == "NOOO"
set(severity_number, SEVERITY_NUMBER_INFO) where attributes["sev"] == "hey" or (attributes["status code"] >= 1 and attributes["status code"] < 100) 

Any additional function will need to be simpler than those statements.

@TylerHelmuth
Member

It is possible that the solution doesn't need to be an additional function and could instead be an additional section in the transformprocessor config.

@djaglowski
Member Author

djaglowski commented Sep 10, 2024

we support map literals now, will that help?

I don't think this is sufficient for ranges, which are a common use case with severity interpretation.

Any additional function will need to be simpler than those statements.

I don't think that's difficult, but performance should also be a consideration here.


I'll give another example of statements which could be much clearer and more performant with a better solution.

In this case, HTTP status codes are interpreted into severity number and text:

set(severity_number, 5) where (IsMatch(body["status"], "^1[0-9]{2}$") and true)
set(severity_text, "debug") where (IsMatch(body["status"], "^1[0-9]{2}$") and true)
set(severity_number, 9) where (IsMatch(body["status"], "^2[0-9]{2}$") and true)
set(severity_text, "info") where (IsMatch(body["status"], "^2[0-9]{2}$") and true)
set(severity_number, 9) where (IsMatch(body["status"], "^3[0-9]{2}$") and true)
set(severity_text, "info") where (IsMatch(body["status"], "^3[0-9]{2}$") and true)
set(severity_number, 13) where (IsMatch(body["status"], "^4[0-9]{2}$") and true)
set(severity_text, "warn") where (IsMatch(body["status"], "^4[0-9]{2}$") and true)
set(severity_number, 17) where (IsMatch(body["status"], "^5[0-9]{2}$") and true)
set(severity_text, "error") where (IsMatch(body["status"], "^5[0-9]{2}$") and true)

This is both difficult to understand and not very performant since each statement must execute against each log record, and because the statements themselves are non-trivial to evaluate.
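As a rough illustration of that cost, the following Python sketch models the ten statements as a rule list and counts how many pattern matches a single record triggers. It is a simplified model of the statement list only, not of the OTTL engine: the `record` dictionary layout and the `apply_statements` helper are assumptions made for this example.

```python
import re

# (pattern, severity_number, severity_text) taken from the statements above.
RULES = [
    (r"^1[0-9]{2}$", 5, "debug"),
    (r"^2[0-9]{2}$", 9, "info"),
    (r"^3[0-9]{2}$", 9, "info"),
    (r"^4[0-9]{2}$", 13, "warn"),
    (r"^5[0-9]{2}$", 17, "error"),
]


def apply_statements(record: dict) -> int:
    """Run every statement against one record; return the number of matches evaluated."""
    status = str(record["body"]["status"])
    evaluations = 0
    for pattern, number, text in RULES:
        # Models: set(severity_number, N) where IsMatch(body["status"], pattern)
        evaluations += 1
        if re.search(pattern, status):
            record["severity_number"] = number
        # Models: set(severity_text, T) where IsMatch(body["status"], pattern)
        evaluations += 1
        if re.search(pattern, status):
            record["severity_text"] = text
    return evaluations
```

In this model every record pays for all ten condition checks, even though at most two of them can ever succeed for a given status.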

In contrast, the stanza configuration is far simpler to understand:

severity:
  parse_from: body["status"]
  mapping:
    debug:
      - min: 100
        max: 199
    info:
      - min: 200
        max: 299
      - min: 300
        max: 399
    warn:
      - min: 400
        max: 499
    error:
      - min: 500
        max: 599

# (Actually it can be even simpler but this is a special case where aliases represent predefined ranges)
# mapping:
#   debug: 1xx
#   info:
#     - 2xx
#     - 3xx
#   warn: 4xx
#   error: 5xx

Additionally, there is a substantial performance optimization which occurs with the stanza configuration. Specifically, we're able to put all values into a map at startup, so that every log record requires exactly one map lookup.
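The idea of expanding the configured ranges into a map at startup can be sketched in Python as follows. This only mirrors the concept described in the comment; it is not stanza's actual (Go) implementation. The `expand` and `build_table` helpers, the dictionary form of the mapping, and the simplified handling of aliases such as `3xx` are all assumptions for illustration.

```python
from typing import Dict, List, Union

# A mapping entry is either an alias string like "2xx" or a {"min": .., "max": ..} dict.
Entry = Union[str, Dict[str, int]]


def expand(entry: Entry) -> range:
    """Turn one mapping entry into the integers it covers (bounds inclusive)."""
    if isinstance(entry, str):
        # Alias such as "2xx": the leading digit selects a block of 100 codes.
        hundreds = int(entry[0]) * 100
        return range(hundreds, hundreds + 100)
    return range(entry["min"], entry["max"] + 1)


def build_table(mapping: Dict[str, List[Entry]]) -> Dict[int, str]:
    """Pay the expansion cost once, producing a value -> severity table."""
    table: Dict[int, str] = {}
    for severity, entries in mapping.items():
        for entry in entries:
            for code in expand(entry):
                table[code] = severity
    return table


# Mixes both styles from the configuration above: explicit ranges and aliases.
MAPPING: Dict[str, List[Entry]] = {
    "debug": [{"min": 100, "max": 199}],
    "info": [{"min": 200, "max": 299}, "3xx"],
    "warn": [{"min": 400, "max": 499}],
    "error": ["5xx"],
}

TABLE = build_table(MAPPING)
```

After this one-time expansion, interpreting a record is a single `TABLE.get(status)` call, at the price of holding one table entry per covered value in memory.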

I don't have strong opinions about exactly which form this should take in OTTL but I think it should be supported in some way that avoids such dense and wasteful statements.

@s71m

s71m commented Oct 24, 2024

Hi, I vote for this feature request. With this receiver configuration:

receivers:
  syslog:
    udp:
      listen_address: 0.0.0.0:54525
    protocol: rfc5424

the logs will contain a SeverityText such as "err" or "crit", but I would like to map these to the more well-known "ERROR", "FATAL", etc.
I racked my brain over how to do this, because severity is a top-level field of the entry and I found no way to access it here:

      - type: severity_parser
        parse_from: severity_text

Only a transform processor gets me to a solution:

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # --- DEBUG ---
          - set(severity_text, "DEBUG") where severity_text == "trace"
          - set(severity_text, "DEBUG") where severity_text == "debug"
          - set(severity_text, "DEBUG") where severity_text == "DEBUG"

          # --- INFO ---
          - set(severity_text, "INFO") where severity_text == "info"
          - set(severity_text, "INFO") where severity_text == "INFO"

          # --- WARNING ---
          - set(severity_text, "WARNING") where severity_text == "warn"
          - set(severity_text, "WARNING") where severity_text == "WARN"

          # --- ERROR ---
          - set(severity_text, "ERROR") where severity_text == "err"
          - set(severity_text, "ERROR") where severity_text == "error"

          # --- CRITICAL ---
          - set(severity_text, "CRITICAL") where severity_text == "crit"
          - set(severity_text, "CRITICAL") where severity_text == "fatal"
          - set(severity_text, "CRITICAL") where severity_text == "FATAL"
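For comparison, the twelve statements above amount to a single table from incoming text to normalized text. A minimal Python sketch of that table, assuming unknown values should pass through unchanged (as they do with the statements, since no condition matches them); `NORMALIZED` and `normalize` are names invented for this example:

```python
# Same pairs as the statements above, keyed by the incoming severity_text.
NORMALIZED = {
    "trace": "DEBUG",
    "debug": "DEBUG",
    "DEBUG": "DEBUG",
    "info": "INFO",
    "INFO": "INFO",
    "warn": "WARNING",
    "WARN": "WARNING",
    "err": "ERROR",
    "error": "ERROR",
    "crit": "CRITICAL",
    "fatal": "CRITICAL",
    "FATAL": "CRITICAL",
}


def normalize(severity_text: str) -> str:
    """One lookup per record; values without an entry are returned as-is."""
    return NORMALIZED.get(severity_text, severity_text)
```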

@djaglowski
Member Author

@s71m if you are using the udp receiver, you can define your own severity mapping as part of severity_parser.

@s71m

s71m commented Oct 25, 2024

@s71m if you are using udp receiver you can define your own severity mapping as part of severity_parser.

@djaglowski thanks for the reply! I'm using both; syslog-ng on a remote VPS actually sends via TCP. But for severity I need to define the "parse_from" field, and I can't figure out which field I need to pass.

@TylerHelmuth TylerHelmuth added the priority:p2 Medium label Nov 15, 2024
@TylerHelmuth TylerHelmuth changed the title [ottl] ParseSeverity function [pkg/ottl] ParseSeverity function Nov 15, 2024

4 participants