-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inhibition rules evaluation performance on many target alerts #3932
Comments
I implemented the memoization in #3933, which also includes a benchmark. |
Hi! 👋 I thought |
Hey, |
Ah. So weird – I don't understand why Alertmanager needs to re-calculate inhibitions and silences for GET requests. Anyway, thank you for your contribution. I'll make sure to take a closer look later today or tomorrow. |
Friendly reminder, that I would still appreciate some feedback. Likely this got lost in between other things. |
Context
Requesting
/alerts
from my organizations alertmanager returns a reply in 4-6 seconds.The alertmanager has consistently
Given this scale, I was a bit surprised that fetching the alerts takes multiple seconds.
Taking a trace via pprof revealed that at least 4 seconds are spent calculating inhibited alerts.
Furthermore, a few hundred milliseconds are spent serializing the json response, which is expected due to the size of ~20 MiB in our case.
The structure of my organizations inhibition rules can lead to ~10k alerts matching a target matcher and ~1k alerts matching the source matcher.
Currently the benchmarks only consider cases with a single alert satisfying a target matcher.
Calling
/alerts
results basically in the following algorithm being executed at the moment:Mutes
here):Memoize source cache target match evaluations
The
Prevent two-sided match
step evaluatesr.TargetMatchers.Matches(a.Labels)
, wherea
is from the cached source alerts.It's invoked over and over again for each alert that is a valid target (potentially evaluating regular expressions), but it is not dependent on a specific target alert.
Assuming I did not miss a side effect, the result of
r.TargetMatchers.Matches(a.Labels)
is only dependent on a specific inhibition rule and that rule's source alert cache.By memoizing that expression, a performance gain can be achieved when many alerts are valid targets.
Constructing indices
Checking for label equality between the set of valid source alerts and valid target alerts runs in quadratic time.
By maintaining an index per label, which maps that label's values to an "index into a slice of all alerts", alerts that have a label with a specific value can found by a map lookup.
For multiple labels that need to be checked for equality, multiple slices of indices can be intersected in linear time.
This would be a larger change though.
The text was updated successfully, but these errors were encountered: