-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add backoff to flush op #13140
Conversation
551891a
to
48bbf5c
Compare
@@ -126,7 +128,10 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) { | |||
|
|||
f.IntVar(&cfg.ConcurrentFlushes, "ingester.concurrent-flushes", 32, "How many flushes can happen concurrently from each stream.") | |||
f.DurationVar(&cfg.FlushCheckPeriod, "ingester.flush-check-period", 30*time.Second, "How often should the ingester see if there are any blocks to flush. The first flush check is delayed by a random time up to 0.8x the flush check period. Additionally, there is +/- 1% jitter added to the interval.") | |||
f.DurationVar(&cfg.FlushOpTimeout, "ingester.flush-op-timeout", 10*time.Minute, "The timeout before a flush is cancelled.") | |||
f.DurationVar(&cfg.FlushOpBackoff.MinBackoff, "ingester.flush-op-backoff-min-period", 10*time.Second, "Minimum backoff period when a flush fails. Each concurrent flush has its own backoff, see `ingester.concurrent-flushes`.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to declare these here instead of using cfg.FlushOpBackoff.RegisterFlagsWithPrefix()
for two reasons:
- I wanted to use
-
separator instead of.
to be consistent withingester.flush-op-timeout
. - I wanted to write a more specific help message for each of these options that explains how they are used.
5bb49f5
to
837d5c8
Compare
1ee50bd
to
127099b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
e661dcf
to
d26e496
Compare
6ca1cb2
to
1994e1a
Compare
I made some changes to how the timeout works (per flush instead of per op), would love another review.
1c23a0f
to
c1bf4d6
Compare
This commit adds a configurable backoff to flush ops in the ingester. This is to prevent situations where the store put operation fails fast (i.e. 401 Unauthorized) and can cause ingesters to be rate limited.
c1bf4d6
to
ae77be1
Compare
@@ -135,8 +137,9 @@ func (i *Ingester) sweepStream(instance *instance, stream *stream, immediate boo | |||
} | |||
|
|||
func (i *Ingester) flushLoop(j int) { | |||
l := log.With(i.logger, "loop", j) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the loop
to the logger so when a flush fails, or is canceled after max retries, we can see which loop it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Well done 👏
What this PR does / why we need it:
This commit adds a configurable backoff to flush ops in the ingester. This is to prevent situations where the store put operation fails fast (i.e. 401 Unauthorized) and can cause ingesters to be rate limited.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Checklist
CONTRIBUTING.md
guide (required)feat
PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.md
production/helm/loki/Chart.yaml
and updateproduction/helm/loki/CHANGELOG.md
andproduction/helm/loki/README.md
. Example PRdeprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR