compact: retry on cleanPartialMarked errors if possible (thanos-io#5922)
cleanPartialMarked calls SyncMetas, which can return retriable errors.
Checking for retriable errors and retrying on the next interval prevents the compactor from shutting down the HTTP server.

Signed-off-by: Seena Fallah <[email protected]>

clwluvw authored and Nathaniel Graham committed May 18, 2023
1 parent 30664d5 commit 3ed8bc1
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion cmd/thanos/compact.go
@@ -557,7 +557,19 @@ func runCompact(
 	// since one iteration potentially could take a long time.
 	if conf.cleanupBlocksInterval > 0 {
 		g.Add(func() error {
-			return runutil.Repeat(conf.cleanupBlocksInterval, ctx.Done(), cleanPartialMarked)
+			return runutil.Repeat(conf.cleanupBlocksInterval, ctx.Done(), func() error {
+				err := cleanPartialMarked()
+				if err != nil && compact.IsRetryError(err) {
+					// The RetryError signals that we hit an retriable error (transient error, no connection).
+					// You should alert on this being triggered too frequently.
+					level.Error(logger).Log("msg", "retriable error", "err", err)
+					compactMetrics.retried.Inc()
+
+					return nil
+				}
+
+				return err
+			})
 		}, func(error) {
 			cancel()
 		})