Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to flush sinks on shutdown #912

Merged
merged 4 commits into from
Mar 7, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@
* A prometheus remote-write sink compatible with Cortex and more. Thanks, [philipnrmn](https://github.com/philipnrmn)!
* Some veneur-prometheus arguments to rename and add additional tags. Thanks, [christopherb-stripe](https://github.com/christopherb-stripe)!
* Migrate Prometheus to new config format; part of multi-sink routing update. Thanks, [truong-stripe](https://github.com/truong-stripe)!
* Authentication support for Cortex remote-write sink. Thanks,
[oscil8](https://github.com/oscil8)!
* Authentication support for Cortex remote-write sink. Thanks, [oscil8](https://github.com/oscil8)!
* Option to flush sinks on shutdown. Thanks, [csolidum](https://github.com/csolidum)!

## Bugfixes
* A fix for forwarding metrics with gRPC using the kubernetes discoverer. Thanks, [androohan](https://github.com/androohan)!
Expand Down
1 change: 1 addition & 0 deletions config.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ type Config struct {
} `yaml:"features"`
FlushFile string `yaml:"flush_file"`
FlushMaxPerBody int `yaml:"flush_max_per_body"`
FlushOnShutdown bool `yaml:"flush_on_shutdown"`
FlushWatchdogMissedFlushes int `yaml:"flush_watchdog_missed_flushes"`
ForwardAddress string `yaml:"forward_address"`
ForwardUseGrpc bool `yaml:"forward_use_grpc"`
Expand Down
3 changes: 3 additions & 0 deletions example.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,9 @@ interval: "10s"
# watchdog.
flush_watchdog_missed_flushes: 0

# Whether to flush sinks on shutdown. Defaults to false
flush_on_shutdown: false

# Veneur can "sychronize" it's flushes with the system clock, flushing at even
# intervals i.e. 0, 10, 20… to align with the `interval`. This is disabled by
# default for now, as it can cause thundering herds in large installations.
Expand Down
8 changes: 8 additions & 0 deletions server.go
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,7 @@ type Server struct {
GRPCListenAddrs []net.Addr
RcvbufBytes int

FlushOnShutdown bool
Interval time.Duration
synchronizeInterval bool
numReaders int
Expand Down Expand Up @@ -498,6 +499,8 @@ func NewFromConfig(config ServerConfig) (*Server, error) {
}
logger.WithField("BlockProfileRate", conf.BlockProfileRate).Info("Set block profile rate (nanoseconds)")

ret.FlushOnShutdown = conf.FlushOnShutdown

// Use the pre-allocated Workers slice to know how many to start.
logger.WithField("number", len(ret.Workers)).Info("Preparing workers")
for i := range ret.Workers {
Expand Down Expand Up @@ -1372,6 +1375,11 @@ func (s *Server) Shutdown() {
// TODO(aditya) shut down workers and socket readers
s.logger.Info("Shutting down server gracefully")
close(s.shutdown)
if s.FlushOnShutdown {
ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(s.Interval))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is interval required? what happens if flush is set to true, but interval isn't set? could that potentially lead to not allowing enough time for flushing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like veneur has a default interval setting of 10s. My concern with not setting some sort of timeout is that the sidecar might get stuck in some sort of shutdown flow. In the case of flushing taking too long, this timeout also applies to the normal flush to sinks, so if flushing takes longer than this, you're already dropping metrics.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, context.WithTimeout(context.Background(), s.Interval) is one less step.

s.Flush(ctx)
cancel()
}
graceful.Shutdown()
for _, source := range s.sources {
source.Stop()
Expand Down
1 change: 1 addition & 0 deletions testdata/http_test_config.json
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
},
"FlushFile": "",
"FlushMaxPerBody": 0,
"FlushOnShutdown": false,
"FlushWatchdogMissedFlushes": 0,
"ForwardAddress": "",
"ForwardUseGrpc": false,
Expand Down
1 change: 1 addition & 0 deletions testdata/http_test_config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ features:
migrate_span_sinks: false
flush_file: ""
flush_max_per_body: 0
flush_on_shutdown: false
flush_watchdog_missed_flushes: 0
forward_address: ""
forward_use_grpc: false
Expand Down