
Remote Write Allocs Improvements #5614

Merged: 6 commits into prometheus:master from rw-allocs, Jun 27, 2019

Conversation

@csmarchbanks (Member) commented May 30, 2019

A few different commits to improve allocations when running remote write.

  1. Remove an unneeded temp variable
  2. Only allocate pendingSamples once per shard (see the sketch after this list)
  3. Allocate the send slice far fewer times
  4. Use a mask rather than always copy samples into a temporary slice
  5. Allocate a snappy buffer per shard rather than create one per request.
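A rough sketch of the per-shard reuse behind items 2 and 3 (illustrative types and names only, not the code in this PR): the shard allocates its pending-samples slice once and truncates it between batches instead of building a new slice per send.

```go
package main

import "fmt"

// timeSeries stands in for prompb.TimeSeries; the fields are illustrative.
type timeSeries struct {
	labels string
	value  float64
	ts     int64
}

// shard owns a pending-samples slice that is allocated once and reused.
type shard struct {
	pendingSamples []timeSeries
}

func newShard(maxSamplesPerSend int) *shard {
	return &shard{
		// Allocate once per shard; the capacity is a sizing hint, not a limit.
		pendingSamples: make([]timeSeries, 0, maxSamplesPerSend),
	}
}

func (s *shard) queue(ts timeSeries) {
	// append reuses the preallocated backing array until it fills up.
	s.pendingSamples = append(s.pendingSamples, ts)
}

func (s *shard) flush() {
	fmt.Printf("sending %d samples\n", len(s.pendingSamples))
	// Truncate rather than reallocate so the next batch reuses the memory.
	s.pendingSamples = s.pendingSamples[:0]
}

func main() {
	s := newShard(100)
	s.queue(timeSeries{labels: `{__name__="up"}`, value: 1, ts: 1559174400000})
	s.flush()
	s.queue(timeSeries{labels: `{__name__="up"}`, value: 0, ts: 1559174415000})
	s.flush()
}
```

The snappy buffer reuse from item 5 follows the same pattern; see the sketch further down in the buildWriteRequest discussion.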

Added benchmark:

benchmark                     old ns/op     new ns/op     delta
BenchmarkSampleDelivery-4     1206794       1141762       -5.39%

benchmark                     old allocs     new allocs     delta
BenchmarkSampleDelivery-4     7152           7121           -0.43%

benchmark                     old bytes     new bytes     delta
BenchmarkSampleDelivery-4     909406        610669        -32.85%

Most of the remaining allocations are from proto encoding/decoding.
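The table above is benchcmp-style output from a Go benchmark run with -benchmem. For readers unfamiliar with the format, a stub-only sketch of the shape of such a benchmark (not the benchmark added here, which exercises the actual remote-write send path):

```go
package remote_test

import "testing"

type sample struct {
	ts  int64
	val float64
}

// deliverStub stands in for the shard send path in this sketch only.
func deliverStub(samples []sample) int { return len(samples) }

// sink prevents the compiler from optimizing the benchmarked call away.
var sink int

// Run with: go test -bench=SampleDeliverySketch -benchmem
func BenchmarkSampleDeliverySketch(b *testing.B) {
	samples := make([]sample, 10000)
	b.ReportAllocs() // reports allocs/op and bytes/op, as in the allocs and bytes tables above
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = deliverStub(samples)
	}
}
```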

Reviewing commit by commit will show each change in isolation.

@cstyan Have you been running any benchmarks for these? I only see one that requires a real WAL somewhere.

@csmarchbanks force-pushed the rw-allocs branch 2 times, most recently from 0d65880 to c5be9ce on May 31, 2019 04:12
Review comment on storage/remote/queue_manager.go (outdated):
max := s.qm.cfg.MaxSamplesPerSend
pendingSamples := make([]prompb.TimeSeries, 0, max)
Member

I think this has the potential to cause more backup problems.

Member Author

Could you explain this comment more? I am not seeing how this would cause backup problems.

Member

Sorry, I was thinking this would enforce there being no more than maxSamplesPerSend pending samples, but that's not how slices work 🤦‍♂️

Backup meaning the shards not being able to send samples at the rate they're being scraped and written to the WAL. At the moment we send as soon as we reach maxSamplesPerSend or hit the flush timeout, and we can potentially have some multiple of maxSamplesPerSend samples pending, meaning that after a successful send there's theoretically less time before we have enough samples for the next send. If we enforced having only maxSamplesPerSend pending, it could take longer to fill up pendingSamples given that we block on the shard's queue.
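On the slices point: make([]prompb.TimeSeries, 0, max) only preallocates capacity, it does not bound the slice; append keeps growing it past max by reallocating. A tiny standalone illustration (not PR code):

```go
package main

import "fmt"

func main() {
	const max = 4
	pending := make([]int, 0, max) // capacity is a hint, not a limit

	for i := 0; i < 10; i++ {
		pending = append(pending, i) // grows past max; append reallocates as needed
	}
	fmt.Println(len(pending), cap(pending)) // len is 10, cap is at least 10
}
```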

@csmarchbanks (Member Author)

When looking further through the code I am pretty convinced that seriesMtx is unnecessary, which allows me to simplify Append to only have one loop and no tempSamples. I have been running remote write locally for a while now with -race, and only found the race condition in maxGauge that I fixed with the most recent commit.
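The maxGauge fix mentioned above is, in spirit, an "only ever raise the value" wrapper around a gauge. A minimal race-free sketch, assuming prometheus/client_golang and not necessarily matching the exact code in this branch:

```go
package main

import (
	"fmt"
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// maxGauge only ever raises the underlying gauge. Without the mutex, two
// goroutines calling Set concurrently would race on reading and writing value.
type maxGauge struct {
	mtx   sync.Mutex
	value float64
	gauge prometheus.Gauge
}

func (m *maxGauge) Set(v float64) {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if v > m.value {
		m.value = v
		m.gauge.Set(v)
	}
}

func main() {
	g := &maxGauge{gauge: prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "example_max_gauge", // illustrative metric name
	})}
	var wg sync.WaitGroup
	for i := 1; i <= 8; i++ {
		wg.Add(1)
		go func(v float64) {
			defer wg.Done()
			g.Set(v)
		}(float64(i))
	}
	wg.Wait()
	fmt.Println("max observed:", g.value) // 8
}
```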

@csmarchbanks (Member Author)

Update: I ran this branch all weekend in a cluster without issue. I am reasonably sure that the mutex is not needed.

  backoff := s.qm.cfg.MinBackoff
- req, highest, err := buildWriteRequest(samples)
+ req, highest, err := buildWriteRequest(samples, *buf)
+ *buf = req
Member

Do you think this assignment to the pointer should be in buildWriteRequest?

Member Author

I put it here because it is the last place that deals with *shards and I wanted to keep the mutation isolated to those functions. I am happy to move it down to buildWriteRequest if you have a strong preference.
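For reference, a minimal sketch of the handshake being discussed, with simplified names and the proto marshalling elided (the real buildWriteRequest also returns a highest-timestamp value, dropped here): the caller hands its reusable buffer in and stores whatever buffer comes back, so the next send can reuse the allocation.

```go
package main

import (
	"fmt"

	"github.com/golang/snappy"
)

// buildWriteRequest compresses payload, reusing buf when it is big enough,
// and returns the (possibly reallocated) buffer for the caller to keep.
func buildWriteRequest(payload, buf []byte) ([]byte, error) {
	// snappy.Encode checks len(dst), not cap(dst), when deciding whether it
	// can write into dst, so stretch the buffer to its full capacity first.
	if buf != nil {
		buf = buf[:cap(buf)]
	}
	compressed := snappy.Encode(buf, payload)
	return compressed, nil
}

// sendOnce mirrors the call site above: the assignment back through *buf is
// what keeps the allocation alive on the shard between sends.
func sendOnce(payload []byte, buf *[]byte) error {
	req, err := buildWriteRequest(payload, *buf)
	if err != nil {
		return err
	}
	*buf = req
	fmt.Printf("would send %d compressed bytes\n", len(req))
	return nil
}

func main() {
	buf := make([]byte, 0, 1024)
	_ = sendOnce([]byte("example write request payload"), &buf)
	_ = sendOnce([]byte("second payload reusing the same buffer"), &buf)
}
```

Either placement works; keeping the assignment at the call site keeps buildWriteRequest free of shard state, while moving it inside would mean passing a *[]byte and hiding the buffer bookkeeping from callers.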

@tomwilkie (Member)

> Update: I ran this branch all weekend in a cluster without issue. I am reasonably sure that the mutex is not needed.

I think you're right, as there is a single goroutine in the watcher driving all these interactions. This will become more obvious when we refactor the WAL Watcher.

LGTM from me but I'll hold off on merging until @cstyan has had a chance to run this in our env.

@csmarchbanks (Member Author)

@tomwilkie I know @cstyan ran this for a little bit in one of your environments. Do you need to run it longer or is this good to go?

@tomwilkie (Member)

Let me ask @cstyan. Sorry for dropping the ball on this.

Commit message (pushed to the branch): It is not possible for any of the places protected by the seriesMtx to be called concurrently, so it is safe to remove. By removing the mutex we can simplify the Append code to one loop.

Signed-off-by: Chris Marchbanks <[email protected]>
@cstyan (Member) commented Jun 27, 2019

I just deployed a new build of this image to some test Prometheus instances; it will run overnight at the least, and I'll check resource usage in the morning.

@tomwilkie (Member)

LGTM
[Screenshot: 2019-06-27 at 19:47:25]

@tomwilkie merged commit 06bdaf0 into prometheus:master on Jun 27, 2019
@csmarchbanks deleted the rw-allocs branch on June 27, 2019 at 18:49