Remote Write Allocs Improvements #5614
Force-pushed from 0d65880 to c5be9ce.
```go
max := s.qm.cfg.MaxSamplesPerSend
pendingSamples := make([]prompb.TimeSeries, 0, max)
```
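For context, a minimal sketch of the allocate-once pattern this diff introduces (the variable names and batching loop here are assumed for illustration, not the actual queue manager code): the slice is created with capacity `maxSamplesPerSend` once per shard and truncated back to zero length after each send, so the backing array is reused instead of reallocated per batch.

```go
package main

import "fmt"

type timeSeries struct{ value float64 }

// sendBatch stands in for the real remote-write send path.
func sendBatch(batch []timeSeries) { fmt.Println("sent", len(batch)) }

func main() {
	const maxSamplesPerSend = 100 // the real value comes from config

	// Allocate the pending buffer once, with full capacity up front...
	pending := make([]timeSeries, 0, maxSamplesPerSend)

	for i := 0; i < 1000; i++ {
		pending = append(pending, timeSeries{value: float64(i)})
		if len(pending) >= maxSamplesPerSend {
			sendBatch(pending)
			// ...then truncate instead of reallocating: the backing
			// array is kept, so no new allocation per batch.
			pending = pending[:0]
		}
	}
	if len(pending) > 0 {
		sendBatch(pending)
	}
}
```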
I think this has the potential to cause more backup problems.
Could you explain this comment more? I am not seeing how this would cause backup problems.
Sorry, I was thinking this would enforce there being no more than maxSamplesPerSend pending samples, but that's not how slices work 🤦‍♂️

Backup meaning the shards not being able to send samples at the rate they're being scraped and written to the WAL. At the moment we send as soon as we reach maxSamplesPerSend or the flush timeout, and we can potentially have some multiple of maxSamplesPerSend samples pending, meaning that after a successful send there's theoretically less time before we have enough samples for the next send. If we enforced only having maxSamplesPerSend pending, it could take longer to fill up pendingSamples given that we block on the shard's queue.
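For reference, the slice behavior at issue: the capacity passed to `make` is only an initial allocation, not an upper bound, so `append` grows the slice past maxSamplesPerSend rather than enforcing it as a limit.

```go
package main

import "fmt"

func main() {
	// Capacity is a starting allocation, not a limit:
	// append reallocates and grows past it transparently.
	s := make([]int, 0, 4)
	for i := 0; i < 10; i++ {
		s = append(s, i)
	}
	fmt.Println(len(s), cap(s)) // len 10, cap >= 10
}
```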
When looking further through the code I am pretty convinced that the mutex is not needed.

Update: I ran this branch all weekend in a cluster without issue. I am reasonably sure that the mutex is not needed.
```diff
  backoff := s.qm.cfg.MinBackoff
- req, highest, err := buildWriteRequest(samples)
+ req, highest, err := buildWriteRequest(samples, *buf)
+ *buf = req
```
Do you think this assignment to the pointer should be in buildWriteRequest?
I put it here because it is the last place that deals with *shards, and I wanted to keep the mutation isolated to those functions. I am happy to move it down to buildWriteRequest if you have a strong preference.
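A sketch of the buffer-reuse handshake under discussion (the encoding below is a stand-in; the real buildWriteRequest marshals protobuf and compresses with snappy): the callee reuses the buffer's backing array when it is large enough and returns the possibly regrown buffer, which the caller stores for the next send. That store is what `*buf = req` does.

```go
package main

import "fmt"

// buildWriteRequest is a stand-in: it encodes samples into buf, reusing
// buf's backing array when large enough, and returns the (possibly
// grown) buffer so the caller can keep it for the next call.
func buildWriteRequest(samples []float64, buf []byte) []byte {
	buf = buf[:0] // reuse the allocation, drop old contents
	for _, s := range samples {
		buf = append(buf, fmt.Sprintf("%g;", s)...)
	}
	return buf
}

func main() {
	var buf []byte
	for i := 0; i < 3; i++ {
		req := buildWriteRequest([]float64{1, 2, 3}, buf)
		// send req over the network here...
		buf = req // the equivalent of *buf = req: keep the grown buffer
		fmt.Printf("len=%d cap=%d\n", len(req), cap(buf))
	}
}
```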
I think you're right, as there is a single goroutine in the watcher driving all these interactions. This will become more obvious when we refactor the WAL Watcher. LGTM from me, but I'll hold off on merging until @cstyan has had a chance to run this in our env.
@tomwilkie I know @cstyan ran this for a little bit in one of your environments. Do you need to run it longer or is this good to go?
Let me ask @cstyan. Sorry for dropping the ball on this.
It is not possible for any of the places protected by the seriesMtx to be called concurrently, so it is safe to remove. By removing the mutex we can simplify the Append code to one loop.

Signed-off-by: Chris Marchbanks <[email protected]>
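An illustration of the single-writer reasoning behind removing seriesMtx (toy code, not the queue manager): when every mutation of the series state is driven by one goroutine, a mutex adds cost without adding safety.

```go
package main

import "fmt"

func main() {
	type op struct {
		ref   uint64
		label string
	}
	ops := make(chan op)
	done := make(chan struct{})

	// Only the goroutine below ever touches this map, so no mutex
	// is needed; the channel serializes all access.
	series := map[uint64]string{}

	go func() {
		for o := range ops {
			series[o.ref] = o.label
		}
		fmt.Println(len(series), "series tracked")
		close(done)
	}()

	for i := uint64(0); i < 5; i++ {
		ops <- op{ref: i, label: fmt.Sprintf("series-%d", i)}
	}
	close(ops)
	<-done
}
```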
I just deployed a new build of this image to some test Prometheus instances. It will run overnight at the least, and I'll check resource usage in the morning.
A few different commits to improve allocations when running remote write:
- Allocate pendingSamples once per shard
- Allocate the send slice far fewer times
- Added a benchmark
Much of the remaining allocation comes from proto encoding/decoding.
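A self-contained sketch of how an allocation-focused benchmark can surface this kind of win (not the PR's actual benchmark; `encode` is a stand-in for the proto+snappy path): `b.ReportAllocs` makes the reused-buffer variant show up as near-zero allocs/op under `go test -bench . -benchmem`.

```go
package remote_test

import "testing"

// encode is a stand-in for the encoding done in buildWriteRequest.
func encode(samples []int64, buf []byte) []byte {
	buf = buf[:0]
	for _, s := range samples {
		buf = append(buf, byte(s), byte(s>>8))
	}
	return buf
}

func BenchmarkEncodeFreshBuffer(b *testing.B) {
	samples := make([]int64, 1000)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = encode(samples, nil) // new allocation every iteration
	}
}

func BenchmarkEncodeReusedBuffer(b *testing.B) {
	samples := make([]int64, 1000)
	var buf []byte
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf = encode(samples, buf) // amortizes to ~0 allocs/op
	}
}
```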
Reviewing commit by commit will show each change in isolation.
@cstyan Have you been running any benchmarks for these? I only see one that requires a real WAL somewhere.