drastically reduce allocations in ring buffer implementation #64
Conversation
Before this change, `Append` allocated a new slice on every call because len(s.b) == cap(s.b). In my testing, len(s.b) is usually small (<= 50, with occasional spikes to 200) even in high-write benchmarks such as BenchmarkSendRecvLarge, and often near 0, so the copy should not be too expensive. It is also efficient when the write rate is roughly equal to the read rate.
Thanks! This is tricky code, but it looks correct. One comment I'd like you to consider, but I don't feel too strongly about it.
But I'd like to get a review from @marten-seemann as well.
util.go (outdated)
```go
	// have no unread chunks, just move pos
	s.bPos = 0
	s.b = s.b[:0]
} else {
```
I'd change this to check whether at least half of the slice is free. That'll slightly increase allocations until we hit a steady state, but it should avoid the degenerate case where we slide by one every single time.
This is definitely a good improvement, but I am not sure about "at least half of the slice"; I think it should be a bit lower than that. How about 0.25 of the capacity? That also matches the append() growth factor (when cap > 1024). Added with 0.25 for now.
That got me thinking: should we limit the maximum capacity of the buffer (recreate the slice with the default capacity once it reaches a certain maximum and the buffer is empty)? What is the average Stream lifespan?
I tried to test this with the package's benchmarks, but the buffer does not grow at all because of the in-memory network.
I haven't had the time to do a thorough review yet, but more documentation (maybe even an example) here would be very useful to understand what the code is doing.
…me case when we shift slice by one every time
Sorry for the delay. I am not sure what kind of documentation this needs: basically, this code tries to reuse the slice's capacity by shifting values to the start, so that append() still adds values to the end, like a ring buffer but simpler. I can add more inline comments if you prefer.
Assigned to @libp2p/go-libp2p-maintainers to see if anyone else can look.
@marten-seemann: understood there are other things you're currently focused on. When you do reengage, does this comment make things more clear for you?
Yes, more inline comments would be highly appreciated. I haven't worked with this code for a few months, and I find it quite hard to understand without any comments (this partially applies to the code we had before as well).
@marten-seemann added more comments. |
```diff
@@ -54,6 +54,7 @@ func BenchmarkSendRecv(b *testing.B) {
 	recvBuf := make([]byte, 512)

+	doneCh := make(chan struct{})
 	b.ResetTimer()
```
Suggest you also add b.ReportAllocs() here and in BenchmarkSendRecvLarge.
I am not sure this is necessary; you can add -benchmem to go test to achieve the same result.
True, but I'm suggesting it because allocs seem to be of ongoing concern. Up to you.
Comparison of benchmarks using benchstat (with ReportAllocs added manually):
Reuse the capacity of the s.b slice by shifting its elements to the left on writes. Before this change, Append allocated a new slice every time because len(s.b) == cap(s.b). In my testing, len(s.b) is usually small (<= 50, with occasional spikes to 200) even in high-write benchmarks such as BenchmarkSendRecvLarge, and often near 0, so the copy should not be too expensive. It is also efficient when the write rate is roughly equal to the read rate.
Tested with
Before
After this commit