[dbnode] Adaptive WriteBatch allocations #3429
Changes shown are from commits 1717fc7, df172e9, 2a26a83 and ba0140e.
The first file changes the write batch itself:

```diff
@@ -33,6 +33,13 @@ var (
 	errTagsAndEncodedTagsRequired = errors.New("tags iterator and encoded tags must be provided")
 )
 
+const (
+	// preallocateBatchCoeff is used for allocating write batches of slightly bigger
+	// capacity than needed for the current request, in order to reduce allocations on
+	// subsequent reuse of a pooled write batch.
+	preallocateBatchCoeff = 1.2
+)
+
 type writeBatch struct {
 	writes       []BatchWrite
 	pendingIndex []PendingIndexInsert
```
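To make the coefficient concrete: a first batch of 1,000 writes arriving at an empty (zero-capacity) batch triggers an allocation with capacity 1,200, so follow-up batches of up to 1,200 writes reuse the same backing arrays. A minimal, illustrative sketch of the growth rule (standalone code, not part of the PR):

```go
package main

import "fmt"

// growthCap mirrors the PR's rule: overallocate by preallocateBatchCoeff
// so slightly larger follow-up batches do not force a fresh allocation.
func growthCap(batchSize int) int {
	const preallocateBatchCoeff = 1.2
	return int(float32(batchSize) * preallocateBatchCoeff)
}

func main() {
	fmt.Println(growthCap(1000)) // 1200
	fmt.Println(growthCap(200))  // 240
}
```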
```diff
@@ -50,13 +57,25 @@ type writeBatch struct {
 
 // NewWriteBatch creates a new WriteBatch.
 func NewWriteBatch(
-	batchSize int,
+	initialBatchSize int,
 	ns ident.ID,
 	finalizeFn func(WriteBatch),
 ) WriteBatch {
+	var (
+		writes       []BatchWrite
+		pendingIndex []PendingIndexInsert
+	)
+
+	if initialBatchSize > 0 {
+		writes = make([]BatchWrite, 0, initialBatchSize)
+		pendingIndex = make([]PendingIndexInsert, 0, initialBatchSize)
+		// Leaving nil slices if initialBatchSize == 0,
+		// they will be allocated when needed, based on the actual batch size.
+	}
+
 	return &writeBatch{
-		writes:       make([]BatchWrite, 0, batchSize),
-		pendingIndex: make([]PendingIndexInsert, 0, batchSize),
+		writes:       writes,
+		pendingIndex: pendingIndex,
 		ns:           ns,
 		finalizeFn:   finalizeFn,
 	}
```
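A hypothetical usage sketch of the two construction modes (`ident.StringID` is m3's helper for string-backed IDs; the namespace name here is made up):

```go
// Configured initial size: slices are allocated eagerly, as before.
sized := NewWriteBatch(128, ident.StringID("metrics-ns"), nil)

// Zero initial size: slices stay nil and are allocated on first Reset,
// sized from the actual incoming batch with the 1.2x headroom.
lazy := NewWriteBatch(0, ident.StringID("metrics-ns"), nil)
```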
```diff
@@ -102,14 +121,29 @@ func (b *writeBatch) Reset(
 	batchSize int,
 	ns ident.ID,
 ) {
-	var writes []BatchWrite
+	// Preallocate slightly more when not using initialBatchSize.
+	preallocateBatchCap := int(float32(batchSize) * preallocateBatchCoeff)
+
 	if batchSize > cap(b.writes) {
-		writes = make([]BatchWrite, 0, batchSize)
+		batchCap := batchSize
+		if cap(b.writes) == 0 {
+			batchCap = preallocateBatchCap
+		}
+		b.writes = make([]BatchWrite, 0, batchCap)
+	} else {
+		b.writes = b.writes[:0]
+	}
+
+	if batchSize > cap(b.pendingIndex) {
+		batchCap := batchSize
+		if cap(b.pendingIndex) == 0 {
+			batchCap = preallocateBatchCap
+		}
+		b.pendingIndex = make([]PendingIndexInsert, 0, batchCap)
 	} else {
-		writes = b.writes[:0]
+		b.pendingIndex = b.pendingIndex[:0]
 	}
 
-	b.writes = writes
 	b.ns = ns
 	b.finalizeEncodedTagsFn = nil
 	b.finalizeAnnotationFn = nil
```

**Reviewer:** nit: I think it would be very useful to have a gauge for the currently allocated batch capacity. It could be valuable when, for example, we receive one huge batch and subsequent batches are much smaller: memory usage will probably stay elevated even though the batches shrink, so this gauge could help us identify such cases.

**Author:** How would this gauge work? I guess we would increment it on line 139, but when would we decrement it?

**Reviewer:** It could also be a counter for now. I was thinking that maybe in the future we will update this to also shrink batch capacity, so a gauge makes sense from that point of view.
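For the gauge discussed above, a minimal sketch of what such instrumentation could look like, assuming m3's usual `tally` metrics scope (none of these names are in the PR; the metric name and placement are hypothetical):

```go
package writes // hypothetical placement alongside write_batch.go

import "github.com/uber-go/tally"

// writeBatchMetrics is a hypothetical holder for batch instrumentation.
type writeBatchMetrics struct {
	allocatedCapacity tally.Gauge
}

func newWriteBatchMetrics(scope tally.Scope) writeBatchMetrics {
	return writeBatchMetrics{
		allocatedCapacity: scope.Gauge("write-batch-allocated-capacity"),
	}
}

// report would be called after Reset grows (or reuses) the slice; unlike a
// counter, a gauge would also reflect shrinking if capacity reduction is
// added later.
func (m writeBatchMetrics) report(b *writeBatch) {
	m.allocatedCapacity.Update(float64(cap(b.writes)))
}
```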
```diff
@@ -144,14 +178,14 @@ func (b *writeBatch) PendingIndex() []PendingIndexInsert {
 	return b.pendingIndex
 }
 
-// Set the function that will be called to finalize annotations when a WriteBatch
-// is finalized, allowing the caller to pool them.
+// SetFinalizeEncodedTagsFn sets the function that will be called to finalize encodedTags
+// when a WriteBatch is finalized, allowing the caller to pool them.
 func (b *writeBatch) SetFinalizeEncodedTagsFn(f FinalizeEncodedTagsFn) {
 	b.finalizeEncodedTagsFn = f
 }
 
-// Set the function that will be called to finalize annotations when a WriteBatch
-// is finalized, allowing the caller to pool them.
+// SetFinalizeAnnotationFn sets the function that will be called to finalize annotations
+// when a WriteBatch is finalized, allowing the caller to pool them.
 func (b *writeBatch) SetFinalizeAnnotationFn(f FinalizeAnnotationFn) {
 	b.finalizeAnnotationFn = f
 }
```
The second file changes the write batch pool's defaults and constructor:
```diff
@@ -25,13 +25,10 @@ import (
 )
 
 const (
-	// defaultInitialBatchSize determines the initial batch size that will be used when filling up the
-	// pool.
-	defaultInitialBatchSize = 1024
 	// defaultMaxBatchSize is the default maximum size for a writeBatch that the pool
 	// will allow to remain in the pool. Any batches larger than that will be discarded to prevent
 	// excessive memory use forever in the case of an exceptionally large batch write.
-	defaultMaxBatchSize = 100000
+	defaultMaxBatchSize = 10240
 )
 
 // WriteBatchPool is a pool of WriteBatch.
```

**Author:** Reducing this default, as 100000 seems to be a crazy lot (it would result in a 39MB batch, which works out to roughly 400 bytes per BatchWrite entry).
```diff
@@ -44,14 +41,9 @@ type WriteBatchPool struct {
 // NewWriteBatchPool constructs a new WriteBatchPool.
 func NewWriteBatchPool(
 	opts pool.ObjectPoolOptions,
-	initialBatchSizeOverride,
+	initialBatchSize int,
 	maxBatchSizeOverride *int,
 ) *WriteBatchPool {
-	initialBatchSize := defaultInitialBatchSize
-	if initialBatchSizeOverride != nil {
-		initialBatchSize = *initialBatchSizeOverride
-	}
-
 	maxBatchSize := defaultMaxBatchSize
 	if maxBatchSizeOverride != nil {
 		maxBatchSize = *maxBatchSizeOverride
```
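Hypothetical call sites under the new constructor signature, with a plain `int` for the initial size and a pointer override only for the maximum (`pool.NewObjectPoolOptions` is m3's standard object pool options constructor; the sizes here are made up):

```go
opts := pool.NewObjectPoolOptions()

// Adaptive sizing: 0 leaves batch slices unallocated until first use.
adaptive := NewWriteBatchPool(opts, 0, nil)
adaptive.Init()

// Fixed sizing: a configured initial size keeps the old exact-allocation behaviour.
fixed := NewWriteBatchPool(opts, 128, nil)
fixed.Init()

batch := adaptive.Get()
defer batch.Finalize()
```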
**Reviewer:** Why do this only if the cap of `writes` is zero? I thought you'd always want to allocate 1.2x the size you actually need, since the next request will likely be a little bigger or a little smaller.

**Author:** The reason for this was to preserve the original behaviour when `InitialBatchSize` is set in config: if you have an initial batch size of 128 configured and a batch of size 200 arrives, exactly 200 should be allocated, not 200 * 1.2. I'll add an explicit `writeBatch.adaptiveSize` flag, which will help me handle every case better.
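A sketch of what that proposed flag could look like; `adaptiveSize` and its wiring are the author's stated follow-up, not code from these commits, so treat this as an assumption:

```go
type writeBatch struct {
	writes       []BatchWrite
	pendingIndex []PendingIndexInsert
	// adaptiveSize would be true when no InitialBatchSize is configured,
	// i.e. the batch grows based on observed request sizes.
	adaptiveSize bool
	// ... remaining fields unchanged ...
}

func (b *writeBatch) Reset(batchSize int, ns ident.ID) {
	if batchSize > cap(b.writes) {
		batchCap := batchSize
		if b.adaptiveSize {
			// Only overallocate in adaptive mode, preserving the exact
			// allocations of a configured InitialBatchSize.
			batchCap = int(float32(batchSize) * preallocateBatchCoeff)
		}
		b.writes = make([]BatchWrite, 0, batchCap)
	} else {
		b.writes = b.writes[:0]
	}
	// ... same treatment for b.pendingIndex, then reset ns and finalizers.
}
```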