perf(NODE-5906): optimize toArray to use batches #4171
When you get the chance can you fill out the release highlights?
And (I guess on Slack?) can you share the numbers you're seeing? In BSON and previous driver performance improvements I have often included a conservative estimate of the improvements downstream projects can expect to see from the perf change.
Description
Performance optimization on AbstractCursor.toArray.
What is changing?
Rather than awaiting each document, toArray now only awaits a new batch.
Is there new documentation needed for these changes?
No.
What is the motivation for this change?
Performance improvement. See performance data matrix here.
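The difference between awaiting every document and awaiting only each new batch can be sketched roughly as follows. This is a minimal illustration, not the driver's implementation: SketchCursor, fetchBatch, toArrayPerDocument, toArrayBatched and demo are hypothetical names, and the "server" is just an in-memory list of batches.

```typescript
// Hypothetical stand-in for a cursor; this is NOT the driver's AbstractCursor.
class SketchCursor<T> {
  private pending: T[][]; // batches not yet "fetched" from the pretend server
  private buffered: T[] = []; // documents of the current batch, held locally

  constructor(batches: T[][]) {
    // Copy the input so the caller's arrays are never mutated.
    this.pending = batches.map((batch) => [...batch]);
  }

  // Simulates one round trip: moves the next batch into the local buffer.
  // Resolves to false once there is nothing left to fetch.
  private async fetchBatch(): Promise<boolean> {
    const next = this.pending.shift();
    if (next === undefined) return false;
    for (const doc of next) this.buffered.push(doc);
    return true;
  }

  // Per-document iteration: the consumer awaits once for every document.
  async *[Symbol.asyncIterator](): AsyncGenerator<T> {
    while (true) {
      if (this.buffered.length === 0 && !(await this.fetchBatch())) return;
      const doc = this.buffered.shift();
      if (doc !== undefined) yield doc;
    }
  }

  // Old shape (assumed): collect results through the async iterator.
  async toArrayPerDocument(): Promise<T[]> {
    const out: T[] = [];
    for await (const doc of this) out.push(doc);
    return out;
  }

  // New shape (assumed): await only when a new batch is needed, then
  // drain the whole buffered batch synchronously into the result.
  async toArrayBatched(): Promise<T[]> {
    const out: T[] = [];
    while (await this.fetchBatch()) {
      for (const doc of this.buffered) out.push(doc);
      this.buffered.length = 0; // the batch has been emptied into the array
    }
    return out;
  }
}

// Both strategies return the same documents in the same order.
async function demo(): Promise<{ perDocument: number[]; batched: number[] }> {
  const perDocument = await new SketchCursor([[1, 2], [3], [4, 5]]).toArrayPerDocument();
  const batched = await new SketchCursor([[1, 2], [3], [4, 5]]).toArrayBatched();
  return { perDocument, batched };
}
```

In this sketch the batched variant awaits once per batch (plus one final await that reports exhaustion), while the per-document variant awaits once per document; the results are identical.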
Release Highlight (Remove if this ends up being only a code clarity fix)
Optimized cursor.toArray()
Prior to this change, toArray() simply used the cursor's async iterator API, which parses BSON documents lazily (see more here). toArray(), however, eagerly fetches the entire set of results, pushing each document into the returned array. As such, toArray does not benefit from lazy parsing the way other parts of the cursor API do.
With this change, when toArray() accumulates documents, it empties the current batch of documents into the array before calling the async iterator again, which means each iteration will fetch the next batch rather than wrap each document in a promise. This allows cursor.toArray() to avoid the delays associated with async/await execution, and allows for a performance improvement of up to 5% on average! 🎉
Note: This performance optimization does not apply if a transform has been provided to cursor.map() before toArray is called.
Double check the following
npm run check:lint script
type(NODE-xxxx)[!]: description
feat(NODE-1234)!: rewriting everything in coffeescript