-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buffer: optimize writing short strings #54310
base: main
Are you sure you want to change the base?
Conversation
ronag
commented
Aug 10, 2024
•
edited
Loading
edited
@nodejs/buffer |
e470948
to
82438ad
Compare
82438ad
to
27a11a3
Compare
Failed to start CI⚠ Something was pushed to the Pull Request branch since the last approving review. ✘ Refusing to run CI on potentially unsafe PRhttps://github.com/nodejs/node/actions/runs/10333845691 |
644b157
to
ebbe73f
Compare
d8eb307
to
67b80d8
Compare
67b80d8
to
a9fd9ac
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #54310 +/- ##
==========================================
- Coverage 87.09% 87.09% -0.01%
==========================================
Files 647 647
Lines 181816 181837 +21
Branches 34884 34893 +9
==========================================
+ Hits 158360 158377 +17
+ Misses 16764 16760 -4
- Partials 6692 6700 +8
|
while (true) { | ||
const code = StringPrototypeCharCodeAt(string, n); | ||
if (code >= 128) { | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that buffer.write('aaaaa€')
now returns 3 instead of 8.
Edit: I think it is just slower now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on?
buffers/buffer-write-string.js n=1000000 len=1 args='' encoding='' *** 526.57 % ±36.19% ±51.92% ±76.21%
buffers/buffer-write-string.js n=1000000 len=16 args='' encoding='' *** 52.39 % ±18.12% ±25.97% ±38.06%
buffers/buffer-write-string.js n=1000000 len=32 args='' encoding='' -3.44 % ±10.55% ±14.63% ±20.33%
buffers/buffer-write-string.js n=1000000 len=8 args='' encoding='' *** 196.38 % ±3.09% ±4.43% ±6.49%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the benchmark does not use a string with mixed sigle byte and multibyte characters. I think the worst case would be something like this 'a'.repeat(15) + '€'
. In this case the optimization is only overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We're trying to optimize for the most common case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is an ASCII only string whose length is less than or equal to 16 characters the most common case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regression is minimal for non one byte case:
buffers/buffer-write-string-short.js n=1000000 char='€' len=0 *** 606.09 % ±26.40% ±36.19% ±49.34%
buffers/buffer-write-string-short.js n=1000000 char='€' len=1 -0.45 % ±13.79% ±19.68% ±28.63%
buffers/buffer-write-string-short.js n=1000000 char='€' len=16 -0.76 % ±5.97% ±8.50% ±12.32%
buffers/buffer-write-string-short.js n=1000000 char='€' len=32 0.51 % ±11.84% ±16.61% ±23.50%
buffers/buffer-write-string-short.js n=1000000 char='€' len=8 * -8.12 % ±6.90% ±9.87% ±14.43%
buffers/buffer-write-string-short.js n=1000000 char='a' len=0 *** 543.24 % ±12.04% ±17.27% ±25.35%
buffers/buffer-write-string-short.js n=1000000 char='a' len=1 *** 478.53 % ±31.70% ±45.53% ±66.96%
buffers/buffer-write-string-short.js n=1000000 char='a' len=16 *** 112.38 % ±6.11% ±8.72% ±12.69%
buffers/buffer-write-string-short.js n=1000000 char='a' len=32 ** -5.18 % ±3.21% ±4.57% ±6.64%
buffers/buffer-write-string-short.js n=1000000 char='a' len=8 *** 218.82 % ±11.24% ±15.64% ±21.85%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better comparison:
confidence improvement accuracy (*) (**) (***)
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=0 *** 555.83 % ±10.01% ±14.34% ±21.00%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=1 *** 447.19 % ±126.93% ±182.34% ±268.25%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=16 *** 118.07 % ±18.44% ±26.17% ±37.74%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=32 -1.22 % ±5.77% ±8.22% ±11.92%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=8 *** 192.61 % ±38.32% ±54.53% ±78.97%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=0 *** 522.42 % ±106.34% ±152.67% ±224.38%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=1 ** -4.23 % ±3.03% ±4.21% ±5.85%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=16 *** -24.35 % ±9.58% ±13.18% ±18.07%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=32 -4.96 % ±12.35% ±17.52% ±25.26%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=8 *** -22.52 % ±7.75% ±11.02% ±15.95%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not make much sense to benchmark length 0 and 1 imho. I'm also not convinced that length <= 16 is a common case, but I have no proof (so far, there is only one comment with less 16 bytes in the PR comments).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use it a lot for primary keys that usually are less than 16 bytes. Also for building e.g. json responses require writing lots of small strings.
a9fd9ac
to
af328a2
Compare
function main({ len, n, fallback }) { | ||
const buf = Buffer.allocUnsafe(len); | ||
const string = fallback && len > 0 | ||
? Buffer.from('a'.repeat(len - 1) + '€').toString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you fix the formatting issues?
Putting this on hold until #54311 after lands. |
I've added the "blocked" label, as this PR is on hold until another is resolved. Feel free to undo. |
PR-URL: #54310 PR-URL: #54311 Reviewed-By: Michaël Zasso <[email protected]> Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]>
PR-URL: #54310 PR-URL: #54311 Reviewed-By: Michaël Zasso <[email protected]> Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]>
PR-URL: #54310 PR-URL: #54311 Reviewed-By: Michaël Zasso <[email protected]> Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]>