Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve wordwrap performance #19

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

kiyonlin
Copy link
Contributor

Hey @muesli !

In this improvement, I use unsafe to convert byte slice to a string without memory allocation which is heavily used by fasthttp.

OLD:

BenchmarkBuffer_PrintableRuneWidth-8   	 4995201	       241 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4939071	       243 ns/op	      96 B/op	       1 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4903446	       246 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5072530	       240 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4879065	       244 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5018516	       240 ns/op	      96 B/op	       1 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4881970	       245 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5002642	       244 ns/op	      95 B/op	       1 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4909580	       244 ns/op	      95 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 4891392	       243 ns/op	      96 B/op	       1 allocs/op

NEW:

BenchmarkBuffer_PrintableRuneWidth-8   	 5765908	       203 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5739984	       204 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5854398	       204 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5838634	       203 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5804761	       202 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5797371	       203 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5743213	       203 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 6095284	       203 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5844835	       207 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuffer_PrintableRuneWidth-8   	 5756268	       209 ns/op	       0 B/op	       0 allocs/op

Benchstat:

name                         old time/op    new time/op    delta
Buffer_PrintableRuneWidth-8     243ns ± 1%     203ns ± 1%   -16.41%  (p=0.000 n=10+8)

name                         old alloc/op   new alloc/op   delta
Buffer_PrintableRuneWidth-8     95.3B ± 1%      0.0B       -100.00%  (p=0.000 n=10+10)

name                         old allocs/op  new allocs/op  delta
Buffer_PrintableRuneWidth-8     0.40 ±150%      0.00           ~     (p=0.087 n=10+10)

I don’t know if you mind this trick of using unsafe, so feel free to close this PR directly.

@muesli
Copy link
Owner

muesli commented Oct 21, 2020

Interesting approach!

I think we'll keep this on the backburner though, as it's not an order of magnitude faster. In such cases I think I prefer to keep the code simple(r). Zero allocations are obviously another argument for it, so I'll keep this PR in mind.

@kiyonlin
Copy link
Contributor Author

Interesting approach!

I think we'll keep this on the backburner though, as it's not an order of magnitude faster. In such cases I think I prefer to keep the code simple(r). Zero allocations are obviously another argument for it, so I'll keep this PR in mind.

Makes sense.

We can see if it helps in package wordwrap because it depends on ansi Buffer.PrintableRuneWidth

@kiyonlin kiyonlin changed the title Improve ansi buffer PrintableRuneWidth performance Improve wordwrap performance Oct 22, 2020
@kiyonlin
Copy link
Contributor Author

kiyonlin commented Oct 22, 2020

Improvement for wordwrap performance

OLD:

BenchmarkWordWrapString-8   	  801043	      1330 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  924964	      1347 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1317 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  990302	      1317 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  869684	      1355 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  933944	      1350 ns/op	     879 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  912145	      1353 ns/op	     876 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1336 ns/op	     873 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  768968	      1320 ns/op	     867 B/op	      11 allocs/op
BenchmarkWordWrapString-8   	  927973	      1358 ns/op	     879 B/op	      11 allocs/op

NEW:

BenchmarkWordWrapString-8   	 1000000	      1010 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1213159	       991 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1234002	       952 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1006 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1006 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1244863	      1012 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1233816	       977 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1241438	      1017 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1024 ns/op	       0 B/op	       0 allocs/op
BenchmarkWordWrapString-8   	 1000000	      1007 ns/op	       0 B/op	       0 allocs/op

benchstat:

name              old time/op    new time/op    delta
WordWrapString-8    1.34µs ± 2%    1.01µs ± 3%   -24.86%  (p=0.000 n=10+9)

name              old alloc/op   new alloc/op   delta
WordWrapString-8      878B ± 1%        0B       -100.00%  (p=0.000 n=9+10)

name              old allocs/op  new allocs/op  delta
WordWrapString-8      11.0 ± 0%       0.0       -100.00%  (p=0.000 n=10+10)

@kiyonlin
Copy link
Contributor Author

Update newest benchstat

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants