
fix: major performance issues with bytea performance #2240 #2241

Merged · 11 commits merged into brianc:master on Jul 7, 2020

Conversation

@regevbr (Contributor) commented Jun 18, 2020

fix: #2240

A simple query that retrieves a 100 MB bytea column from a one-table, one-row DB takes forever and causes the CPU to spike at 100%, rendering my Node instance unresponsive.

The issue is exactly like #1286, and the fix is roughly the same as brianc/node-packet-reader#3.

The code at:

// On every incoming chunk, allocate a brand-new buffer large enough for everything received so far
combinedBuffer = Buffer.allocUnsafe(this.remainingBuffer.byteLength + buffer.byteLength)
// Copy the previously accumulated bytes into it
this.remainingBuffer.copy(combinedBuffer)
// Then copy the new chunk in after them
buffer.copy(combinedBuffer, this.remainingBuffer.byteLength)

suffers from too many allocations and buffer copies.

The fix is a bit messy but it gets the job done.

Before the fix, fetching a 100 MB bytea column took 3 minutes and 10 seconds and consumed on average 95% CPU for the entire duration.

After the fix, the same query took only 40 seconds and consumed on average 3% CPU.
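
For illustration only (hypothetical names, not the PR's actual code): the general direction of the fix is to keep remainingBuffer over-allocated and track cursors into it, so that most chunks are appended in place instead of re-allocating and re-copying everything accumulated so far on every 'data' event. With typical socket chunk sizes (tens of KB), a 100 MB value otherwise means hundreds to thousands of re-allocations, the later ones each copying close to the full 100 MB.

class BufferAccumulator {
  private remainingBuffer: Buffer = Buffer.allocUnsafe(0)
  private remainingOffset = 0
  private remainingLength = 0

  public append(chunk: Buffer): void {
    const needed = this.remainingLength + chunk.byteLength
    const capacity = this.remainingBuffer.byteLength - this.remainingOffset
    if (needed > capacity) {
      // Grow with headroom so the next several chunks fit without another allocation
      const grown = Buffer.allocUnsafe(needed * 2)
      this.remainingBuffer.copy(grown, 0, this.remainingOffset, this.remainingOffset + this.remainingLength)
      this.remainingBuffer = grown
      this.remainingOffset = 0
    }
    // Append the new chunk after the bytes already accumulated
    chunk.copy(this.remainingBuffer, this.remainingOffset + this.remainingLength)
    this.remainingLength += chunk.byteLength
  }
}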

@brianc (Owner) commented Jun 18, 2020

ah wow thanks - i'll check this out now & get it pushed out

@regevbr (Contributor, Author) commented Jun 18, 2020

@brianc no prob, a review now would be welcome, but some tests are failing so it is not ready to be merged just yet. I will ping you when it is ready.

@brianc (Owner) commented Jun 18, 2020

at first glance this may have regressed performance in other places - travis is failing on 'big-simple-query-tests'

@brianc (Owner) commented Jun 18, 2020

ah gotcha - i'll let you pound on it a bit more then & check back later!

@regevbr (Contributor, Author) commented Jun 18, 2020

@brianc that sounds great!

@regevbr (Contributor, Author) commented Jun 18, 2020

@brianc I fixed the bug and the CI is passing now :-)
Can you please review?

@regevbr (Contributor, Author) commented Jun 19, 2020

I made some more improvements and refactoring and added a few comments. It is now really ready for review

// Adjust the cursors of remainingBuffer
this.remainingBufferOffset = combinedBufferOffset
} else {
// To avoid side effects, copy the remaining part of the new buffer to remainingBuffer with extra space for next buffer
@regevbr (Contributor, Author) commented on this diff:
@brianc this part is the only thing I think I can still improve. The question is whether I can use the input buffer for the next run without anyone modifying it out of scope. What do you think?

@brianc (Owner) replied:

What do you mean by anyone modifying it out of scope? The only place in the whole codebase where we deal w/ raw input buffers from the socket is here in parser.ts, and it's never exposed to the user, so any modifications you do here are safe in that no one else should (or can, AFAIK) ever modify input buffers from the socket.

@regevbr (Contributor, Author) replied:

It will be better here to use the existing input buffer for the next run instead of copying it over. What you are saying is that it is safe for me to do (as far as you know), which is great! I will change it then.
I was worried that whoever passes that buffer could decide to reuse it themselves, and then it would mess with my code...

@brianc (Owner) replied:

the buffer is passed in here as the whole net.Socket

The only place I could imagine this might cause issues is w/ some of the tests which may incorrectly be reusing buffers though nothing comes to mind. Some of the tests are old and kinda wonky so you could just try & run the tests...as long as they pass we're definitely good. The raw buffers of the socket are never exposed anywhere outside of the parser.
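
For context, the path by which raw chunks reach the parser is essentially the socket's 'data' event; a schematic sketch (the parse callback name is assumed, not the library's actual wiring):

import { Socket } from 'net'

// Each 'data' event hands the listener a Buffer; per the discussion above,
// nothing else in the codebase touches these buffers afterwards, so the
// parser can safely retain them across calls.
function attachParser(socket: Socket, parse: (chunk: Buffer) => void): void {
  socket.on('data', (chunk: Buffer) => parse(chunk))
}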

@regevbr (Contributor, Author) replied:

Great, will do.

@regevbr (Contributor, Author) commented Jun 23, 2020

I used a nice tool called clinic.js to visualize the before and after of this PR.
Before: [clinic doctor report: before-doctor]
After: [clinic doctor report: after-doctor]

Before: [flamegraph: before-flames]
After: [flamegraph: after-flames]

@regevbr (Contributor, Author) commented Jun 24, 2020

@brianc did you have a chance to review my PR?

@regevbr (Contributor, Author) commented Jul 2, 2020

@brianc @charmander can you please CR?

@brianc (Owner) commented Jul 2, 2020

I'm sorry for the delay. I've been having some anxiety attacks related to covid (I live in Austin, the #1 worst city in the US for new cases) & I've been really slacking on this work trying to keep my mental health intact as best I can. I'll take a look at this now...I really appreciate the work.

@regevbr (Contributor, Author) commented Jul 2, 2020

Oh bummer, I'm sorry you have to go through that! I hope you overcome it soon :-)
I don't want to put any more pressure on you, so don't feel obliged to work on this if you don't feel like it.
I trust that when everything is back to normal for you, you will find the time to look it over.

@brianc (Owner) commented Jul 2, 2020

I looked over this - it looks pretty good! I ran my crude benchmarks too & perf doesn't seem to be impacted. Is there any way to write any kind of test for this? It's a great change, I just worry w/o tests it might regress in the future (as this currently already was kind of a regression when I moved away from buffer-reader since this lib didn't contain any tests that stress dealing w/ big bytea returns)

combinedBufferLength,
reuseRemainingBuffer,
combinedBufferFullLength,
})
@brianc (Owner) commented on this diff:

I have a question for you... a long time ago it was more efficient to make "backing classes" for things you return frequently, as v8 can optimize them much more aggressively. You are using a type CombinedBuffer and constructing it w/ the object shorthand here & returning it w/ the object shorthand on line 172. Do you think it'd make a difference to actually make this a class?
something like this...

class CombinedBuffer {
  constructor(public combinedBuffer: Buffer, public combinedBufferOffset: number, public combinedBufferLength: number, public reuseRemainingBuffer: boolean, public combinedBufferFullLength: number) {}
}

and then return it with this.consumeBuffer(new CombinedBuffer(combinedBuffer, offset, combinedBufferLength, reuseRemainingBuffer, combinedBufferFullLength))

?

Staying on top of how v8 optimizes things is a moving target, so I'm not sure if you have any experience around the perf diff of using dynamically created objects versus proper "classes." I wouldn't normally sweat it, but this is a white-hot code path & it might be a small perf win here.

@brianc (Owner) added:

You might also be able to allocate a single instance of CombinedBuffer on the parser class itself & just set the properties on it versus allocating new ones every time. Again, a micro-optimization but those things can add up when the code is called so frequently.

I can play around w/ this too & push to your branch (I think github lets me do that?) if you're cool w/ it & think it's an okay idea.
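
A minimal sketch of that single-reused-instance idea, with assumed names and shape (this is not the PR's actual code):

class CombinedBuffer {
  public combinedBuffer: Buffer = Buffer.alloc(0)
  public combinedBufferOffset = 0
  public combinedBufferLength = 0
  public reuseRemainingBuffer = false
  public combinedBufferFullLength = 0
}

class Parser {
  // One instance allocated up front and reused on every call, so the hot path
  // creates no new result objects for the garbage collector to deal with.
  private readonly combined = new CombinedBuffer()

  private mergeBuffer(buffer: Buffer): CombinedBuffer {
    this.combined.combinedBuffer = buffer
    this.combined.combinedBufferOffset = 0
    this.combined.combinedBufferLength = buffer.byteLength
    this.combined.reuseRemainingBuffer = false
    this.combined.combinedBufferFullLength = buffer.byteLength
    return this.combined // callers must not retain this object across calls
  }
}

The trade-off is that the returned object is only valid until the next call, which is fine here because the parser consumes it immediately.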

@regevbr (Contributor, Author) replied:

That is an interesting solution; why don't you let me research it (as I'm not too familiar with the v8 bits) and try to enhance it.

@regevbr (Contributor, Author) commented Jul 2, 2020

> I looked over this - it looks pretty good! I ran my crude benchmarks too & perf doesn't seem to be impacted. Is there any way to write any kind of test for this? It's a great change, I just worry w/o tests it might regress in the future (as this currently already was kind of a regression when I moved away from buffer-reader since this lib didn't contain any tests that stress dealing w/ big bytea returns)

I'm glad to hear that, thanks!
Do you have a test scenario in place? I believe it will be a bit complicated, as you will need to stress test it and make sure that the CPU doesn't get hogged.
I will try to add a test and see if I manage to get it to fail with the current code.

@brianc (Owner) commented Jul 2, 2020

> I will try to add a test and see if I manage to get it to fail with the current code

Yeah if you can't then I understand. It's incredibly difficult to test how much CPU some code is using within a unit test. It'd be ideal to have some way to catch a performance regression. You could add another benchmark to this crude benchmark file. I run that before any merge & manually check for perf regressions.
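
For reference, a standalone benchmark along those lines might look like this (a sketch; the repo's own bench file has its own structure, and connection settings are assumed to come from the usual PG* environment variables):

import { Client } from 'pg'

async function benchBytea(sizeMb: number): Promise<void> {
  const client = new Client() // connection details taken from PG* env vars
  await client.connect()
  const payload = Buffer.alloc(sizeMb * 1024 * 1024, 0xab)
  const start = Date.now()
  // Round-trip a large bytea value through the server and back through the parser
  const res = await client.query('SELECT $1::bytea AS data', [payload])
  console.log(`${sizeMb}MB bytea round-trip: ${Date.now() - start}ms, ${res.rows[0].data.byteLength} bytes received`)
  await client.end()
}

benchBytea(100).catch((err) => {
  console.error(err)
  process.exit(1)
})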

@regevbr (Contributor, Author) commented Jul 2, 2020

> > I will try to add a test and see if I manage to get it to fail with the current code
>
> Yeah if you can't then I understand. It's incredibly difficult to test how much CPU some code is using within a unit test. It'd be ideal to have some way to catch a performance regression. You could add another benchmark to this crude benchmark file. I run that before any merge & manually check for perf regressions.

Ok will check it out, thanks!

@regevbr (Contributor, Author) commented Jul 3, 2020

@brianc I started reading about the performance difference between object literals and classes (or, more specifically, constructor functions), and indeed back in 2014 there was a difference, but it doesn't exist anymore...
For example, you can look at this post, but if you run the jsperf test used there today, there is no difference.
But all of that doesn't matter for us: since I retained the input buffer for the next run, the code got much simpler and there is no need to pass objects around anymore.
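
Roughly, the retained-input-buffer approach looks like this (a simplified sketch with assumed names, not the parser's actual code):

class ChunkMerger {
  private remainingBuffer: Buffer | null = null

  public mergeBuffer(buffer: Buffer): Buffer {
    if (this.remainingBuffer === null) {
      // Fast path: nothing left over from the previous chunk, parse it directly
      return buffer
    }
    // Slow path: a message straddles the chunk boundary, so one copy is unavoidable
    const combined = Buffer.allocUnsafe(this.remainingBuffer.byteLength + buffer.byteLength)
    this.remainingBuffer.copy(combined)
    buffer.copy(combined, this.remainingBuffer.byteLength)
    this.remainingBuffer = null
    return combined
  }

  public retain(buffer: Buffer, offset: number): void {
    // Keep a view over the unparsed tail for the next run; slice() does not copy,
    // which is safe because nothing else ever modifies the socket's input buffers
    this.remainingBuffer = buffer.slice(offset)
  }
}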

I also added a bench test. Running it against the current code takes 144,954 ms(!), and after the fix it runs in only 1,107 ms, which is an insane improvement.

Can you please CR again and run the bench tests locally to verify the fix?

I appreciate your time looking into it and hope things are better back home!

@brianc (Owner) commented Jul 6, 2020

Fantastic! Thanks for your work on this! ❤️ I'll merge this tomorrow in the AM & do a patch release.

brianc merged commit 9ba49b7 into brianc:master on Jul 7, 2020
regevbr deleted the #2240 branch on July 7, 2020, 13:35
@regevbr (Contributor, Author) commented Jul 7, 2020

Thanks @brianc for merging!
Please let me know when there is an official release with the fix :-)

@brianc (Owner) commented Jul 7, 2020

@regevbr Thanks for your persistence and great work on this. I apologize for the delays on my end. I like to release in the AM (CST) so I can be around all day in case some weird incompatibility goes out. I've released your changes with [email protected]. Going to put them into my prod app today!

@regevbr (Contributor, Author) commented Jul 7, 2020

@brianc sure, not a problem! Always happy to help.
Your delays are more than justifiable, and in any case it is not my place to judge, as this is your (awesome) library :-)
I will update my repos as well and let you know if everything works.
Thanks again!

Successfully merging this pull request may close: Major performance issues with bytea performance (#2240)