Major performance improvement #3
Conversation
`Reader.prototype.addChunk` was calling `Buffer.concat` constantly, which increased garbage collection and just all-around killed performance. The exact implications of this are documented in brianc/node-postgres#1286, which includes a test case showing how performance is affected. Rather than constantly concatenating buffers up to the new size, this change uses a growth strategy that doubles the size of the buffer each time and tracks the functional length in a separate `chunkLength` variable. This significantly reduces the amount of allocation and provides a 25x performance improvement in my test cases; the more data a query returns, the greater the improvement. Since this uses a doubling buffer, it was important to avoid unbounded growth, so I also added a reclamation strategy that halves the size of the buffer whenever more than half of the data has been read.

I wasn't sure if it was okay to delete `Reader.prototype._save` outright, since it now just returns false in all cases. I didn't add any new unit tests because the existing ones should cover the refactored code, but let me know if you would like me to add something specific.

Also, thanks for `node-postgres`, it's awesome and hopefully this helps make it even more awesome. 😄
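For illustration, here is a minimal sketch of the growth strategy described above (not the exact patch; it assumes the `chunk`/`chunkLength`/`offset` fields used in this PR, with `chunkLength` initialized to 0 in the constructor):

```js
// Minimal sketch of the doubling growth strategy (illustrative, not the
// exact patch). `chunk` is the backing buffer, `chunkLength` is the number
// of bytes actually buffered, and `offset` is the read position.
Reader.prototype.addChunk = function (chunk) {
  if (this.chunk === null) {
    // first chunk: just keep a handle to it, no copying needed
    this.chunk = chunk
    this.chunkLength = chunk.length
    return
  }
  var newLength = this.chunkLength + chunk.length
  if (newLength > this.chunk.length) {
    // double the backing buffer until the new data fits, so the number
    // of allocations is logarithmic in the total bytes received
    var newBufferLength = this.chunk.length * 2
    while (newLength > newBufferLength) {
      newBufferLength *= 2
    }
    var newBuffer = new Buffer(newBufferLength) // Buffer.alloc on newer node
    this.chunk.copy(newBuffer, 0, 0, this.chunkLength)
    this.chunk = newBuffer
  }
  chunk.copy(this.chunk, this.chunkLength)
  this.chunkLength = newLength
}
```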
This is great! Thank you! The perf improvements are really exciting! 💃 I have been doing this for outgoing packets for as long as node-postgres has existed: https://github.com/brianc/node-buffer-writer/blob/master/index.js#L12. I think I've just overlooked the incoming packets for one reason or another.
The only thing I'm not really feeling is resizing the buffer back down by doing copies from larger to smaller buffers. I think there's a way to do it more efficiently, while still reclaiming memory at a reasonable pace. Postgres packets are length-prefixed, so frequently `this.offset === this.chunk.length` will be true, meaning the entire response has been consumed. I would check for this condition at the top of `addChunk`, and if the entire existing buffered input has been consumed, just set `this.chunk = chunk` and return. That way the buffer grows to the entire length of a huge incoming packet with your efficient doubling algorithm, but is only shrunk once the whole thing is consumed. I think in practice this will be simpler and hopefully even more performant.
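A rough sketch of that suggestion (illustrative only; with the `chunkLength` bookkeeping from this PR, the consumed check would compare against `chunkLength` rather than `chunk.length`):

```js
// Illustrative sketch of the suggestion above: if everything buffered so
// far has already been consumed, adopt the incoming chunk directly instead
// of copying it in and shrinking the backing buffer later.
Reader.prototype.addChunk = function (chunk) {
  if (this.chunk === null || this.offset === this.chunkLength) {
    this.chunk = chunk
    this.chunkLength = chunk.length
    this.offset = 0
    return
  }
  // ...otherwise fall through to the doubling growth path
}
```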
index.js (Outdated)

```diff
@@ -8,32 +8,48 @@ var Reader = module.exports = function(options) {
   options = options || {}
   this.offset = 0
   this.lastChunk = false
-  this.chunk = null
+  this.chunk = Buffer.alloc(4);
```
currently node-postgres supports node @v0.10.x onward - does `Buffer.alloc` exist in older versions of node?
note to self: I need to set up travis on this repo.
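For reference, `Buffer.alloc` was only added in Node v5.10 (backported to v4.5), so it does not exist on v0.10.x. A hypothetical fallback, if pre-allocation were kept, might look like the following (this shim is not part of the PR; the PR ultimately avoids pre-allocation entirely, as noted below):

```js
// Hypothetical fallback for node versions without Buffer.alloc.
// On old node, new Buffer(size) returns uninitialized memory, so zero it
// explicitly in case the contents are read before being written.
function allocBuffer (size) {
  if (Buffer.alloc) {
    return Buffer.alloc(size) // zero-filled on modern node
  }
  var buf = new Buffer(size)
  buf.fill(0)
  return buf
}
```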
I realized I can just go back to assigning this to null and just keeping a handle to the first chunk added.
index.js (Outdated)

```js
    this.chunk.copy(newBuffer, 0, this.offset);
    this.chunk = newBuffer;
    this.chunkLength -= this.offset;
    this.offset = 0;
  }
}

Reader.prototype._save = function() {
```
yeah if this method is only returning `false` and doing nothing else we might as well inline it (though v8 probably is anyway)
Thanks for the review, @brianc, working on the requested changes now.
Thanks, I was actually hoping you might have some insight into this area (I know nothing about the postgres wire protocol); I just didn't want to open a PR with what I thought might end up as unbounded memory growth. Your suggestion is a lot cleaner and more predictable.
1. Fix Node @v0.10.x support by removing `Buffer.alloc` usage.
2. Remove unneeded `Reader.prototype._save`.
3. Reset the buffer at the end of the packet instead of shrinking dynamically.
4. Remove some semicolons to keep style consistent.
@brianc, I believe I've addressed all of your review comments. I also tested on v0.10.48 just to confirm I didn't break anything there. Please let me know if there are any other changes you'd like. Thanks again for the review.
This PR is amazing & I really appreciate the way you approached everything! 👏 I'm super excited to test this out & get it merged. I've got a ton on my plate right now, but I'll do everything I can to get this tested and merged in by EOD tomorrow. If for some reason I get sandbagged so badly at work I can't get to this, know it'll be in by the end of the weekend! If I had travis set up on this repo it'd be a lot easier actually, but this thing has changed so little in the past 5 years, and when I initially set it up I didn't have the rigor I do now... so I'll try & get that set up tomorrow too!
Thank you for the kind words. By the weekend would be awesome, but I'm just glad to know it's on your radar. If you uncover anything during your testing, just let me know and I can fix it (or if it's easier for you, feel free to just tweak things yourself, I don't mind at all). Thanks again!