src: unify implementations of Utf8Value, TwoByteValue, etc. #6357

addaleax · 2016-04-23T20:15:18Z

Checklist

tests and code linting passes
the commit message follows commit guidelines

Affected core subsystem(s)

src

Description of change

Unify the common code of Utf8Value, TwoByteValue, BufferValue and StringBytes::InlineDecoder into one class. Always make the result zero-terminated for the first three.

This fixes two problems in passing:

When the conversion of the input value to String fails, make the buffer zero-terminated anyway. Previously, this would have resulted in possibly reading uninitialized data in multiple places in the code. An instance of that problem can be reproduced by running e.g. valgrind node -e 'net.isIP({ toString() { throw Error() } })'. (Most other places where Utf8Value or TwoByteValue can possibly receive non-strings work too; there’s nothing special about isIP.)
Previously, BufferValue copied one byte too much from the source, possibly resulting in an out-of-bounds memory access. This can be reproduced by running e.g. valgrind node -e 'fs.openSync(Buffer.from("node".repeat(8192)), "r")'.
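
For readers without the diff at hand, the idea described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from this PR: the class name `StackOrHeapBuffer`, its methods `EnsureStorage()`, `Assign()` and `on_stack()`, and the use of `assert` in place of Node's `CHECK_*` macros are all assumptions made for the example. It shows the two properties the description relies on: the buffer is zero-terminated from construction onward (so a failed conversion never exposes uninitialized memory), and a copy of `len` items reserves `len + 1` items but copies exactly `len`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a unified "stack first, heap if needed" buffer.
// T is assumed to be a trivially copyable character type (char, uint16_t).
template <typename T, std::size_t kStackStorageSize = 1024>
class StackOrHeapBuffer {
 public:
  StackOrHeapBuffer()
      : buf_(stack_), length_(0), capacity_(kStackStorageSize) {
    // Zero-terminate up front, so the buffer is a valid empty string
    // even if the caller never manages to write anything into it.
    buf_[0] = T();
  }

  ~StackOrHeapBuffer() {
    if (buf_ != stack_) std::free(buf_);
  }

  StackOrHeapBuffer(const StackOrHeapBuffer&) = delete;
  StackOrHeapBuffer& operator=(const StackOrHeapBuffer&) = delete;

  // Makes room for `storage` items, terminator included. Existing contents
  // are discarded when the buffer has to grow.
  void EnsureStorage(std::size_t storage) {
    if (storage <= capacity_) return;
    if (buf_ != stack_) std::free(buf_);
    buf_ = static_cast<T*>(std::malloc(sizeof(T) * storage));
    assert(buf_ != nullptr);
    capacity_ = storage;
    length_ = 0;
    buf_[0] = T();
  }

  // Copies exactly `len` items from `src` and writes the terminator after
  // them; nothing past src[len - 1] is ever read.
  void Assign(const T* src, std::size_t len) {
    EnsureStorage(len + 1);
    std::memcpy(buf_, src, len * sizeof(T));
    buf_[len] = T();
    length_ = len;
  }

  T* out() { return buf_; }
  const T* operator*() const { return buf_; }
  std::size_t length() const { return length_; }
  bool on_stack() const { return buf_ == stack_; }

 private:
  T* buf_;
  std::size_t length_;
  std::size_t capacity_;
  T stack_[kStackStorageSize];
};
```

With a stack size of 8, a four-character input stays on the stack, while a ten-character input moves to the heap; in both cases dereferencing yields a zero-terminated string.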

If you don’t like the code changes by themselves because they don’t really modify behaviour in any way, I’d still like to see these issues fixed. Since I don’t see any reason why the latter one can’t occur in real life and bite users of the Buffer-accepting fs API in v6 with a segfault, I’m marking this with the 6.0.0 milestone.

/cc @bnoordhuis @jasnell

bnoordhuis · 2016-04-25T11:41:29Z

src/util.h

-class Utf8Value {
+// Allocates an array of member type T. For up to StackStorageSize items,
+// the stack is used, otherwise malloc().
+template<typename T, size_t StackStorageSize>


Style: space after template and can you call the second parameter kStackStorageSize?

bnoordhuis · 2016-04-25T11:45:42Z

Mostly LGTM. I like how it gets rid of the duplication.

addaleax · 2016-04-25T14:42:48Z

Updated this with @bnoordhuis’ suggestions. I lifted out() from StringBytes::InlineDecoder to the new MaybeStackBuffer class and used that for the decoding stuff instead of *this. I think the name is okay, and at least it doesn’t introduce anything really new.

addaleax · 2016-04-25T14:44:43Z

CI: https://ci.nodejs.org/job/node-test-commit/3040/

jasnell · 2016-04-25T15:12:18Z

Nice! I'd been hoping to get around to doing something similar but you beat me to it! LGTM if CI is green and @bnoordhuis and @trevnorris are happy!

Fishrock123 · 2016-04-25T21:43:23Z

Clearing this from v6.0.0 since it doesn't appear to be a breaking change.

Edit: Sounds like this is a bug in v6

trevnorris · 2016-04-25T23:23:04Z

src/util.cc

-    fail_ = false;
+    size_t len = Buffer::Length(value);
+    EnsureSufficientStorage(len + 1);
+    memcpy(out(), Buffer::Data(value), len);


I'm missing why you're adding 1 above, but nowhere else.

addaleax · 2016-04-25

@trevnorris In the other places, it’s added too, but rather to the length when determining the possibly necessary buffer size.

The only reason for that is that string->WriteUtf8 and string->Write return the actual length, so the calculations for the position of the trailing 0 byte don’t have to involve the “old” pre-computed buffer size there.

I can change this to align it with the other classes if you prefer, but I personally would find having to write memcpy(…, len - 1); the weirder of two choices.
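
The distinction made in this reply can be illustrated with a small standalone sketch. None of it is code from the PR: `WriteLatin1AsUtf8()` is a hypothetical stand-in for an encoder such as `string->WriteUtf8` that reports how much it actually wrote, and `CopyTerminated()` and `EncodeTerminated()` are made-up helper names. In the copy case the length is known up front, so the `+ 1` appears at the allocation; in the encoding case the `+ 1` is folded into the worst-case storage estimate and the terminator position comes from the returned length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical encoder: converts Latin-1 bytes to UTF-8 and returns the
// number of bytes actually written, the way an encoding call would.
std::size_t WriteLatin1AsUtf8(const unsigned char* src, std::size_t n,
                              char* dst) {
  std::size_t written = 0;
  for (std::size_t i = 0; i < n; i++) {
    unsigned char c = src[i];
    if (c < 0x80) {
      dst[written++] = static_cast<char>(c);
    } else {
      dst[written++] = static_cast<char>(0xC0 | (c >> 6));
      dst[written++] = static_cast<char>(0x80 | (c & 0x3F));
    }
  }
  return written;
}

// Case 1: the length is known before copying. Reserve len + 1 items,
// copy exactly len, and put the terminator at index len.
std::vector<char> CopyTerminated(const char* data, std::size_t len) {
  std::vector<char> out(len + 1);
  std::memcpy(out.data(), data, len);
  out[len] = '\0';
  return out;
}

// Case 2: only an upper bound is known before encoding. The "+ 1" is part
// of the storage estimate; the terminator goes where the encoder stopped.
std::vector<char> EncodeTerminated(const unsigned char* src, std::size_t n,
                                   std::size_t* length) {
  std::size_t storage = 2 * n + 1;  // worst case for Latin-1, plus terminator
  std::vector<char> out(storage);
  *length = WriteLatin1AsUtf8(src, n, out.data());
  assert(*length < storage);
  out[*length] = '\0';
  return out;
}
```

Either way the invariant is the same: the allocated storage is always at least one item larger than the data length.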

trevnorris · 2016-04-25T23:29:04Z

Great work. Don't have time at the moment to give this as thorough a review as I'd like, but I'll pick this up again tonight.

bnoordhuis · 2016-04-27T10:12:16Z

src/util.cc

-                          Local<Value> value,
-                          char** dst,
-                          const size_t size) {
+template<typename T>


Style: space after template.

addaleax · 2016-04-27T11:54:25Z

Updated this with nits addressed and using storage instead of len to (hopefully) have consistent and more appropriate variable naming.

trevnorris · 2016-04-27T17:59:17Z

src/util.h

+      } else {
+        buf_ = static_cast<T*>(malloc(sizeof(T) * storage));
+        CHECK_NE(buf_, nullptr);
+      }


Could you add length_ = sizeof(T) * storage, then do something like CHECK_LE(length_, length) in SetLength()? Just an extra sanity check to make sure we're never accidentally allowing the writable length to be greater than the allocated memory.
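
A sanity check of the kind requested here might look roughly like the sketch below. It is an illustration under stated assumptions rather than what was committed: `CheckedBuffer`, `FitsWithTerminator()` and `SetLengthAndZeroTerminate()` are hypothetical names, the capacity is counted in items rather than bytes, and `assert` stands in for Node's `CHECK_*` macros.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity buffer that refuses lengths it cannot hold.
template <typename T, std::size_t kCapacity>
class CheckedBuffer {
 public:
  CheckedBuffer() : length_(0) { buf_[0] = T(); }

  // True if `length` items plus one terminator fit into the storage.
  static constexpr bool FitsWithTerminator(std::size_t length) {
    return length < kCapacity;
  }

  // Records the number of valid items and writes the terminator after
  // them. The check guards against a length larger than the allocation.
  void SetLengthAndZeroTerminate(std::size_t length) {
    assert(FitsWithTerminator(length));
    length_ = length;
    buf_[length] = T();
  }

  T* out() { return buf_; }
  std::size_t length() const { return length_; }

 private:
  std::size_t length_;
  T buf_[kCapacity];
};
```

Writing the check as `length < kCapacity` instead of `length + 1 <= kCapacity` avoids an overflow when `length` is the largest representable value.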

addaleax · 2016-04-27T19:28:10Z

Rebased, squashed and updated with @trevnorris’ recent suggestions

addaleax · 2016-04-28T16:17:16Z

CI again: https://ci.nodejs.org/job/node-test-commit/3077/

Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue` and `StringBytes::InlineDecoder` into one class. Always make the result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails, make the buffer zero-terminated anyway. Previously, this would have resulted in possibly reading uninitialized data in multiple places in the code. An instance of that problem can be reproduced by running e.g. `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source, possibly resulting in an out-of-bounds memory access. This can be reproduced by running e.g. `valgrind node -e 'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:

* This lifts the `out()` method of `StringBytes::InlineDecoder` to the common class so that it can be used when using the overloaded `operator*` does not seem appropriate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis
Reviewed-By: James M Snell
Reviewed-By: Trevor Norris

MylesBorins · 2016-06-01T23:48:49Z

@addaleax lts?

addaleax · 2016-06-02T00:10:51Z

@thealphanerd I think the change here depends on too much other stuff but I’ll make a backport PR with one of the bugfixes here.

Make sure dereferencing a `Utf8Value` instance always returns a zero-terminated string, even if the conversion to string failed.

The corresponding bugfix in the master branch happened in 44a4032 (#6357).

Ref: #6357
PR-URL: #7101
Reviewed-By: Ben Noordhuis
Reviewed-By: James M Snell
Reviewed-By: Myles Borins

addaleax added the c++ Issues and PRs that require attention from people who are familiar with C++. label Apr 23, 2016

addaleax added this to the 6.0.0 milestone Apr 23, 2016

bnoordhuis reviewed Apr 25, 2016

Fishrock123 removed this from the 6.0.0 milestone Apr 25, 2016

Fishrock123 added the confirmed-bug Issues with confirmed bugs. label Apr 25, 2016

trevnorris reviewed Apr 25, 2016

estliberitas force-pushed the master branch 2 times, most recently from 7da4fd4 to c7066fb Compare April 26, 2016 05:23

bnoordhuis reviewed Apr 27, 2016

trevnorris reviewed Apr 27, 2016

addaleax force-pushed the unify-string-values branch from a7aa94f to 14c7a02 Compare April 27, 2016 19:11

addaleax closed this Apr 29, 2016

addaleax deleted the unify-string-values branch April 29, 2016 13:48

This was referenced May 4, 2016

Propose v6.1.0 wrong base please ignore #6556

Closed

Propose v6.1.0 (Openssl update) #6557

Merged

MylesBorins added the lts-watch-v4.x label Jun 1, 2016

addaleax added dont-land-on-v5.x and removed lts-watch-v4.x labels Jun 2, 2016

addaleax mentioned this pull request Jun 2, 2016

v4.x: src: make sure Utf8Value always zero-terminates #7101

Closed

MylesBorins mentioned this pull request Jul 12, 2016

v4.5.0 proposal #7688

Merged

addaleax mentioned this pull request Mar 26, 2017

module: add support for abi stable module API #11975

Closed
