Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: unify implementations of Utf8Value, TwoByteValue, etc. #6357

Closed
wants to merge 1 commit into from

Conversation

addaleax
Copy link
Member

Checklist
  • tests and code linting passes
  • the commit message follows commit guidelines
Affected core subsystem(s)

src

Description of change

Unify the common code of Utf8Value, TwoByteValue, BufferValue and StringBytes::InlineDecoder into one class. Always make the result zero-terminated for the first three.

This fixes two problems in passing:

  • When the conversion of the input value to String fails, make the buffer zero-terminated anyway. Previously, this would have resulted in possibly reading uninitialized data in multiple places in the code. An instance of that problem can be reproduced by running e.g. valgrind node -e 'net.isIP({ toString() { throw Error() } })'. (Most other places where Utf8Value or TwoByteValue can possibly receive non-strings work too, there’s nothing special about isIP.)
  • Previously, BufferValue copied one byte too much from the source, possibly resulting in an out-of-bounds memory access. This can be reproduced by running e.g. valgrind node -e 'fs.openSync(Buffer.from("node".repeat(8192)), "r")'.

If you don’t like the code changes by themselves because they don’t really modify behaviour in any way, I’d still like to see these issues fixed. Since I don’t see any reason why the latter one can’t occur in real life and bite users of the Buffer-accepting fs API in v6 with a segfault, I’m marking this with the 6.0.0 milestone.

/cc @bnoordhuis @jasnell

@addaleax addaleax added the c++ Issues and PRs that require attention from people who are familiar with C++. label Apr 23, 2016
@addaleax addaleax added this to the 6.0.0 milestone Apr 23, 2016
class Utf8Value {
// Allocates an array of member type T. For up to StackStorageSize items,
// the stack is used, otherwise malloc().
template<typename T, size_t StackStorageSize>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: space after template and can you call the second parameter kStackStorageSize?

@bnoordhuis
Copy link
Member

Mostly LGTM. I like how it gets rid of the duplication.

@addaleax
Copy link
Member Author

Updated this with @bnoordhuis’ suggestions. I lifted out() from StringBytes::InlineDecoder to the new MaybeStackBuffer class and used that for the decoding stuff instead of *this. I think the name is okay, and at least it doesn’t introduce anything really new.

@addaleax
Copy link
Member Author

@jasnell
Copy link
Member

jasnell commented Apr 25, 2016

Nice! I'd been hoping to get around to doing something similar but you beat me to it! LGTM if CI is green and @bnoordhuis and @trevnorris are happy!

@Fishrock123 Fishrock123 removed this from the 6.0.0 milestone Apr 25, 2016
@Fishrock123
Copy link
Contributor

Fishrock123 commented Apr 25, 2016

Clearing this from v6.0.0 since it doesn't appear to be a breaking change.

Edit: Sounds like this is a bug in v6

@Fishrock123 Fishrock123 added the confirmed-bug Issues with confirmed bugs. label Apr 25, 2016
fail_ = false;
size_t len = Buffer::Length(value);
EnsureSufficientStorage(len + 1);
memcpy(out(), Buffer::Data(value), len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing why you're adding 1 above, but no where else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trevnorris In the other places, it’s added too, but rather to the length when determining the possibly necessary buffer size.

The only reason for that is that string->WriteUtf8 and string->Write return the actual length, so the calculations for the position of the trailing 0 byte don’t have to involve the “old” pre-computed buffer size there.

I can change this to align it with the other classes if you prefer, but I personally would find having to write memcpy(…, len - 1); the weirder of two choices.

@trevnorris
Copy link
Contributor

Great work. Don't have time at the moment to give this as thorough a review as I'd like, but I'll pick this up again tonight.

Local<Value> value,
char** dst,
const size_t size) {
template<typename T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: space after template.

@addaleax
Copy link
Member Author

Updated this with nits addressed and using storage instead of len to (hopefully) have consistent and more appropriate variable naming.

} else {
buf_ = static_cast<T*>(malloc(sizeof(T) * storage));
CHECK_NE(buf_, nullptr);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add length_ = sizeof(T) * storage, then do something like CHECK_LE(length_, length) in SetLength()? Just an extra sanity check to make sure we're never accidentally allowing the writable length to be greater than the allocated memory.

@addaleax
Copy link
Member Author

Rebased, squashed and updated with @trevnorris’ recent suggestions

@addaleax
Copy link
Member Author

@addaleax addaleax closed this Apr 29, 2016
@addaleax addaleax deleted the unify-string-values branch April 29, 2016 13:48
Fishrock123 pushed a commit that referenced this pull request May 4, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request May 4, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: nodejs#6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
@MylesBorins
Copy link
Contributor

@addaleax lts?

@addaleax
Copy link
Member Author

addaleax commented Jun 2, 2016

@thealphanerd I think the change here depends on too much other stuff but I’ll make a backport PR with one of the bugfixes here.

addaleax added a commit to addaleax/node that referenced this pull request Jun 2, 2016
Make sure dereferencing a `Utf8Value` instance always returns
a zero-terminated string, even if the conversion to string failed.

The corresponding bugfix in the master branch happened in 44a4032
(nodejs#6357).
MylesBorins pushed a commit that referenced this pull request Jun 2, 2016
Make sure dereferencing a `Utf8Value` instance always returns
a zero-terminated string, even if the conversion to string failed.

The corresponding bugfix in the master branch happened in 44a4032
(#6357).

Ref: #6357
PR-URL: #7101
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jun 24, 2016
Make sure dereferencing a `Utf8Value` instance always returns
a zero-terminated string, even if the conversion to string failed.

The corresponding bugfix in the master branch happened in 44a4032
(#6357).

Ref: #6357
PR-URL: #7101
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jun 24, 2016
Make sure dereferencing a `Utf8Value` instance always returns
a zero-terminated string, even if the conversion to string failed.

The corresponding bugfix in the master branch happened in 44a4032
(#6357).

Ref: #6357
PR-URL: #7101
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
addaleax added a commit to addaleax/node that referenced this pull request Jul 12, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: nodejs#6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Jul 12, 2016
MylesBorins pushed a commit that referenced this pull request Jul 14, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jul 14, 2016
Unify the common code of `Utf8Value`, `TwoByteValue`, `BufferValue`
and `StringBytes::InlineDecoder` into one class. Always make the
result zero-terminated for the first three.

This fixes two problems in passing:

* When the conversion of the input value to String fails,
  make the buffer zero-terminated anyway. Previously, this would
  have resulted in possibly reading uninitialized data in multiple
  places in the code. An instance of that problem can be reproduced
  by running e.g.
  `valgrind node -e 'net.isIP({ toString() { throw Error() } })'`.
* Previously, `BufferValue` copied one byte too much from the source,
  possibly resulting in an out-of-bounds memory access.
  This can be reproduced by running e.g.
  `valgrind node -e \
    'fs.openSync(Buffer.from("node".repeat(8192)), "r")'`.

Further minor changes:
* This lifts the `out()` method of `StringBytes::InlineDecoder`
  to the common class so that it can be used when using the
  overloaded `operator*` does not seem appropiate.
* Hopefully clearer variable names.
* Add checks to make sure the length of the data does not exceed
  the allocated storage size, including the possible null terminator.

PR-URL: #6357
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. confirmed-bug Issues with confirmed bugs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants