Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: fix range checking for slowToString #2919

Conversation

thefourtheye
Copy link
Contributor

Following the discussion #2668,
this patch fixes the first problem discussed there. If start is not
a valid number in the range, then the default value zero will be used.
Same way, if end is not a valid number in the accepted range, then,
by default, the length of the buffer is assumed.

cc @trevnorris @ChALkeR

@thefourtheye thefourtheye added the buffer Issues and PRs related to the buffer subsystem. label Sep 16, 2015
@trevnorris
Copy link
Contributor

I would change the commit message to start with This patch fixes ... then place Following the discussion nodejs#2668 at the bottom as metadata. e.g.

Ref: https://github.com/nodejs/node/issues/2668

end = this.length;
}
end = +end;
end = end < 0 ? this.length : end >>> 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want NaN values to coerce to 0? (realize that is what was done before, but while were here :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I ll make that to this.length :-)

@thefourtheye
Copy link
Contributor Author

@trevnorris Updated for NaN.

@trevnorris
Copy link
Contributor

Great. If CI is happy then LGTM.

This is a semver-major, so will go in the v5 release.

@brendanashworth brendanashworth added the semver-major PRs that contain breaking changes and should be released in the next major version. label Sep 17, 2015
@thefourtheye
Copy link
Contributor Author

@ChALkeR
Copy link
Member

ChALkeR commented Sep 18, 2015

I am not able to test this now.
What happens on large positive / large negative numbers? Could you include those in the testcase?

@trevnorris
Copy link
Contributor

@ChALkeR They'll wrap. Same as they always have.

@ChALkeR
Copy link
Member

ChALkeR commented Sep 19, 2015

@trevnorris Ah. So that is out of scope for this commit.

This commit now converts negative and NaN values of end to this.length. This looks mostly compatible to what there was before this patch.

LGTM, though such behavior is a bit strange.

@ChALkeR
Copy link
Member

ChALkeR commented Sep 19, 2015

Judging from the surrounding code, it looks like negative values of end were meant to return an empty string. @trevnorris, what do you think? Though that will be a breaking change, while this one isn't.

@trevnorris
Copy link
Contributor

@ChALkeR Good catch. You are correct.

This PR is a great fix for existing undefined behavior, and making end equal to the length if less than zero seems counter intuitive. Personal opinion, it should be set equal to zero.

@thefourtheye
Copy link
Contributor Author

@trevnorris @ChALkeR Made sure that the negative value for end is taken as 0.

@trevnorris
Copy link
Contributor

I'd still recommend putting the reference issue as I posted above, but if CI is good then LGTM.

@thefourtheye
Copy link
Contributor Author

CI Run before landing: https://ci.nodejs.org/job/node-test-pull-request/364/

@trevnorris I'll fix that while landing. @ChALkeR LGTY?

@ChALkeR
Copy link
Member

ChALkeR commented Sep 22, 2015

LGTM code-wise.

Only a minor notice that I would re-order the lines and group together the checks based on the variable that they are related to. Something like this:

  start = +start;
  if (!Number.isInteger(start)) {
    start = start >>> 0;
  }
  if (start < 0) start = 0;

  if (end === undefined || end === Infinity || end !== end) {
    end = this.length;
  }
  end = +end < 0 ? 0 : end >>> 0;
  if (end > this.length) end = this.length;
  if (end <= start) return '';

  if (!encoding) encoding = 'utf8';
  var loweredCase = false;

But that is not required and not everyone might agree with this change, I guess.
@trevnorris ?

@ChALkeR
Copy link
Member

ChALkeR commented Sep 22, 2015

Eh? this is semver-major? If yes, should the start/end arguments be fixed here completely so they don't get wrapped? @trevnorris, what do you think?

Here is what I mean:

  start = +start;
  if (start >= this.length) return '';
  start = (start > 0) ? start >>> 0 : 0;

  if (end === undefined || end === Infinity || end !== end) {
    end = this.length;
  } else {
    end = +end;
    if (end <= start) return '';
    end = (end <= this.length) ? end >>> 0 : this.length;
  }

  if (!encoding) encoding = 'utf8';
  var loweredCase = false;

@trevnorris
Copy link
Contributor

Was going to save some of this for later, but while we're here.

I'd say while your here we should make sure parsing is consistent. For example end is first uint32 coerced in JS then int32 coerced in C++. I may suggest we change ParseArrayIndex() in node_internals.h to use Uint32Value() instead of Int32Value(). Then make sure values in JS are a uint32.

This would mean the values will go through n >>> 0 before being passed to C++.

Open to ideas on how we handle out of range values. I suppose we'll floor any non-integers that are in the uint32 range with the >>> operation.

@ChALkeR
Copy link
Member

ChALkeR commented Sep 22, 2015

Yeah, the internals should have unsigned 32-bit numbers, and the js side should validate the indicies for them to be in the allowable range (from 0 to this.length). That's exactly what I proposed on the js side in the above message.

Atm, this PR converts the start to start >>> 0 without first checking that it's in between 0 and this.length, so 4294967296 would become 0, which is unexpected and should probably be fixed here while we are on a semver-major change.

@trevnorris
Copy link
Contributor

okay. so how about this:

if (start >= this.length || end <= 0)
  return '';

if (start < 0 || start >>> 0 != parseInt(+start))
  start = 0;
else
  start >>>= 0;

if (end > this.length || end >>> 0 != parseInt(+end))
  end = this.length;
else
  end >>>= 0;

if (end <= start)
  return '';

That should handle all the cases discussed on IRC.

UPDATE: Based on IRC discussion.

@ChALkeR
Copy link
Member

ChALkeR commented Sep 22, 2015

@trevnorris That code LGTM.

@trevnorris
Copy link
Contributor

@ChALkeR awesome. Great team work everyone putting together these coercion checks.

If `start` is not a valid number in the range, then the default value
zero will be used. Same way, if `end` is not a valid number in the
accepted range, then, by default, the length of the buffer is assumed.

Ref: nodejs#2668

PR-URL: nodejs#2919
@thefourtheye thefourtheye force-pushed the buffer-slow-to-string-range-fix branch from b2b9e5d to ac60740 Compare September 22, 2015 22:46
assert.equal(rangeBuffer.toString('ascii', 0, NaN), 'abc');
assert.equal(rangeBuffer.toString('ascii', 0, undefined), 'abc');
assert.equal(rangeBuffer.toString('ascii', 0), 'abc');
assert.equal(rangeBuffer.toString('ascii', 0, null), '');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if these four tests returning empty strings is okay though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should end be coerced to 0 instead of this.length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the end value is not valid, simply ignoring it and assuming the actual length would be logical, wouldn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (start >= this.length || end <= 0)

Let's try if (start >= this.length || end < 0).

@thefourtheye
Copy link
Contributor Author

@ChALkeR @trevnorris I updated the PR as per our discussion. PTAL.

* Note that, the comparison is done with abstract equality operator (!=) to
* allow stringified numbers ('1') and Number objects (new Number(5)).
*/
if (start < 0 || start >>> 0 != parseInt(+start))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you want to, but maybe also add to comments why both parseInt() and the + are necessary. For example:

parseInt('0o1101') === 0
parseInt(+'0o1101') === 577

but also that parseInt() is necessary to do int coercion at 2^53 to make sure there's no wrap around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, should be able to use !== here b/c start is being coerced to a number before comparison.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but may also want to note that this is technically an int32 check, not uint32, b/c the value will be coerced to a uint32 below with start >>>= 0. so we're saving ourselves an operation.

@trevnorris
Copy link
Contributor

two comments, and amazing job on the tests.

@@ -322,13 +322,31 @@ Object.defineProperty(Buffer.prototype, 'offset', {
function slowToString(encoding, start, end) {
var loweredCase = false;

start = start >>> 0;
end = end === undefined || end === Infinity ? this.length : end >>> 0;
if (start >= this.length || end <= 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe end < 0 would better here instead of end <= 0? In order to treat false as this.length, for example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so all falsey values will evaluate to default value? I'd be cool with that, except we also assert that true is coerced to 1. seems like a small inconsistency to evaluate the two types differently.

eh. guess it's instead falsey vs. truthy. okay, i'm cool with it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to revise my opinion. I think we should imitate similar behavior as v8 in other scenarios. Here's one using Typed Arrays:

let ab = new ArrayBuffer(8);
new Uint8Array(ab, 0, false).length === 0;
new Uint8Array(ab, 0, null).length === 0;
new Uint8Array(ab, 0, NaN).length === 0;
new Uint8Array(ab, 0, undefined).length === 8;

So end only receives the default length iff end === undefined.

There are cases where Typed Arrays throw where we can't, but we'll leave those aside.

@trevnorris
Copy link
Contributor

@thefourtheye want to move forward with this?

@trevnorris trevnorris self-assigned this Nov 12, 2015
@matthewloring
Copy link

@thefourtheye I can take this over and move it forward if you're ok with that.

@thefourtheye
Copy link
Contributor Author

@matthewloring please go ahead. Thanks for taking it up :)

trevnorris pushed a commit that referenced this pull request Dec 7, 2015
If `start` is not a valid number in the range, then the default value
zero will be used. Same way, if `end` is not a valid number in the
accepted range, then, by default, the length of the buffer is assumed.

Fixes: #2668
Ref: #2919
PR-URL: #4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
thefourtheye added a commit that referenced this pull request Dec 7, 2015
Verify that start and end are coerced properly.

Ref: #2919
PR-URL: #4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
rvagg pushed a commit that referenced this pull request Dec 8, 2015
If `start` is not a valid number in the range, then the default value
zero will be used. Same way, if `end` is not a valid number in the
accepted range, then, by default, the length of the buffer is assumed.

Fixes: #2668
Ref: #2919
PR-URL: #4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
thefourtheye added a commit that referenced this pull request Dec 8, 2015
Verify that start and end are coerced properly.

Ref: #2919
PR-URL: #4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@thefourtheye thefourtheye deleted the buffer-slow-to-string-range-fix branch January 28, 2016 07:49
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
If `start` is not a valid number in the range, then the default value
zero will be used. Same way, if `end` is not a valid number in the
accepted range, then, by default, the length of the buffer is assumed.

Fixes: nodejs#2668
Ref: nodejs#2919
PR-URL: nodejs#4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Verify that start and end are coerced properly.

Ref: nodejs#2919
PR-URL: nodejs#4019
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants