Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Make numeric values and operations default to mathematical values #2007

Merged
merged 1 commit into from
Oct 13, 2020

Conversation

caiolima
Copy link
Contributor

As reported on #1964, the fact that we are using Number as default representation of numeric literals and operations creates uncounted bugs throughout the spec. This PR is proposing to rollback the default value to mathematical value. Since the major intention of PR #1135 was to avoid implicit conversions between numeric versions, we are keeping this change but it necessary to properly cast those numeric types when they are manipulated, since mixing different numeric types is not allowed.

The goals we want to achieve here are:

  1. Resolve bugs on current version of spec like Definition of modulo is broken #1960 or Typo in 6.2.4 #1229.
  2. Mathematical values should never be leaked and are internal concept of ECMA262 specification. If this happens, it should be considered as editorial error.
  3. Avoid the maximum of explicit conversions between Number, BigInt and Mathematical values.

Cc. @littledan @waldemarhorwat @ljharb @syg @michaelficarra @bakkot

PS. It's still a WIP PR, but I think we have enough changes to discuss some changes made before I keep moving forward.

@caiolima caiolima force-pushed the mathematical-values-fix branch from f585605 to a14ccc2 Compare May 22, 2020 17:38
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
Copy link
Contributor Author

@caiolima caiolima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some early comments on places I need to change already.

spec.html Outdated Show resolved Hide resolved
spec.html Show resolved Hide resolved
spec.html Show resolved Hide resolved
spec.html Outdated
1. Let _n_ be 0.
1. For each element _e_ of _elements_, do
1. Perform ! CreateDataPropertyOrThrow(_array_, ! ToString(_n_), _e_).
1. Perform ! CreateDataPropertyOrThrow(_array_, ! ToString(𝔽(_n_)), _e_).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be better to use:

1. Let _nNumber_ be the Number value for _n)_.
1. Perform ! CreateDataPropertyOrThrow(_array_, ! ToString(_nNumber_), _e_).

This pattern is already used in other places of this PR, so it would be good to be consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the more terse form. Is there a benefit to splitting it out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this case there is no much benefit, but there are other places where we keep applying 𝔽(_n_) very often. The advantage of splitting them out is that it is easier for someone not involved with ECMA262 understand, while 𝔽(_n_) requires understanding what 𝔽 means. I'm fine with leaving it as it is. What do you think of other places where we are using splitted version?

spec.html Outdated Show resolved Hide resolved
spec.html Outdated
1. Else,
1. Let _e_, _n_, and _f_ be integers such that _f_ &ge; 0, 10<sup>_f_</sup> &le; _n_ &lt; 10<sup>_f_ + 1</sup>, the Number value for ℝ(_n_) &times; 10<sub>ℝ</sub><sup>ℝ(_e_) - ℝ(_f_)</sup> is _x_, and _f_ is as small as possible. Note that the decimal representation of _n_ has _f_ + 1<sub>ℝ</sub> digits, _n_ is not divisible by 10, and the least significant digit of _n_ is not necessarily uniquely determined by these criteria.
1. Let _e_, _n_ be integers, and _f_ be a Number whose mathematical value is an integer such that ℝ(_f_) &ge; 0, 10<sup>ℝ(_f_)</sup> &le; _n_ &lt; 10<sup>ℝ(_f_) + 1</sup>, the Number value for _n_ &times; 10<sup>_e_ - ℝ(_f_)</sup> is ℝ(_x_), and _f_ is as small as possible. Note that the decimal representation of _n_ has ℝ(_f_) + 1 digits, _n_ is not divisible by 10, and the least significant digit of _n_ is not necessarily uniquely determined by these criteria.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good example of place where we could create a new variable to store MV of x and f. We have multiple calls of ℝ(x) and ℝ(f).

@ljharb
Copy link
Member

ljharb commented May 22, 2020

Marking this as a draft; when it's no longer WIP, please mark it as ready for review :-)

@ljharb ljharb marked this pull request as draft May 22, 2020 18:23
@ljharb ljharb changed the title WIP - Make numeric values and operations default to mathematical values Editorial: Make numeric values and operations default to mathematical values May 22, 2020
@ljharb ljharb added editor call to be discussed in the next editor call editorial change labels May 22, 2020
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@tc39 tc39 deleted a comment from ljharb May 22, 2020
@michaelficarra
Copy link
Member

@ljharb I don't know what comment of yours GitHub is saying I deleted, but if I did, it was a mistake.

spec.html Outdated Show resolved Hide resolved
@bakkot
Copy link
Contributor

bakkot commented May 26, 2020

I think we generally write Number values with *, as in *1*, and write mathematical values without it, as in 1. (And BigInts as *1n*.)

If we keep to that convention, I don't know that the subscripts are necessary.

@ljharb
Copy link
Member

ljharb commented May 26, 2020

That's true, but explicit notation seems more readable to me than a style-based convention.

Copy link

@waldemarhorwat waldemarhorwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this!

My main observations:

  • I like the use of ℝ(x) as a coercion operator to mathematical values. I don't think we should use ℝ as a subscript to indicate integers or real numbers; instead, those should have no suffix.

  • Having generic operations other than the simplest such as abs() across Numbers and real numbers doesn't work because of various Number glitches (±0, ±∞, NaN, rounding, etc.). We should define those only on real numbers and call utility functions if they're needed on Numbers.

  • There are more places in the algorithms where keeping integral Number counters causes problems if they ever exceed 253. I commented on one example, but there are more and I haven't done an exhaustive pass to find them all. I consider uses of integral Numbers for spec-internal counters suspicious unless proven otherwise.

spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@@ -4892,9 +4893,9 @@ <h1>ToInt32 ( _argument_ )</h1>
<emu-alg>
1. Let _number_ be ? ToNumber(_argument_).
1. If _number_ is *NaN*, *+0*, *-0*, *+&infin;*, or *-&infin;*, return *+0*.
1. Let _int_ be the Number value that is the same sign as _number_ and whose magnitude is floor(abs(_number_)).
1. Let _int_ be the mathematical value that is the same sign as _number_ and whose magnitude is floor(abs(_number_)).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to ask for a mathematical value whose magnitude is a Number. Make where the conversion happens a bit more explicit.

spec.html Outdated
@@ -8940,11 +8941,11 @@ <h1>[[DefineOwnProperty]] ( _P_, _Desc_ )</h1>

<emu-clause id="sec-arraycreate" aoid="ArrayCreate">
<h1>ArrayCreate ( _length_ [ , _proto_ ] )</h1>
<p>The abstract operation ArrayCreate takes argument _length_ (a non-negative integer) and optional argument _proto_. It is used to specify the creation of new Array exotic objects. It performs the following steps when called:</p>
<p>The abstract operation ArrayCreate takes argument _length_ (a non-negative integral Number) and optional argument _proto_. It is used to specify the creation of new Array exotic objects. It performs the following steps when called:</p>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "non-negative integral Number"? With real numbers it's clear: 0 is non-negative. But is the Number -0 negative or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to define it, as we are doing for integral Number. Would it be a problem if we define that non-negative integral Number is integral Numbers greater than +0, considering that +0 is greater than -0?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caiolima +0 is not greater than -0. I would imagine the term "non-negative" when referring to integral Numbers simply refers to whether the sign bit is set. So this would exclude -0. If we want to include -0, I would just add "or -0". But we could use mathematical values here instead, right?

@@ -28583,7 +28588,7 @@ <h1>TimeClip ( _time_ )</h1>
<p>The abstract operation TimeClip takes argument _time_ (a Number). It calculates a number of milliseconds. It performs the following steps when called:</p>
<emu-alg>
1. If _time_ is not finite, return *NaN*.
1. If abs(_time_) &gt; 8.64 &times; 10<sup>15</sup>, return *NaN*.
1. If abs(_time_) &gt; 8.64<sub>𝔽</sub> &times; 10<sub>𝔽</sub><sup>15<sub>𝔽</sub></sup>, return *NaN*.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exponentiation is not defined on Numbers. And it's weird to invoke it here using a Number exponent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Math.pow/** do on Numbers?

spec.html Outdated
1. Find a value _t_ such that YearFromTime(_t_) is _ym_ and MonthFromTime(_t_) is _mn_ and DateFromTime(_t_) is 1; but if this is not possible (because some argument is out of range), return *NaN*.
1. Return Day(_t_) + _dt_ - 1.
1. Return Day(_t_) + _dt_ - 1<sub>𝔽</sub>.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the order of operations here? It matters because Number arithmetic is not associative.

spec.html Outdated
@@ -30529,8 +30536,8 @@ <h1>String.prototype.substring ( _start_, _end_ )</h1>
1. Let _len_ be the length of _S_.
1. Let _intStart_ be ? ToInteger(_start_).
1. If _end_ is *undefined*, let _intEnd_ be _len_; else let _intEnd_ be ? ToInteger(_end_).
1. Let _finalStart_ be min(max(_intStart_, 0), _len_).
1. Let _finalEnd_ be min(max(_intEnd_, 0), _len_).
1. Let _finalStart_ be min(max(ℝ(_intStart_), 0), _len_).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this notation for "mathematical value".

@caiolima
Copy link
Contributor Author

caiolima commented May 27, 2020

Thank you very much for the reviews so far. I'm going to keep addressing them in the coming days.

  • I like the use of ℝ(x) as a coercion operator to mathematical values. I don't think we should use ℝ as a subscript to indicate integers or real numbers; instead, those should have no suffix.

That's the intention, but IIUC from your comment, you would like to don't define subscript ℝ? From the changes I've done so far, I agree that they won't be necessary.

  • Having generic operations other than the simplest such as abs() across Numbers and real numbers doesn't work because of various Number glitches (±0, ±∞, NaN, rounding, etc.). We should define those only on real numbers and call utility functions if they're needed on Numbers.

I think I see your point here. So far I've been using Number operations like min or max considering semantics of Math.min and Math.max, but it is not quite correct, since we don't have special cases for (±0, ±∞, NaN) on those generic operations AFICT. I'll only consider those operations valid on mathematical values and add a note that if we have Numbers or BigInts as input on those operations, it is an Editorial error.

  • There are more places in the algorithms where keeping integral Number counters causes problems if they ever exceed 253. I commented on one example, but there are more and I haven't done an exhaustive pass to find them all. I consider uses of integral Numbers for spec-internal counters suspicious unless proven otherwise.

Those are indeed bugs and thank you for bringing this. I changed a bunch of them, but I'm quite sure that there are leftovers like the one you found.

@michaelficarra
Copy link
Member

Some updates from today's editor call:

  • Number literals should use the subscript 𝔽 as in *123*<sub>𝔽</sub> to distinguish them from mathematical values, written as 123, and BigInt values, written as *123*<sub>ℤ</sub>.
  • No subscript 𝔽 for *NaN*, *+Infinity*, or *-Infinity*.
  • *+0*<sub>𝔽</sub> and *-0*<sub>𝔽</sub> should always be written with the sign indicated and with the subscript 𝔽. Never use *±0*<sub>𝔽</sub>.
  • Never include subscripts on operators.
  • Do not use X is zero. Use something like X is *+0*<sub>𝔽</sub> or *-0*<sub>𝔽</sub> for Numbers. Use something like X is 0 for mathematical values.
  • Never negate the mathematical value literal 0, as in -0 or ±0.

@bakkot
Copy link
Contributor

bakkot commented May 27, 2020

you would like to don't define subscript ℝ? From the changes I've done so far, I agree that they won't be necessary.

Editor group agrees with this. A string of digits without a subscript will always be a mathematical value. (They would also be written normally, not in bold.)

@waldemarhorwat
Copy link

I like the conclusions from the last two comments.

In some contexts it may be attractive to abbreviate "+Infinity or -Infinity" as "±Infinity" and likewise "+0𝔽 or -0𝔽" as "±0𝔽". I wouldn't mind such ± abbreviations — they can be explicitly defined somewhere in the spec but should be fairly obvious to even casual readers. I'm also OK with always writing them out the long way as well if that's what we chose.

@ljharb ljharb merged commit 5c7ef70 into tc39:master Oct 13, 2020
Ms2ger added a commit to tc39/proposal-temporal that referenced this pull request Oct 14, 2020
See tc39/ecma262#2007. More work will be needed, but
this already fixes the warnings.
Ms2ger added a commit to tc39/proposal-temporal that referenced this pull request Oct 14, 2020
See tc39/ecma262#2007. More work will be needed, but
this already fixes the warnings.
rkirsling added a commit to rkirsling/ecma262 that referenced this pull request Oct 15, 2020
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Dec 10, 2020
The initial value of a 'length' property should be a Number,
so use the appropriate notation, after PR tc39#2007.
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Dec 10, 2020
The initial value of a 'length' property should be a Number,
so use the appropriate notation, after PR tc39#2007.
ljharb pushed a commit to jmdyck/ecma262 that referenced this pull request Dec 16, 2020
…9#2248)

The initial value of a 'length' property should be a Number,
so use the appropriate notation, after PR tc39#2007.
Ms2ger added a commit to tc39/proposal-temporal that referenced this pull request Jan 15, 2021
ptomato pushed a commit to tc39/proposal-temporal that referenced this pull request Jan 15, 2021
anba added a commit to anba/ecma262 that referenced this pull request Aug 4, 2022
….substr

`"a".substr(0, Infinity)` should return `"a"`, but tc39#2007 incorrectly changed the result to be `""`.

If we change `substr`'s algorithm to be more similar to
`String.prototype.substring` and `String.prototype.slice`, we can easily
fix this issue:

1. `min(intStart, size)` guarantees that `intStart` is now in `[0, size]`.
2. Clamping `intLength` guarantees it's also in `[0, size]`.

With these two changes, `intEnd = min(intStart + intLength, size)` is
now in range `[intStart, size]`, so we no longer have to check for
`intStart ≥ intEnd`, but instead can directly perform the `substring`
operation.
ljharb pushed a commit to anba/ecma262 that referenced this pull request Sep 21, 2022
….substr (tc39#2844)

`"a".substr(0, Infinity)` should return `"a"`, but tc39#2007 incorrectly changed the result to be `""`.

If we change `substr`'s algorithm to be more similar to
`String.prototype.substring` and `String.prototype.slice`, we can easily
fix this issue:

1. `min(intStart, size)` guarantees that `intStart` is now in `[0, size]`.
2. Clamping `intLength` guarantees it's also in `[0, size]`.

With these two changes, `intEnd = min(intStart + intLength, size)` is
now in range `[intStart, size]`, so we no longer have to check for
`intStart ≥ intEnd`, but instead can directly perform the `substring`
operation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants