Truncate BigInts when setting UintXArray or IntXArray instead of throwing #137
The decision to not do #106 wasn't centered on losing data after the decimal point. It was more about having a mental model which is based on keeping track of whether you are working with a Number or a BigInt. See the committee notes. What do you think of this argument?
That's not the issue. Consider the following code:

```js
const val = 0x10000000000000000000000000001n;
const buf = new Uint8Array(8);
buf[0] = val;
```

With #106, the BigInt is first converted to a Number, which rounds away the low bits, so `buf[0]` ends up `0` rather than `1` - data is lost even though there is no decimal point involved.
The only argument I found in the notes that seems to apply to my suggestion is the following:
I don't see what issues this can cause in practice. In fact, it's already the case with IntXArrays and UintXArrays, since they allow you to write a string, for example. I do agree with the following argument:
But it applies to #106 and is not an issue if modulo is applied to the numeric value directly, without converting to `Number`.
@waldemarhorwat said,
What do you say to this concern?
Me too. But my proposal doesn't involve any coercions (any more than Numbers are now "coerced" into uint8 when writing into Uint8Array).
Adding a set of BigIntXArrays and BigUintXArrays just for the type you get when reading from it doesn't seem worthwhile to me, considering you can convert it yourself by calling `BigInt()`.
We discussed this point at TC39, about there already being coercions and information loss and this not making it worse. However, the point here is about coercions between types. I don't think TC39 will permit coercions between types here, as it's a big mismatch with our general no-coercions design.
Agreed. I don't want to add these. For this reason, if we frame things according to @waldemarhorwat's choice, I'd rather take the no coercions route. I believe we have discussed the idea of adding coercions but not array types and it was rejected already, without resting on the decision of whether to round or throw after the decimal point.
This proposal doesn't add any new coercions between types either. TypedArrays don't store values as Numbers. When you write a Number to a Uint8Array and then read it, the conversion goes Number -> byte -> Number. With this proposal, if you write a BigInt and then read it, it goes BigInt -> byte -> Number. Where's the new coercion here?
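To make the round trips concrete, here is a minimal sketch (the BigInt line shows the behaviour under this proposal, not what the current spec does):

```js
const buf = new Uint8Array(1);

// Today: Number -> byte -> Number
buf[0] = 300;        // stored as ToUint8(300) = 44
console.log(buf[0]); // 44, a Number

// Under this proposal: BigInt -> byte -> Number
buf[0] = 300n;       // throws under the current spec; with the proposal it would store 300n mod 256n = 44
console.log(buf[0]); // 44, still read back as a Number
```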
I don't understand why not adding these leads to the no coercions route.
@waldemarhorwat mentioned rounding as a problem, see the last quote in #137 (comment).
What does this mean? The decimal point is not relevant here, see my example in #137 (comment).
I'd really prefer not to have to present a change to the committee unless there is a very strong motivation for the change, given the back and forth so far. It seems like you can get by without this change with an explicit cast. When working with BigInts and Numbers, it's necessary to understand which type you are working with, given how arithmetic operations do not permit mixed types. Given that, and the name Big in 64-bit arrays, will it ever be unintuitive to know whether you need to insert an explicit cast? A concrete example would help. The long series of 8-bit Sets at the beginning of this post doesn't seem so relevant to the question; that could be handled by making a BigInt64Array out of the underlying ArrayBuffer.
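Something along these lines (a sketch; the variable names are just illustrative):

```js
// View the same bytes as one 64-bit BigInt slot instead of eight 8-bit Number slots.
const u8 = new Uint8Array(8);
const b64 = new BigInt64Array(u8.buffer);
b64[0] = 0x1122334455667788n; // writes all 8 bytes at once, in platform byte order
```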
Consider the following scenario:
That's the problem: an explicit cast is an "obvious" but incorrect solution. You have to perform the modulo operation yourself before casting, which is less than intuitive.
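For example (hypothetical values; `buf` is a `Uint8Array`):

```js
const val = 0x10000000000000001n; // 2n ** 64n + 1n
const buf = new Uint8Array(8);

buf[0] = Number(val);        // "obvious" cast: Number() rounds away the low bit, stores 0
buf[0] = Number(val % 256n); // correct: modulo on the BigInt first, stores 1
```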
That's not an option for Node.js, or anyone who needs to write in non-platform byte order.
This scenario helps a lot. Thanks for explaining. Another way to look at this issue, rather than it being about TypedArrays, would be about whether we should throw exceptions when casting from BigInt to Number and losing information. The cast from BigInt to Number will silently round if the BigInt is too big. Maybe this is the problematic decision. By contrast, the cast from Number to BigInt will throw an exception if the Number has something which cannot be accurately represented. @jakobkummerow previously pointed this out. We decided on the rounding behavior for BigInt-to-Number in #14 based on @waldemarhorwat's guidance, and didn't really revisit it when we decided to throw some exceptions in #15. Was this an oversight?
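Concretely, the asymmetry between the two casts looks like this (as currently specified, as far as I understand):

```js
// BigInt -> Number: silently rounds when the value can't be represented exactly.
Number(2n ** 53n + 1n); // 9007199254740992 - the "+ 1" is lost

// Number -> BigInt: throws when the Number has a fractional part.
BigInt(1.5);            // RangeError
```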
I agree that casting from BigInt to Number is worth discussing, but it's a separate issue. If casting BigInt to Number throws when the BigInt is too big, then step 4 in my scenario will be an error. That's better than incorrect behavior, but less than ideal. If setting an index in a Uint8(16,32)Array to a BigInt value performs a modulo operation on the BigInt value, then developers won't have to wonder how to correctly convert BigInt to Number without rounding or errors.
Developers who do things that are mixed between BigInts and Numbers will have to use explicit cast operations. This is core to the design of BigInt. If you're using TypedArrays, you're doing an even more advanced thing than basic arithmetic, so I expect the need for explicit casts to be even less of a problem. The Number constructor applied to a string does rounding as well.
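For example, the same loss happens whether the source is a string or a BigInt:

```js
Number("9007199254740993"); // 9007199254740992 - rounded, just like Number(9007199254740993n)
```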
If we do that, then it will complicate the implementation of a 64-bit integer -> double conversion as found in many source languages. For example, in Java It's OK if the default
OK, at this point, I'm strongly leaning toward maintaining the current specification as is.
Writing a BigInt to a UintXArray or IntXArray is a "thing that is mixed between BigInts and Numbers" only because the current spec requires the value you write to be a Number. My proposal makes it a thing that is not mixed between BigInts and Numbers. Also, the explicit cast is not the problem - having to do the modulo operation yourself to avoid rounding errors is the problem.
I'm not proposing to change the behavior of the Number constructor, so I don't see how this is relevant. It seems you misunderstand my proposal. Perhaps I should draft up a diff for the spec to make things clearer?
Feel free to draft up a specification change. I think we've already discussed this in committee, though, and we're unlikely to make this change.
It's quite simple:

```diff
diff --git a/spec.html b/spec.html
index 69525af..8e2aebe 100644
--- a/spec.html
+++ b/spec.html
@@ -1665,7 +1665,7 @@ <h1>IntegerIndexedElementSet ( _O_, _index_, _value_ )</h1>
1. Let _arrayTypeName_ be the String value of _O_.[[TypedArrayName]].
1. Let _elementType_ be the String value of the Element Type value in <emu-xref href="#table-49"></emu-xref> for _arrayTypeName_.
1. <ins>If _arrayTypeName_ is `"BigUint64Array"` or `"BigInt64Array"`, let _numValue_ be ? ToBigInt(_value_).</ins>
- 1. <ins>Otherwise,</ins> let _numValue_ be ? ToNumber(_value_).
+ 1. <ins>Otherwise,</ins> let _numValue_ be ? ToNumeric(_value_).
1. Let _buffer_ be _O_.[[ViewedArrayBuffer]].
1. If IsDetachedBuffer(_buffer_) is *true*, throw a *TypeError* exception.
1. If IsInteger(_index_) is *false*, return *false*.
@@ -1688,7 +1688,7 @@ <h1>SetValueInBuffer ( _arrayBuffer_, _byteIndex_, _type_, _value_, _isTypedArra
1. Assert: IsDetachedBuffer(_arrayBuffer_) is *false*.
1. Assert: There are sufficient bytes in _arrayBuffer_ starting at _byteIndex_ to represent a value of _type_.
1. Assert: _byteIndex_ is an integer value ≥ 0.
- 1. Assert: Type(_value_) is <ins>BigInt if _type_ is `"BigInt64"` or `"BigUint64"`; otherwise, Type(_value_) is Number</ins>.
+ 1. Assert: Type(_value_) is <ins>BigInt if _type_ is `"BigInt64"` or `"BigUint64"`; otherwise, Type(_value_) is Number or BigInt</ins>.
1. Let _block_ be _arrayBuffer_.[[ArrayBufferData]].
1. Let _elementSize_ be the Number value of the Element Size value specified in <emu-xref href="#table-49"></emu-xref> for Element Type _type_.
1. If _isLittleEndian_ is not present, set _isLittleEndian_ to the value of the [[LittleEndian]] field of the surrounding agent's Agent Record.
```

Would you like me to create a PR with it?
I don't think this change would be enough to implement the behavior you've been describing; you'd really want to reland most of #106, with some small tweaks. Anyway, I was just saying feel free to write it since you suggested it; ultimately making the design decision is the hard part, and we'll be able to work out the spec text. I think the next step would be to find a TC39 delegate who is interested in championing your change. I don't plan to champion it, because I'm unconvinced it makes sense as a change, based on the reasoning I gave earlier in this thread.
You're right - the conversion operations for TypedArrays must not have a first step that calls `ToNumber`. What else is missing in your opinion?
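For illustration, a hypothetical JS equivalent of what a BigInt-aware `ToUint8` would do (the name is made up; the actual change would be to the spec's conversion operations):

```js
function toUint8FromBigInt(value) {
  // Modulo 2^8 on the BigInt itself, then convert the small result to a Number.
  const m = ((value % 256n) + 256n) % 256n; // also handles negative BigInts
  return Number(m);
}

toUint8FromBigInt(0x10000000000000000000000000001n); // 1
toUint8FromBigInt(-1n);                              // 255
```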
As I understood, the reasoning you have given so far refers to avoiding implicit casts to Number. But as you can see, there are no implicit casts to Number here. In fact, my proposal allows one to avoid a cast to Number, explicit or implicit, in the scenario I described in #137 (comment). I really don't understand your opposition.
Whether you are doing some kind of underlying cast or not depends on your perspective. One way of thinking about TypedArrays is in terms of the underlying ArrayBuffer; another way is in terms of containing Numbers (or now BigInts) in a certain range. It's not only about casts, but about overloading. In general, operations are made to work for BigInt or Number, but not both, except in particular cases where it is very justified. The idea is to reinforce the concept that programmers keep track of types and keep things generally monomorphic.
I'd prefer to solve problems like this by just supporting BigUint8Array (and BigInt typed arrays of the other integer widths) directly so you never need to convert to Number on reading or writing.
OK, let's consider adding BigInt8Array as a possible follow-on proposal, and stick with the minimal proposal for now with strict checks and without this type.
I provided a scenario where requiring Numbers on writing can lead to incorrect code; is that not sufficient justification?
What's the value of this idea in this particular case? Does it help prevent bugs? Does it make things simpler?
Why? Adding BigUint8Array won't help people who are already using Uint8Array, as I mentioned in #137 (comment).
See this section of the explainer for background.
I agree with that section, but I don't see how it applies here. This is not a "messy situation", and there is a good (and intuitive) answer to converting a BigInt to a [u]int{8|16|32}.
Rather than all this The only potential concern I see is that you'd have to specify the endianness of
@BlueRaja Interesting idea, but is it worth the complexity? I don't see significant problems with requiring an explicit cast here.
OK, I haven't heard any strong evidence for why we need this truncation. It's possible to write incorrect code with BigInts, but the Number constructor is an explicit, information-losing cast that can be pointed to and located as the source of the problem. Given that the proposal is already at Stage 3, and BigInt8Array is a compatible follow-on proposal, I'm closing this issue.
Currently, the following throws a TypeError:
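For example (a minimal sketch - any BigInt assignment hits this, because the element write goes through `ToNumber`, which throws for BigInts):

```js
const buf = new Uint8Array(1);
buf[0] = 1n; // TypeError under the current spec
```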
Simply casting to `Number` is not sufficient, since it can cause a loss of precision. The code for writing a 64-bit integer into a Node.js Buffer instance (which is also a `Uint8Array` instance) would look something like this:
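Something like the following (a sketch of the kind of code this requires; the value and length are just examples):

```js
// Write a 64-bit BigInt into a Buffer byte by byte (little-endian here),
// doing the modulo on the BigInt side before each Number() cast so no
// precision is lost.
let value = 0x89abcdef01234567n;
const buf = Buffer.alloc(8);
for (let i = 0; i < 8; i++) {
  buf[i] = Number(value & 0xffn);
  value >>= 8n;
}
```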
This is not only verbose, but also weird, because both `Number` and `BigInt` are supersets of `Uint8`.

This is clearly not a case of avoiding silent truncations, since all `TypedArray`s (including the new `BigInt64Array` and `BigUint64Array`) still allow them when setting values of the correct type. For example, the following silently truncates according to the spec:
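For instance (illustrative values):

```js
const u8 = new Uint8Array(1);
u8[0] = 257;             // silently truncated to 1

const b64 = new BigUint64Array(1);
b64[0] = 2n ** 64n + 1n; // silently truncated to 1n
```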
My informal proposal: when setting indexes of `Int8Array`, `Uint8Array`, `Int16Array`, `Uint16Array`, `Int32Array` or `Uint32Array` to `BigInt` values, it should do the same as with `Number` values - that is, set the index to the value modulo `2^8`, `2^16` or `2^32`, respectively.

Note: this is not the same as #106. That patch performed a conversion to `Number`, which can cause a loss of precision.