JSON: Add support for Int128, UInt128 and Half #88962

Merged
merged 9 commits into from
Jul 18, 2023

jozkee
Member

@jozkee jozkee commented Jul 16, 2023

...and add Number support for Utf8JsonReader.CopyString(...).
EDIT: Instead of changing CopyString to accept numbers, we will just enable it on an internal helper, since we could regret the decision.

Fixes #87994

@dotnet-issue-labeler

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost

ghost commented Jul 16, 2023

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

...and add Number support for Utf8JsonReader.CopyString(...).

Fixes #87994
Fixes #84375

Author: Jozkee
Assignees: Jozkee
Labels:

area-System.Text.Json, new-api-needs-documentation

Milestone: 8.0.0

Comment on lines 10 to 11
private const int MaxFormatLength = 16;
private const int MaxEscapedFormatLength = MaxFormatLength * JsonConstants.MaxExpansionFactorWhileEscaping;
Member Author

@jozkee jozkee Jul 16, 2023

I'm not entirely sure what a good number is here, since I couldn't find a good way to determine the maximum number of bytes Half.TryFormat can consume.

Member

Presumably you mean Half.TryParse? @tannergooding might know

Member

What are you trying to do here in particular? The underlying parsing algorithm needs to be able to track up to 20 significant digits for Half (113 for Single and 768 for Double) to ensure a correct result.

However, the entire input string must always be passed in such that it can process non-significant digits (such as leading zeros) and all trailing digits so that it can determine if the rounding goes up or down.

Imagine, for example, that the user passes 000...005 or 0.500...1, etc. All the zero digits represented by the ... must be processed to ensure the result is correct and to ensure the relevant end of string is located.
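
To illustrate the point about trailing digits, here is a minimal sketch of what a fixed length cap does to such an input. `TruncationDemo`, `Cap`, and `Compare` are hypothetical names, the cap of 16 simply mirrors the `MaxFormatLength` constant under review, and `decimal` is used only because it can represent both inputs exactly; this is not the converter's code.

```csharp
using System;
using System.Globalization;

public static class TruncationDemo
{
    // Hypothetical cap, mirroring the MaxFormatLength = 16 constant under review.
    public const int Cap = 16;

    // Parses the full text and a copy cut off at Cap characters, so the two
    // values can be compared. decimal is used because it holds both exactly.
    public static (decimal Full, decimal Truncated) Compare(string text)
    {
        string truncated = text.Length > Cap ? text.Substring(0, Cap) : text;
        decimal full = decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
        decimal cut = decimal.Parse(truncated, NumberStyles.Float, CultureInfo.InvariantCulture);
        return (full, cut);
    }
}
```

For an input such as 0.5000000000000000000001, the truncated copy parses as exactly 0.5 while the full text is strictly greater, so a parser that only saw the first 16 characters would have lost the digit that decides the rounding direction.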

Member Author

> What are you trying to do here in particular?

MaxFormatLength and MaxEscapedFormatLength are meant to be upper limits on the amount of utf8 bytes that can be parsed as Half.

> However, the entire input string must always be passed in such that it can process non-significant digits (such as leading zeros) and all trailing digits so that it can determine if the rounding goes up or down.

This is probably what we need to do: we should not impose length constraints and should instead try to parse the whole number. I think that the Utf8JsonReader does not limit the length of number tokens either, e.g.:

var s = new string('1', 100_000);
byte[] encodedS = Encoding.UTF8.GetBytes(s);
var r = new Utf8JsonReader(encodedS);
Console.WriteLine(r.Read()); // prints True
Console.WriteLine(r.TokenType); // prints Number

We can use pooling for large buffers and regular byte arrays for even larger ones.

@tannergooding, given that Half.TryParse(ROS<byte>, out Half) is not available on .NET 7 (and this code targets it), is BitConverter.ToHalf a good substitute?

Member Author

Actually I suspect BitConverter.ToHalf doesn't have the TryParse wiggle room.

Member

That's not an equivalent API. BitConverter.ToHalf simply does a reinterpret cast of raw bytes into a Half.

You'd need to do something similar to the default interface implementation for IUtf8SpanParsable done by INumberBase<T> here: https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/Numerics/INumberBase.cs,555

Which is to say, you have to transcode the input string to UTF-16, then try to parse.
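
A minimal sketch of the transcode-then-parse approach described above, assuming a target where only the UTF-16 `Half.TryParse` overload is available. `HalfUtf8Parsing` and `TryParseHalf` are hypothetical names, and the plain array allocation is a simplification; the actual converter may pool or stack-allocate its buffer.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class HalfUtf8Parsing
{
    // Hypothetical helper: parses a Half from UTF-8 bytes by first
    // transcoding them to UTF-16, then calling the char-based overload.
    public static bool TryParseHalf(ReadOnlySpan<byte> utf8, out Half result)
    {
        // UTF-8 never decodes to more UTF-16 code units than it has bytes,
        // so a buffer of utf8.Length chars is always large enough.
        char[] chars = new char[utf8.Length];
        int written = Encoding.UTF8.GetChars(utf8, chars);

        ReadOnlySpan<char> text = chars.AsSpan(0, written);
        return Half.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out result);
    }
}
```

Note that the whole input is transcoded and handed to the parser, in line with the earlier point that the parser must see every digit.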

#if NET8_0_OR_GREATER
Span<byte> buffer = stackalloc byte[MaxFormatLength];
#else
Span<char> buffer = stackalloc char[MaxFormatLength];
Member

@eiriktsarpalis eiriktsarpalis Jul 16, 2023

Presumably this is because the UTF-8 parsing overloads don't exist in .NET 7? A comment explaining that might help (since it's difficult to tell without IntelliSense).

}
finally
Member

We generally don't use try/finally to guard against exceptions in code that uses rented buffers. Not returning a buffer because of an exception is not a big problem, all things considered (the problems start if a buffer gets used after being returned or gets returned more than once). You should still move the returning logic above the if (!success) statement, though.
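
The suggested shape could look roughly like the sketch below: rent, use, return exactly once, and only then branch on the result. `RentedBufferParsing` and `TryParseInt64` are hypothetical names, and `long` stands in for the types in this PR purely to keep the sketch short.

```csharp
using System;
using System.Buffers;
using System.Globalization;
using System.Text;

public static class RentedBufferParsing
{
    // Hypothetical helper showing the buffer lifetime only; no try/finally.
    public static bool TryParseInt64(ReadOnlySpan<byte> utf8, out long value)
    {
        char[] rented = ArrayPool<char>.Shared.Rent(utf8.Length);
        int written = Encoding.UTF8.GetChars(utf8, rented);

        ReadOnlySpan<char> text = rented.AsSpan(0, written);
        bool success = long.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);

        // Return the buffer once, before looking at the result, so that
        // no code path can touch it afterwards or return it a second time.
        ArrayPool<char>.Shared.Return(rented);

        if (!success)
        {
            return false;
        }

        return true;
    }
}
```

If an exception escapes before the `Return` call, the array is simply left to the garbage collector, which matches the trade-off described in the comment.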

Member

@eiriktsarpalis eiriktsarpalis left a comment

Other than a few pending issues that should be addressed this looks good to me. Great work David!

Fix handling of floating-point literals on HalfConverter
Remove CopyString tests related to Number support
@jozkee jozkee changed the title JSON: Add support for Int128, UInt128 and Half… JSON: Add support for Int128, UInt128 and Half Jul 18, 2023
@jozkee
Member Author

jozkee commented Jul 18, 2023

@eiriktsarpalis can you please take another look at the last commits? I found a couple of issues:

  • For formatting infinities, we were writing them as ∞ and -∞; specifying CultureInfo.InvariantCulture fixed it.
  • For parsing NaN and infinities, Half.TryParse was more lax than the current S.T.Json policy and accepted them with any casing, e.g. InFiNiTy could be parsed correctly. I fixed it by comparing against the exact bytes we want with SequenceEqual whenever the TryParse method returned NaN, PositiveInfinity or NegativeInfinity.
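
The second fix can be sketched as a post-parse check along these lines. `StrictFloatLiterals` and `IsAllowedLiteral` are hypothetical names; the three spellings are the named floating-point literals System.Text.Json documents ("NaN", "Infinity", "-Infinity"), and the real converter may structure the check differently.

```csharp
using System;
using System.Text;

public static class StrictFloatLiterals
{
    private static readonly byte[] s_nan = Encoding.UTF8.GetBytes("NaN");
    private static readonly byte[] s_positiveInfinity = Encoding.UTF8.GetBytes("Infinity");
    private static readonly byte[] s_negativeInfinity = Encoding.UTF8.GetBytes("-Infinity");

    // Hypothetical check: given the original UTF-8 text and the value a lax
    // parser produced from it, accept non-finite results only when the text
    // is byte-for-byte one of the exact literals.
    public static bool IsAllowedLiteral(ReadOnlySpan<byte> utf8, Half parsed)
    {
        if (Half.IsNaN(parsed))
        {
            return utf8.SequenceEqual(new ReadOnlySpan<byte>(s_nan));
        }

        if (Half.IsPositiveInfinity(parsed))
        {
            return utf8.SequenceEqual(new ReadOnlySpan<byte>(s_positiveInfinity));
        }

        if (Half.IsNegativeInfinity(parsed))
        {
            return utf8.SequenceEqual(new ReadOnlySpan<byte>(s_negativeInfinity));
        }

        // Finite values are not affected by the casing policy.
        return true;
    }
}
```

With this check, an input spelled InFiNiTy that the parser turned into PositiveInfinity would be rejected, while Infinity would pass.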

@eiriktsarpalis
Member

eiriktsarpalis commented Jul 18, 2023

> For formatting infinities, we were writing them as ∞ and -∞; specifying CultureInfo.InvariantCulture fixed it.

What configuration is being used when the corresponding values for double and float are being used?

> For parsing NaN and infinities, Half.TryParse was more lax than the current S.T.Json policy and accepted them with any casing, e.g. InFiNiTy could be parsed correctly. I fixed it by comparing against the exact bytes we want with SequenceEqual whenever the TryParse method returned NaN, PositiveInfinity or NegativeInfinity.

Is this an issue though? Citing Postel's law and whatnot, perhaps there is a case to be made for tolerating case-insensitive identifiers. I'd be surprised if the Half.TryParse behavior is unintentional, so perhaps it's there to address valid representations? Perhaps it's an indication that we should make the double and float parsing logic case-insensitive as well. cc @tannergooding

ArrayPool<byte>.Shared.Return(rentedByteBuffer);
}

if (rentedCharBuffer != null)
Member

rentedCharBuffer is only being used in non-net8.0 targets, so perhaps this transcode and parse logic could be moved inside the TryParse helper so that it only gets used in the relevant targets.

#endif
out Int128 result)
{
return Int128.TryParse(buffer, CultureInfo.InvariantCulture, out result);
Member

Nit: the size of this method body seems too tiny to warrant extracting to a separate helper.

@jozkee
Member Author

jozkee commented Jul 18, 2023

> What configuration is being used when the corresponding values for double and float are being used?

InvariantCulture as well, that's the default for Utf8Formatter.TryFormat:

return value.TryFormat(utf8Destination, out bytesWritten, formatText, CultureInfo.InvariantCulture);

formatText is also equivalent to what we use for the new Converters.
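
As a small illustration of why the culture matters when formatting, the sketch below formats a Half with the invariant culture, whose infinity and NaN symbols are the spelled-out words. `HalfFormatting` and `FormatInvariant` are hypothetical names, and the string-based `ToString` is used for brevity rather than the span-based `TryFormat` the converter relies on.

```csharp
using System;
using System.Globalization;

public static class HalfFormatting
{
    // Hypothetical helper: formats with the invariant culture so the
    // output does not depend on the machine's current culture settings.
    public static string FormatInvariant(Half value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```

With the invariant culture, positive and negative infinity come out as Infinity and -Infinity, whereas formatting under the current culture may yield the ∞ symbol described earlier in the thread.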

@jozkee
Member Author

jozkee commented Jul 18, 2023

> Is this an issue though?

I think it is currently an issue since it differs from the strictness currently used for double and float. For consistency, I suggest we follow the current convention; we should, however, evaluate changing it in a separate issue/thread.

@jozkee
Member Author

jozkee commented Jul 18, 2023

One of our tests hit an assertion failure in Half.TryParse that has reproduced consistently, but only on OSX.

https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-88962-merge-9757e201f6374e20b5/System.Text.Json.Tests/1/console.78feb75a.log?helixlogtype=result

/private/tmp/helix/working/9EC208C7/w/A9F108F7/e /private/tmp/helix/working/9EC208C7/w/A9F108F7/e
  Discovering: System.Text.Json.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Text.Json.Tests (found 7160 of 7231 test cases)
  Starting:    System.Text.Json.Tests (parallel test collections = on, max threads = 6)
Process terminated. Assertion failed.
   at System.Globalization.Ordinal.EqualsIgnoreCaseUtf8_Scalar(Byte& charA, Int32 lengthA, Byte& charB, Int32 lengthB) in /_/src/libraries/System.Private.CoreLib/src/System/Globalization/Ordinal.Utf8.cs:line 308
   at System.Number.TryParseFloat[TChar,TFloat](ReadOnlySpan`1 value, NumberStyles styles, NumberFormatInfo info, TFloat& result) in /_/src/libraries/System.Private.CoreLib/src/System/Number.Parsing.cs:line 1229
   at System.Half.TryParse(ReadOnlySpan`1 utf8Text, NumberStyles style, IFormatProvider provider, Half& result) in /_/src/libraries/System.Private.CoreLib/src/System/Half.cs:line 2238
   at System.Text.Json.Serialization.Converters.HalfConverter.TryParse(ReadOnlySpan`1 buffer, Half& result) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs:line 197
   at System.Text.Json.Serialization.Converters.HalfConverter.ReadCore(Utf8JsonReader& reader) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs:line 50
   at System.Text.Json.Serialization.Converters.HalfConverter.ReadNumberWithCustomHandling(Utf8JsonReader& reader, JsonNumberHandling handling, JsonSerializerOptions options) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs:line 115
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonConverterOfT.cs:line 193
   at System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1.ReadJsonAndSetMember(Object obj, ReadStack& state, Utf8JsonReader& reader) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Metadata/JsonPropertyInfoOfT.cs:line 308
   at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Object/ObjectDefaultConverter.cs:line 49
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonConverterOfT.cs:line 258
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonConverterOfT.ReadCore.cs:line 51
   at System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.Deserialize(Utf8JsonReader& reader, ReadStack& state) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Metadata/JsonTypeInfoOfT.ReadHelper.cs:line 22
   at System.Text.Json.JsonSerializer.ReadFromSpan[TValue](ReadOnlySpan`1 utf8Json, JsonTypeInfo`1 jsonTypeInfo, Nullable`1 actualByteCount) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.Span.cs:line 160
   at System.Text.Json.JsonSerializer.ReadFromSpan[TValue](ReadOnlySpan`1 json, JsonTypeInfo`1 jsonTypeInfo) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs:line 443
   at System.Text.Json.JsonSerializer.Deserialize[TValue](String json, JsonSerializerOptions options) in /_/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs:line 55
   at System.Text.Json.Serialization.Tests.JsonSerializerWrapper.StringSerializerWrapper.DeserializeWrapper[T](String json, JsonSerializerOptions options) in /_/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/JsonSerializerWrapper.Reflection.cs:line 129
   at System.Text.Json.Serialization.Tests.NumberHandlingTests.<>c__DisplayClass39_0.<<FloatingPointConstants_Fail>b__1>d.MoveNext() in /_/src/libraries/System.Text.Json/tests/Common/NumberHandlingTests.cs:line 883
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine) in /_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs:line 38
   at System.Text.Json.Serialization.Tests.NumberHandlingTests.<>c__DisplayClass39_0.<FloatingPointConstants_Fail>b__1()

@jozkee jozkee merged commit e2c04e0 into dotnet:main Jul 18, 2023
@jozkee jozkee deleted the json-new-numbers-support branch July 18, 2023 19:35
@ghost ghost locked as resolved and limited conversation to collaborators Aug 18, 2023
Development

Successfully merging this pull request may close these issues.

[API Proposal]: Add System.Text.Json built-in support for more numeric types
4 participants