-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON: Add support for Int128, UInt128 and Half #88962
Conversation
… for Utf8JsonReader.CopyString(...)
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis |
private const int MaxFormatLength = 16; | ||
private const int MaxEscapedFormatLength = MaxFormatLength * JsonConstants.MaxExpansionFactorWhileEscaping; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely what's a good number here since I couldn't find a good way to determine what's the max amount of bytes Half.TryFormat
can consume.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably you mean Half.TryParse
? @tannergooding might know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you trying to do here in particular? The underlying parsing algorithm needs to be able to track up to 20 significant digits (for Half
, it's 113 for Single
, and 768 for Double
) to ensure a correct result.
However, the entire input string must always be passed in such that it can process non-significant digits (such as leading zeros) and all trailing digits so that it can determine if the rounding goes up or down.
Imagine for example if the user defines 000...005
or 0.500...1
, etc. All the zero digits that represented by the ...
must be processed to ensure the result is correct and to ensure the relevant end of string is located.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you trying to do here in particular?
MaxFormatLength and MaxEscapedFormatLength are meant to be upper limits on the amount of utf8 bytes that can be parsed as Half.
However, the entire input string must always be passed in such that it can process non-significant digits (such as leading zeros) and all trailing digits so that it can determine if the rounding goes up or down.
This is probably what we need to do, we should not do length constraints and instead try to parse the whole number. I think that the Utf8JsonReader does not limit the lenght of the number tokens either e.g:
var s = new string('1', 100_000);
byte[] encodedS = Encoding.UTF8.GetBytes(s);
var r = new Utf8JsonReader(encodedS);
Console.WriteLine(r.Read()); // prints True
Console.WriteLine(r.TokenType); // prints Number
We can use pooling for large buffers and regular byte arrays for even larger ones.
@tannergooding, given that Half.TryParse(ROS<byte>, out Half)
is not available on .NET 7 (and this code targets it), is BitConverter.ToHalf
a good substitute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I suspect BitConverter.ToHalf doesn't have the TryParse wiggle room.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not an equivalent API. BitConverter.ToHalf
simply does a reinterpret cast of raw bytes into a Half
.
You'd need to do something similar to the default interface implementation for IUtf8SpanParsable
done by INumberBase<T>
here: https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/Numerics/INumberBase.cs,555
Which is to say, you have to transcode the input string to UTF-16, then try to parse.
...raries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs
Outdated
Show resolved
Hide resolved
...raries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs
Outdated
Show resolved
Hide resolved
...ries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/Int128Converter.cs
Outdated
Show resolved
Hide resolved
#if NET8_0_OR_GREATER | ||
Span<byte> buffer = stackalloc byte[MaxFormatLength]; | ||
#else | ||
Span<char> buffer = stackalloc char[MaxFormatLength]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because UTF-8 parsing overloads don't exist in .NET 7 presumably? Perhaps a comment explaining that might help (since it's difficult to tell without intellisense).
src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.TryGet.cs
Outdated
Show resolved
Hide resolved
} | ||
finally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally don't use try/finally to guard against exceptions in code that uses rented buffers. Not returning a buffer because of an exception is not a big problem all things considered (the problems start if a buffer gets used after being returns or gets returned more than once). You should still move the returning logic above the if (!success)
statement above though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than a few pending issues that should be addressed this looks good to me. Great work David!
Fix handling of floating-point literals on HalfConverter Remove CopyString tests related to Number support
@eiriktsarpalis can you please take another look at the last commits, I found a couple of issues:
|
What configuration is being used when the corresponding values for
Is this an issue though? Citing Postel's law and whatnot perhaps there is a case to be made for tolerating case insensitive identifiers. I'd be surprised if the |
ArrayPool<byte>.Shared.Return(rentedByteBuffer); | ||
} | ||
|
||
if (rentedCharBuffer != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rentedCharBuffer
is only being used in non-net8.0 targets, so perhaps this transcode and parse logic could be moved inside the TryParse
helper so that it only gets used in the relevant targets.
...raries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs
Show resolved
Hide resolved
...raries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs
Show resolved
Hide resolved
#endif | ||
out Int128 result) | ||
{ | ||
return Int128.TryParse(buffer, CultureInfo.InvariantCulture, out result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the size of this method body seems too tiny to warrant extracting to a separate helper.
InvariantCulture as well, that's the default for Utf8Formatter.TryFormat: Line 19 in f7ad726
formatText is also equivalent to what we use for the new Converters.
|
I think it is currently an issue since it differs from the current strictness used on |
...raries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/HalfConverter.cs
Outdated
Show resolved
Hide resolved
One of our tests found an assertion failure on
|
...and add Number support forUtf8JsonReader.CopyString(...)
.EDIT: Instead of changing CopyString to accept numbers, we will just enable it on an internal helper since we could regret the decission.
Fixes #87994