Fix BigInteger.Parse returns bogus value for exponents above 1000 #55397

Closed
@@ -517,15 +517,19 @@ private static unsafe bool ParseNumber(ref char* str, char* strEnd, NumberStyles
{
exp = exp * 10 + (ch - '0');
ch = ++p < strEnd ? *p : '\0';
if (exp > 1000)
Member

Sorry for the delay in review here.

This isn't quite the right fix and likely introduces a different bug. For example, if you have an exponent of 21474836492, it will successfully parse the first 9 digits, giving 214748364. Multiplying by 10 gives 2147483640, and adding 9 overflows to -2147483647. Multiplying that by 10 wraps around to exactly 10, and adding 2 gives 12.

Both of these last two digits are problematic because of this, so we need something that accounts for overflow here and surfaces it as invalid input/failure to parse.
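
The wraparound described above can be reproduced in isolation. This is a minimal sketch, not the runtime's ParseNumber: the class and method names (ExponentOverflowDemo, AccumulateUnchecked) are made up for illustration, and the only thing it mirrors from the diff is the unguarded `exp = exp * 10 + (ch - '0')` step.

```csharp
using System;

public static class ExponentOverflowDemo
{
    // Hypothetical helper: accumulates decimal digits into an int with no
    // overflow guard, the same way the unguarded loop body does.
    // Assumes the input contains only the characters '0'..'9'.
    public static int AccumulateUnchecked(string digits)
    {
        int exp = 0;
        foreach (char ch in digits)
        {
            // unchecked makes the two's-complement wraparound explicit,
            // regardless of the project's overflow-checking setting.
            exp = unchecked(exp * 10 + (ch - '0'));
        }
        return exp;
    }
}
```

Under these assumptions, AccumulateUnchecked("2147483649") yields -2147483647 and AccumulateUnchecked("21474836492") yields 12, so a clearly out-of-range exponent silently turns into a small valid-looking one.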

Contributor Author

@tannergooding, I agree with you in this case. I have an idea for handling these cases.

Contributor Author

@tannergooding, I just revisited my summer investigation of the BigInteger algorithms, and I recall that there is a BigNumber class with a BigNumberBuffer inside it, and that this buffer stores scale as an int.
BigNumber provides some functionality to BigInteger, and if int with its MaxValue is not an acceptable bound, then the .NET team should create a separate task to change it.
But first, let's do a quick calculation:
itemSize * arraySize = 32 bits * 2147483647 = 68719476704 bits = 8589934588 bytes ≈ 8388608 KB = 8192 MB = 8 GB
So, in theory, the current architecture allows up to 8 GB to store a single BigInteger. I think this is more than enough.

Member

@tannergooding Nov 5, 2021

I think we can simply make this be:

int exp = 0;
do
{
    exp = exp * 10 + (ch - '0');
    ch = ++p < strEnd ? *p : '\0';
} while ((ch >= '0') && (ch <= '9') && (exp < (int.MaxValue / 10))); // exp < 214748364

if ((ch >= '0') && (ch <= '9'))
{
    // We still had remaining characters but bailed early because
    // the exponent was going to overflow. If exp is exactly 214748364
    // then we can technically handle one more character being 0-7
    // but the additional complexity might not be worthwhile.

    Debug.Assert(exp >= 214748364);
    return false;
}

Member

@tannergooding Nov 5, 2021

The point is to account for overflow and ensure it's handled, so we can throw for Parse and return false for TryParse.

This trivially handles it by accounting for both cases of overflow:

  1. due to exp * 10 which can only overflow for exp > (int.MaxValue / 10)
  2. due to the subsequent add which can only overflow for exp == (int.MaxValue / 10) and ch > '7'
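
The two overflow cases listed above can also be checked digit by digit, which additionally accepts the exact 0-7 boundary that the suggested loop deliberately rejects. This is only a sketch under stated assumptions: ExponentParser and TryParseExponent are hypothetical names, the helper works on a string rather than the char* cursor used in ParseNumber, and it is not what the PR implements.

```csharp
using System;

public static class ExponentParser
{
    // Hypothetical helper: accumulates decimal digits into an int and
    // reports failure instead of wrapping around on overflow.
    public static bool TryParseExponent(string digits, out int exp)
    {
        exp = 0;
        if (string.IsNullOrEmpty(digits))
        {
            return false;
        }

        foreach (char ch in digits)
        {
            if (ch < '0' || ch > '9')
            {
                exp = 0;
                return false;
            }

            int digit = ch - '0';

            // Case 1: exp * 10 overflows only when exp > int.MaxValue / 10.
            // Case 2: the add overflows only when exp == int.MaxValue / 10
            //         and the digit is greater than 7 (int.MaxValue % 10).
            if (exp > int.MaxValue / 10 ||
                (exp == int.MaxValue / 10 && digit > int.MaxValue % 10))
            {
                exp = 0;
                return false;
            }

            exp = exp * 10 + digit;
        }

        return true;
    }
}
```

With this shape, "2147483647" parses to int.MaxValue, while "2147483648" and "21474836492" are rejected. Whether the extra branch is worth it over bailing out at 214748364 is the complexity trade-off mentioned in the suggested code's comment.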

{
exp = 9999;
while (ch >= '0' && ch <= '9')
{
ch = ++p < strEnd ? *p : '\0';
}
}
} while (ch >= '0' && ch <= '9');
} while ((ch >= '0') && (ch <= '9') && (exp < int.MaxValue / 10));

if ((ch >= '0') && (ch <= '9'))
{
// We still had remaining characters but bailed early because
// the exponent was going to overflow. If exp is exactly 214748364
// then we can technically handle one more character being 0-9
Member

Suggested change
- // then we can technically handle one more character being 0-9
+ // then we can technically handle one more character being 0-7

2147483647 is int.MaxValue, so the last digit, if we handled it, could only be 0-7.

// but the additional complexity might not be worthwhile.

Debug.Assert(exp >= 214748364);
return false;
}

if (negExp)
{
exp = -exp;
@@ -439,6 +439,25 @@ public static void CustomFormatPerMille()
RunCustomFormatToStringTests(s_random, "#\u2030000000", CultureInfo.CurrentCulture.NumberFormat.NegativeSign, 6, PerMilleSymbolFormatter);
}

public static IEnumerable<object[]> RunFormatScientificNotationToBigIntegerAndViceVersaData()
Member

@tannergooding Nov 10, 2021

There should also be a test covering 2147483639, 2147483647, and a couple of arbitrary strings bigger than that (such as 21474836492).

The latter two should end up returning false for TryParse and throwing an OverflowException, for parity with what int.TryParse does.
The first case will probably result in an OutOfMemoryException but otherwise should be parsed as a valid set of integer digits.

Member

> The latter two should end up returning false for TryParse and throwing an OverflowException, for parity with what int.TryParse does.

Actually, after thinking it over and talking with @bartonjs and @GrabYourPitchforks, I think that we should always throw PlatformNotSupportedException in this case as well. It is more consistent and allows us to expand in the future without an "additional" breaking change. It also avoids returning false for a "technically valid" big integer that we simply don't support.

{
yield return new object[] { "1E+1000", "1E+1000" };
yield return new object[] { "1E+1001", "1E+1001" };
}

[Theory]
[MemberData(nameof(RunFormatScientificNotationToBigIntegerAndViceVersaData))]
public static void RunFormatScientificNotationToBigIntegerAndViceVersa(string testingValue, string expectedResult)
{
BigInteger parsedValue;
string actualResult;

parsedValue = BigInteger.Parse(testingValue, NumberStyles.AllowExponent);
actualResult = parsedValue.ToString("E0");

Assert.Equal(expectedResult, actualResult);
}

private static void RunSimpleProviderToStringTests(Random random, string format, NumberFormatInfo provider, int precision, StringFormatter formatter)
{
string test;