Fix BigInteger.Parse returns bogus value for exponents above 1000 #55397

Closed

Conversation

Maximys
Contributor

@Maximys Maximys commented Jul 9, 2021

Fixes issue 17296

@ghost

ghost commented Jul 9, 2021

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixing invalid behaviour of BigInteger.Parse and BigInteger.ToString.

Author: Maximys
Assignees: -
Labels:

area-System.Numerics

Milestone: -

@terrajobst terrajobst added the community-contribution label Jul 19, 2021
@jeffhandley jeffhandley requested a review from pgovind July 24, 2021 04:49
@jeffhandley
Member

@pgovind This PR is assigned to you for follow-up/decision before the RC1 snap.

@danmoseley danmoseley changed the title from "Fix/17296 biginteger tostring fixes" to "Fix BigInteger.Parse returns bogus value for exponents above 1000" Jul 24, 2021
@pgovind

pgovind commented Aug 2, 2021

Thanks for the contribution here @Maximys. Due to it being so late in the cycle, this is going to miss out on the .NET 6 release. We'll instead be able to review and merge it after the RC1 snap when we begin accepting more broad changes into main again (ETA: 3-4 weeks)

@pgovind pgovind added this to the Future milestone Aug 2, 2021
@Maximys
Contributor Author

Maximys commented Aug 3, 2021

@pgovind, OK, thank you so much, I'll wait.

@jeffhandley jeffhandley modified the milestones: Future, 7.0.0 Sep 4, 2021
@jeffhandley
Member

@Maximys We're still trying to finish up a few things that are part of .NET 6.0 RC2. It'll be another week or two before we can review this thoroughly. I've set the milestone to 7.0.0 to indicate our intention of having this in that release though. Thanks for your contribution and patience!

@pgovind pgovind assigned tannergooding and unassigned pgovind Sep 30, 2021
@Maximys
Contributor Author

Maximys commented Oct 2, 2021

@pgovind, I have just committed fixes addressing the review. Can you check them?

@tannergooding
Member

Is there a smaller fix which only addresses the bogus values for exponents above 1000 issue without also refactoring all the parsing logic simultaneously?

This feels like it should be two separate PRs.

@Maximys Maximys force-pushed the fix/17296-biginteger-tostring-fixes branch from 153b1ca to e22878b Compare October 9, 2021 04:33
@Maximys
Contributor Author

Maximys commented Oct 9, 2021

@tannergooding, I have just removed the additional commits and changes from my branch.

Member

@jeffhandley jeffhandley left a comment

Thanks for minimizing the changes, @Maximys.

@tannergooding - how does this look to you now? I honestly don't understand what the previous logic was trying to accomplish.

@@ -517,14 +517,6 @@ private static unsafe bool ParseNumber(ref char* str, char* strEnd, NumberStyles
{
exp = exp * 10 + (ch - '0');
ch = ++p < strEnd ? *p : '\0';
if (exp > 1000)
Member

Sorry for the delay in review here.

This isn't quite the right fix and likely introduces a different bug. For example, if you have an exponent of 21474836492, it will successfully parse the first 9 digits, giving 214748364. Multiplying by 10 gives 2147483640, and adding 9 overflows to -2147483647. Multiplying that by 10 wraps around to exactly 10, and adding 2 gives 12.

Both of these last two digits are problematic because of this and so we need something that accounts for overflow here and surfaces it as invalid input/failure to parse.
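
The wraparound described above can be reproduced with a minimal sketch. `AccumulateUnchecked` is a hypothetical helper written for illustration, not code from the PR; it assumes the digits are accumulated into an `int` with overflow checking disabled, as in the loop under review.

```csharp
using System;

// Hypothetical helper (not from the PR): accumulates decimal digits into an int
// the same way the parser loop does, with overflow checking explicitly disabled.
static int AccumulateUnchecked(string digits)
{
    int exp = 0;
    foreach (char ch in digits)
    {
        exp = unchecked(exp * 10 + (ch - '0'));
    }
    return exp;
}

Console.WriteLine(AccumulateUnchecked("214748364"));   // 214748364 (first 9 digits, no overflow yet)
Console.WriteLine(AccumulateUnchecked("2147483649"));  // -2147483647 (adding 9 wraps around)
Console.WriteLine(AccumulateUnchecked("21474836492")); // 12 (a second wrap yields a plausible-looking value)
```

The final value, 12, is the dangerous part: it looks like a valid small exponent, so nothing downstream can tell that the input was out of range.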

Contributor Author

@tannergooding, I agree with you in this case. I have an idea for handling these cases.

Contributor Author

@tannergooding, I have just repeated my summer investigation into the BigInteger algorithms, and I remember that there is a BigNumber class with a BigNumberBuffer inside it, and this buffer contains a scale of type int.
BigNumber provides some functionality to BigInteger, and if the int type with its MaxValue is not adequate, then the .NET team should create a separate task to change it.
But first, let's make a little calculation:
itemSize * arraySize = 32 bit * 2147483647 = 68719476704 bit = 8589934588 byte ≈ 8388608 KB = 8192 MB = 8 GB
Thus, theoretically, the current architecture can provide 8 GB for storing one BigInteger! I think this is more than enough.
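
As a quick check of the arithmetic in this comment, here is a small sketch. It takes the comment's own premise at face value (one 32-bit element per slot and `int.MaxValue` slots); the variable names are illustrative only.

```csharp
using System;

// Assumption taken from the comment above: 32 bits per element,
// and at most int.MaxValue elements.
const long BitsPerElement = 32;

long totalBits = BitsPerElement * int.MaxValue; // 68719476704 bits
long totalBytes = totalBits / 8;                // 8589934588 bytes
double gib = totalBytes / (1024.0 * 1024.0 * 1024.0);

// The result is just under 8 GiB (8 GiB would be 8589934592 bytes).
Console.WriteLine($"{totalBits} bits = {totalBytes} bytes, about {gib:F2} GiB");
```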

Member

@tannergooding tannergooding Nov 5, 2021

I think we can simply make this be:

int exp = 0;
do
{
    exp = exp * 10 + (ch - '0');
    ch = ++p < strEnd ? *p : '\0';
} while ((ch >= '0') && (ch <= '9') && (exp < (int.MaxValue / 10))); // exp < 214748364

if ((ch >= '0') && (ch <= '9'))
{
    // We still had remaining characters but bailed early because
    // the exponent was going to overflow. If exp is exactly 214748364
    // then we can technically handle one more character being 0-7
    // but the additional complexity might not be worthwhile.

    Debug.Assert(exp >= 214748364);
    return false;
}

Member

@tannergooding tannergooding Nov 5, 2021

The point is to account for overflow and ensure its handled so we can throw for Parse and return false for TryParse.

This trivially handles it by accounting for both cases of overflow:

  1. due to exp * 10 which can only overflow for exp > (int.MaxValue / 10)
  2. due to the subsequent add which can only overflow for exp == (int.MaxValue / 10) and ch > '7'
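
The two overflow cases listed above can also be checked exactly, digit by digit. The following is a minimal sketch under stated assumptions: `TryAccumulateExponent` is a hypothetical helper operating on a `string`, not the pointer-based loop from the PR, and unlike the suggestion above it accepts every value up to and including `int.MaxValue` rather than bailing out early.

```csharp
using System;

// Hypothetical helper (not from the PR): accumulates decimal digits into an int
// and reports failure instead of overflowing.
static bool TryAccumulateExponent(string digits, out int exp)
{
    exp = 0;
    foreach (char ch in digits)
    {
        if (ch < '0' || ch > '9')
        {
            exp = 0;
            return false; // not a decimal digit
        }

        int digit = ch - '0';

        // Case 1: exp * 10 would overflow.
        // Case 2: exp * 10 fits, but adding the digit would overflow
        //         (int.MaxValue % 10 is 7, so only digits 0-7 are safe here).
        if (exp > int.MaxValue / 10 ||
            (exp == int.MaxValue / 10 && digit > int.MaxValue % 10))
        {
            exp = 0;
            return false;
        }

        exp = exp * 10 + digit;
    }

    return digits.Length > 0;
}

Console.WriteLine(TryAccumulateExponent("2147483647", out int max));  // True, max == int.MaxValue
Console.WriteLine(TryAccumulateExponent("2147483648", out _));        // False (case 2)
Console.WriteLine(TryAccumulateExponent("21474836492", out _));       // False (case 1)
```

Whether the extra branch is worth it compared with the simpler early bail-out is a judgment call; the sketch only shows that both cases can be covered without relying on wraparound.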

@Maximys Maximys force-pushed the fix/17296-biginteger-tostring-fixes branch from e22878b to 494acf8 Compare November 6, 2021 06:38
{
// We still had remaining characters but bailed early because
// the exponent was going to overflow. If exp is exactly 214748364
// then we can technically handle one more character being 0-9
Member

Suggested change
- // then we can technically handle one more character being 0-9
+ // then we can technically handle one more character being 0-7

2147483647 is int.MaxValue so the last digit, if we handled it, could only be 0-7

@@ -439,6 +439,25 @@ public static void CustomFormatPerMille()
RunCustomFormatToStringTests(s_random, "#\u2030000000", CultureInfo.CurrentCulture.NumberFormat.NegativeSign, 6, PerMilleSymbolFormatter);
}

public static IEnumerable<object[]> RunFormatScientificNotationToBigIntegerAndViceVersaData()
Member

@tannergooding tannergooding Nov 10, 2021

There should also be a test covering 2147483639, 2147483647, and a couple arbitrary strings bigger than that (such as 21474836492).

The latter two should end up returning false for TryParse and throwing an OverflowException for Parse, for parity with what int.TryParse and int.Parse do.
The first case will probably result in an OutOfMemoryException but otherwise should be parsed as a valid set of integer digits.
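
For reference, the `int` behavior that this comment uses as the parity target can be observed with a short sketch. It exercises only `System.Int32`; it makes no claim about what `BigInteger` does before or after this PR.

```csharp
using System;

// Only System.Int32 is exercised here; nothing below touches BigInteger.
string tooBig = "21474836492"; // larger than int.MaxValue (2147483647)

bool parsed = int.TryParse(tooBig, out int result);
Console.WriteLine($"TryParse succeeded: {parsed}"); // TryParse succeeded: False

bool threwOverflow = false;
try
{
    int.Parse(tooBig);
}
catch (OverflowException)
{
    threwOverflow = true;
}
Console.WriteLine($"Parse threw OverflowException: {threwOverflow}"); // Parse threw OverflowException: True

// The boundary value itself still parses.
bool maxParsed = int.TryParse("2147483647", out int max);
Console.WriteLine($"int.MaxValue parsed: {maxParsed}"); // int.MaxValue parsed: True
```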

Member

The latter two should end up returning false for TryParse and throwing an OverflowException for Parse, for parity with what int.TryParse and int.Parse do.

Actually after thinking it over and talking with @bartonjs and @GrabYourPitchforks, I think that we should always throw PlatformNotSupportedException in this case as well. It is more consistent and allows us to expand in the future without an "additional" breaking change and avoids false being returned for a "technically valid" big integer that otherwise we simply don't support.

@tannergooding
Member

@Maximys are you still working on this?

No particular rush, just wanted to ensure that was the case since many people took a break over the holidays. There are still several pending comments above that would need to be resolved.

@tannergooding tannergooding added the needs-author-action label Jan 7, 2022
@ghost ghost added the no-recent-activity label Jan 21, 2022
@ghost

ghost commented Jan 21, 2022

This pull request has been automatically marked no recent activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no recent activity.

@ghost

ghost commented Feb 5, 2022

This pull request will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the pull request, but please note that it will be locked if it remains inactive for another 30 days.

@ghost ghost closed this Feb 5, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 7, 2022
@dakersnar
Contributor

@tannergooding Should I try to take this over and get it across the finish line? Do we still want this for 7.0?

@tannergooding
Member

It's definitely something worth fixing for early .NET 8 at the very least. We can always bar check for .NET 7

@dakersnar
Contributor

Sounds good, I'm going to cherry pick these commits and make a new PR that includes the requested tweaks.

This pull request was closed.
Labels
area-System.Numerics, community-contribution, needs-author-action, no-recent-activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BigInteger.Parse returns bogus value for exponents above 1000
6 participants