Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] test_decimal_round fails with DATAGEN_SEED=3 #9847

Open
thirtiseven opened this issue Nov 24, 2023 · 4 comments
Open

[BUG] test_decimal_round fails with DATAGEN_SEED=3 #9847

thirtiseven opened this issue Nov 24, 2023 · 4 comments
Assignees
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf

Comments

@thirtiseven
Copy link
Collaborator

Repro:

SPARK_RAPIDS_TEST_DATAGEN_SEED=3 ./integration_tests/run_pyspark_from_build.sh -k 'test_decimal_round'

(part of) results:

Row(round(a, 0)=7.759831900331442e+18, round(1.234, 2)=Decimal('1.23'), round(a, -1)=7.759831900331442e+18, round(a, 1)=7.759831900331442e+18, round(a, 2)=7.759831900331442e+18, round(a, 10)=7.759831900331442e+18)
-Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=0.0)
-Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=-2.1999999599842113e-09)
-Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=0.0)
+Row(round(a, 0)=-0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=-0.0, round(a, 1)=-0.0, round(a, 2)=-0.0, round(a, 10)=-0.0)
+Row(round(a, 0)=-0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=-0.0, round(a, 1)=-0.0, round(a, 2)=-0.0, round(a, 10)=-2.1999999599842113e-09)
+Row(round(a, 0)=-0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=-0.0, round(a, 1)=-0.0, round(a, 2)=-0.0, round(a, 10)=-0.0)
 Row(round(a, 0)=1.480257386189947e+17, round(1.234, 2)=Decimal('1.23'), round(a, -1)=1.480257386189947e+17, round(a, 1)=1.480257386189947e+17, round(a, 2)=1.480257386189947e+17, round(a, 10)=1.480257386189947e+17)
 Row(round(a, 0)=2348355944448.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=2348355944448.0, round(a, 1)=2348355944448.0, round(a, 2)=2348355944448.0, round(a, 10)=2348355944448.0)
-Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=-1.0999999799921056e-09)
-Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=0.00042851248872466385)
+Row(round(a, 0)=-0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=-0.0, round(a, 1)=-0.0, round(a, 2)=-0.0, round(a, 10)=-1.0999999799921056e-09)
+Row(round(a, 0)=0.0, round(1.234, 2)=Decimal('1.23'), round(a, -1)=0.0, round(a, 1)=0.0, round(a, 2)=0.0, round(a, 10)=0.0004285126051399857)

I seen many mismatched are from #9349, but I believe there will be some from #9350

@thirtiseven thirtiseven added bug Something isn't working ? - Needs Triage Need team to review and classify labels Nov 24, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Nov 28, 2023
@jlowe
Copy link
Member

jlowe commented Nov 29, 2023

The -0.0 vs 0.0 is not an actual mismatch in the test (it's using approx float comparison already), that's an artifact of the difftool used to produce the output when the test fails which is just looking at raw text and no longer doing the floating point approx comparison logic.

The real issue is more along the lines of #9350. In this specific case with DATAGEN_SEED=3, the problem is only with single-precision floating point value 6.121944898040965e-05f which gets rounded incorrectly up instead of down. I tracked this down to a problem in libcudf where the calculations for single-precision floating point rounding are forced to use single-precision floating point throughout which introduces enough error to cause the issue. Filed rapidsai/cudf#14528

@jlowe jlowe added cudf_dependency An issue or PR with this label depends on a new feature in cudf ? - Needs Triage Need team to review and classify labels Nov 29, 2023
@revans2
Copy link
Collaborator

revans2 commented Nov 29, 2023

@jlowe do you think that this might happen with double values too for rounding to larger numbers of digits? Right now we go up to 10, which exposed a problem for float. But should we try larger values for double?

@jlowe
Copy link
Member

jlowe commented Nov 29, 2023

do you think that this might happen with double values too for rounding to larger numbers of digits?

Yes, I'm sure that will happen at some point for specific values and rounding places, even for doubles. We'll won't be able to match Spark's behavior here if we stick with floating-point types, since Spark goes to BigDecimal and thus will have a lot more precision to wield. Fixing the cudf issue I filed for floats may give us enough additional precision to handle most (all?) of the FLOAT32 cases, but it won't do anything to improve the known limitations when rounding with doubles to high numbers of decimal places. We may want to do some experiments to know how many decimal places we can reliably round to and consider fallback to the CPU when we see rounding to a decimal place we cannot reliably match.

@wence-
Copy link

wence- commented Nov 30, 2023

We may want to do some experiments to know how many decimal places we can reliably round to and consider fallback to the CPU when we see rounding to a decimal place we cannot reliably match.

I think for float32 trying to round anything more than 7 decimal places (inclusive) might result in some "strange" results (even if you use double precision).

Consider a float32 value and its two closest neighbours:

x.1234567890 // decimal digits
1.00009977817535400391
1.00009989738464355469 // <-- we're trying to round this one
1.00010001659393310547

Rounding to six digits produces 1.00010001658383310547, rounding to seven digits reproduces the input (you might want to see 1.0000999 as the "correctly rounded in infinite precision" value). But none of the nearby floats are a good representation of the value rounded to 7 digits. So you'd need to produce a double result to represent the rounding "correctly" (whatever that means).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf
Projects
None yet
Development

No branches or pull requests

7 participants