Compute gcd with u64 instead of i64 because of overflows #11036

Merged
merged 4 commits into from
Jun 21, 2024

LorrensP-2158466
Contributor

Which issue does this PR close?

Closes #11031.

Rationale for this change

The issue mentioned 2 problems, I'll give an explanation here why this happened:

LCM depends on GCD, and the problems are indeed in GCD, so I expect that a fix to GCD will also fix LCM.

Problem 1

This happens because the gcd is computed with i64, even though a gcd is never negative. The Rust docs also state that i64::MIN.wrapping_abs() == i64::MIN, so in the error case mentioned, gcd_compute proceeds like this:

  • Initial values
    a = i64::MAX, b = i64::MIN
  • After wrapping_abs:
    a = still i64::MAX, b = still i64::MIN
  • After the binary shifts:
    a = still i64::MAX, b = -1 (here lies the problem)
  • we swap a and b because a > b:
    a = -1, b = i64::MAX
  • Now we subtract a from b:
    i64::MAX - (-1) = i64::MAX + 1 (here is our integer overflow)

And because LCM depends on GCD, it also crashed similarly.
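
For illustration, the trace above can be replayed in a few lines of Rust. This is a minimal sketch that follows the steps as listed in this description, not the actual DataFusion code; `replay_overflow_trace` is a hypothetical name, and `checked_sub` is used only so the overflow shows up as `None` instead of a panic.

```rust
/// Replays the steps listed above on the failing input and returns the result
/// of the final subtraction; `None` means the subtraction overflowed.
/// (Hypothetical helper, not part of DataFusion.)
fn replay_overflow_trace() -> Option<i64> {
    // Initial values from the failing case.
    let mut a = i64::MAX;
    let mut b = i64::MIN;

    // wrapping_abs leaves i64::MIN unchanged because +2^63 does not fit in i64.
    a = a.wrapping_abs();
    b = b.wrapping_abs();

    // Strip trailing zero bits. The right shift on i64 is arithmetic and keeps
    // the sign bit, so i64::MIN >> 63 yields -1 rather than 1.
    a >>= a.trailing_zeros();
    b >>= b.trailing_zeros();

    // a > b, so the two values are swapped: a = -1, b = i64::MAX.
    if a > b {
        std::mem::swap(&mut a, &mut b);
    }

    // i64::MAX - (-1) does not fit in i64; checked_sub reports that as None.
    b.checked_sub(a)
}
```

With a plain `b - a` instead of `checked_sub`, a debug build would panic at this point.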

Problem 2

The cause is exactly the same as in problem 1: wrapping_abs does not change the value of i64::MIN, so the function stalls because it keeps swapping a and b, which hold either -1 or 2.

These edge cases cause overflows or infinite loops, which are bugs, so we fix them.
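
Both problems come down to how the absolute value of i64::MIN is taken. As a small sketch (the helper name `abs_variants` is made up for this example), the standard-library methods `wrapping_abs` and `unsigned_abs` can be compared side by side:

```rust
/// Returns the wrapping absolute value (still an i64) next to the
/// unsigned absolute value (a u64) of the same input.
/// (Hypothetical helper for illustration only.)
fn abs_variants(x: i64) -> (i64, u64) {
    (x.wrapping_abs(), x.unsigned_abs())
}
```

For ordinary inputs the two agree, but for i64::MIN only the u64 result holds the true magnitude 2^63.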

Solution

I propose we do the computations with u64 and, at the end, cast the result back to i64; this is safe since the input is never bigger than i64::MAX or smaller than i64::MIN.
This fixes both GCD and LCM.
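
The idea can be sketched roughly as follows. This is not the code in the diff: `gcd_u64` and `gcd_i64_sketch` are hypothetical names, and the loop is a generic binary (Stein) GCD written under the assumption that it resembles the existing implementation.

```rust
/// Binary (Stein) GCD on unsigned magnitudes. Hypothetical helper.
fn gcd_u64(mut a: u64, mut b: u64) -> u64 {
    if a == 0 {
        return b;
    }
    if b == 0 {
        return a;
    }
    // Number of factors of two shared by both inputs.
    let shift = (a | b).trailing_zeros();
    a >>= a.trailing_zeros();
    loop {
        // b is non-zero here, so stripping its trailing zeros is well defined.
        b >>= b.trailing_zeros();
        if a > b {
            std::mem::swap(&mut a, &mut b);
        }
        // a <= b holds, so the unsigned subtraction cannot underflow.
        b -= a;
        if b == 0 {
            return a << shift;
        }
    }
}

/// Takes the magnitudes as u64, runs the unsigned GCD, and casts back to i64.
fn gcd_i64_sketch(x: i64, y: i64) -> i64 {
    gcd_u64(x.unsigned_abs(), y.unsigned_abs()) as i64
}
```

In this sketch `gcd_i64_sketch(i64::MAX, i64::MIN)` terminates and returns 1. The final `as i64` cast still wraps when the result is 2^63, which only happens when both magnitudes are 0 or 2^63 and at least one of them is 2^63.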

What changes are included in this PR?

Compute GCD with u64 and, after the computation, cast the result back to i64.

Are these changes tested?

Tested this against the values used in the issue and also added an extra test case for this.
The other tests also passed.

Are there any user-facing changes?

No


if a == 0 {
-    return b;
+    return b as i64;
Contributor

@jayzhan211 jayzhan211 Jun 21, 2024

If we move unsigned_abs() below, we don't need to cast

Contributor Author

Ah yes, nice catch! Thanks!

// from issue https://github.com/apache/datafusion/issues/11031 we know that the previous implementation could
// not handle cases where one or both of the inputs were i64::MAX or i64::MIN coupled with other values (neg or pos)
#[test]
fn test_gcd_i64_maxes() {
Member

I think these tests can be implemented as slt.

@jonahgao
Member

jonahgao commented Jun 21, 2024

The following query seems to return an incorrect result.

DataFusion CLI v39.0.0
> select gcd(-9223372036854775808, -9223372036854775808);
+--------------------------------------------------------------+
| gcd(Int64(-9223372036854775808),Int64(-9223372036854775808)) |
+--------------------------------------------------------------+
| -9223372036854775808                                         |
+--------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.007 seconds.

In PostgreSQL:

postgres=# select gcd(-9223372036854775808, -9223372036854775808);
ERROR:  bigint out of range

Update: I tested the current main branch, and the result is also incorrect. Therefore, maybe we can address this case later.

@LorrensP-2158466
Contributor Author

LorrensP-2158466 commented Jun 21, 2024

I think that is expected if we want to return i64. Just before we return from the loop, we have the following values: a = 1, b = 0 and shift = 63. Doing 1 << 63 will result in i64::MIN, which is the final result. But the gcd can't be negative, so I think the only way to fix that issue is to return a u64, which will alter the API. Should we open up a separate issue and fix that after this one? I can also update the original comment and title to just say that we fix the GCD function.

I don't know why postgres' implementation results in an overflow, because when I change the output to u64 in DataFusion, it works as expected.
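
The wrap-around described in this comment can be reproduced in isolation. The following is a minimal sketch with a made-up helper name (`shifted_back`); it only mirrors the final `a << shift` step followed by the cast to i64.

```rust
/// Performs the final shift on u64 and shows the value both before and
/// after it is cast to i64. (Hypothetical helper for illustration only.)
fn shifted_back(a: u64, shift: u32) -> (u64, i64) {
    let wide = a << shift;
    // `as i64` reinterprets the bit pattern, so 2^63 becomes i64::MIN.
    (wide, wide as i64)
}
```

For a = 1 and shift = 63, the u64 value is 9223372036854775808, while the i64 value is i64::MIN.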

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 21, 2024
@jonahgao
Member

> so i think the only way to fix that issue is to return a u64, which will alter the API.

@LorrensP-2158466 How about returning an overflow error when the final result cannot fit into i64? This way, it won't cause an API change. Something like this:

i64::try_from(a << shift).map_err(|_| exec_datafusion_err!("Overflow in gcd"))

I think we could have a separate fix for it.
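
A self-contained version of that suggestion might look like the sketch below. It uses a plain `String` error instead of `exec_datafusion_err!` so that it compiles without DataFusion, and `gcd_result_to_i64` is a hypothetical name rather than a function from the codebase.

```rust
use std::convert::TryFrom;

/// Converts the final `a << shift` value to i64, returning an error instead
/// of wrapping when it does not fit. (Hypothetical helper; the real fix would
/// presumably return a DataFusion error type.)
fn gcd_result_to_i64(a: u64, shift: u32) -> Result<i64, String> {
    let wide = a << shift;
    i64::try_from(wide).map_err(|_| format!("Overflow in gcd: {wide} does not fit in i64"))
}
```

With a = 1 and shift = 63 this returns an error rather than i64::MIN, and the function signature still returns i64 on success.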

Member

@jonahgao jonahgao left a comment

LGTM, thanks @LorrensP-2158466

@LorrensP-2158466
Contributor Author

> How about returning an overflow error when the final result cannot fit into i64? This way, it won't cause an API change. Something like this:

Yes, that is much nicer.

> I think we could have a separate fix for it.

I agree. Do you want me to open it, or do you want to do it?

Also, thanks @2010YOUY01! Your tool caught some weird edge cases.

@jonahgao
Member

> I agree, do you want me to open it up or do you want to do it?

Just go ahead; it would be much appreciated, @LorrensP-2158466.

@jonahgao jonahgao merged commit 8aad936 into apache:main Jun 21, 2024
23 checks passed
@LorrensP-2158466 LorrensP-2158466 deleted the compute_gcd_with_u64 branch June 21, 2024 16:54
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024
* compute gcd with unsigned ints

* add test for the i64::MAX cases

* move unsigned_abs below zero test to remove unnecessary casts

* add slt test for gcd on max values instead of unit tests
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* compute gcd with unsigned ints

* add test for the i64::MAX cases

* move unsigned_abs below zero test to remove unnecessary casts

* add slt test for gcd on max values instead of unit tests
Labels
sqllogictest SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Bugs in LCM/GCD scalar functions (found by SQLancer)
3 participants