gh-119372: recover inf's and zeros in _Py_c_quot #119457

Merged (8 commits) into python:main on Jun 29, 2024

Conversation

@skirpichev (Member) commented May 23, 2024

In some cases that were previously computed as (nan+nanj), we can now recover meaningful component values in the result; see e.g. the C11 standard, Annex G.5.2, routine _Cdivd():

>>> from cmath import inf, infj, nanj
>>> (1+1j)/(inf+infj)  # was (nan+nanj)
0j
>>> (inf+nanj)/complex(2**1000, 2**-1000)
(inf-infj)

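For context, here is a sketch of that recovery step in the spirit of the C11 Annex G.5.2 _Cdivd() example: a scaled Smith-style division followed by a fix-up when the naive result comes out as (nan+nanj). The helper name cdiv_annex_g and the exact formatting are illustrative (the standard's example reproduced from memory), not the code added by this PR; a, b are the numerator's components and c, d the denominator's.

#include <complex.h>
#include <math.h>

/* Smith-style division with scaling, followed by recovery of infinities
   and zeros when the naive result is (nan+nanj); modeled on the C11
   Annex G.5.2 _Cdivd() example. */
double complex
cdiv_annex_g(double complex z, double complex w)
{
    double a = creal(z), b = cimag(z);
    double c = creal(w), d = cimag(w);
    double denom, x, y;
    int ilogbw = 0;
    double logbw = logb(fmax(fabs(c), fabs(d)));

    if (isfinite(logbw)) {
        ilogbw = (int)logbw;
        c = scalbn(c, -ilogbw);
        d = scalbn(d, -ilogbw);
    }
    denom = c*c + d*d;
    x = scalbn((a*c + b*d) / denom, -ilogbw);
    y = scalbn((b*c - a*d) / denom, -ilogbw);

    /* Recover infinities and zeros that computed as nan+nanj; the only
       cases are nonzero/zero, infinite/finite and finite/infinite. */
    if (isnan(x) && isnan(y)) {
        if (denom == 0.0 && (!isnan(a) || !isnan(b))) {
            x = copysign(INFINITY, c) * a;
            y = copysign(INFINITY, c) * b;
        }
        else if ((isinf(a) || isinf(b)) && isfinite(c) && isfinite(d)) {
            /* infinite numerator: keep only its direction */
            a = copysign(isinf(a) ? 1.0 : 0.0, a);
            b = copysign(isinf(b) ? 1.0 : 0.0, b);
            x = INFINITY * (a*c + b*d);
            y = INFINITY * (b*c - a*d);
        }
        else if (isinf(logbw) && isfinite(a) && isfinite(b)) {
            /* infinite denominator: the quotient is a zero */
            c = copysign(isinf(c) ? 1.0 : 0.0, c);
            d = copysign(isinf(d) ? 1.0 : 0.0, d);
            x = 0.0 * (a*c + b*d);
            y = 0.0 * (b*c - a*d);
        }
    }
    return CMPLX(x, y);
}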
@serhiy-storchaka (Member) left a comment

LGTM.

Lib/test/test_complex.py (outdated diff)
# test recovery of infs if the numerator has infinities and the denominator is finite
self.assertComplexesAreIdentical(complex('(inf-infj)')/(1+0j), complex('(inf-infj)'))
self.assertComplexesAreIdentical(complex('(inf+infj)')/(0.0+1j), complex('(inf-infj)'))
self.assertComplexesAreIdentical(complex('(nan+infj)')/complex(2**1000, 2**-1000), complex('(inf+infj)'))
Member

I wonder how you came up with this test? Its result looks paradoxical at first glance, but it makes sense after thinking about it a bit more.

@skirpichev (Member Author) May 23, 2024

That was just a coverage test that triggers underflow in ratio. The hard part, as you noted, was figuring out why the result is true. I checked the exact answer in Diofant (it will probably work in SymPy too), e.g. for the real part:

In [43]: x = Symbol('x', extended_real=True)
In [44]: y = Symbol('y', positive=True, finite=True)
In [45]: expr = re(expand((x + I*oo)*(y-I/y)))
In [46]: expr.subs({x: Dummy(real=True)})
Out[46]: ∞
In [47]: expr.subs({x: Dummy(positive=True)})
Out[47]: ∞
In [48]: limit(expr, x, -oo)
Out[48]: ∞
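(A sketch of why this holds under the Annex G-style recovery rule, using the $a, b, c, d$ naming of the _Cdivd() example: the infinite numerator is reduced to its direction, $a \mapsto \operatorname{copysign}(0, a)$ for the nan part and $b \mapsto 1$, after which $x = \infty \cdot (a c + b d) = \infty \cdot 2^{-1000} = +\infty$ and $y = \infty \cdot (b c - a d) = \infty \cdot 2^{1000} = +\infty$, i.e. (inf+infj).)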

BTW, Mathematica does braver things, but it's hardly correct:

In[14]:= Hold[(x + I*Infinity)*(2^10-I/2^10)]  (* no assumptions on x ! *)

Out[14]= Hold[(x + I Infinity) (2^10 - I/2^10)]

In[15]:= ReleaseHold[%]

Out[15]= ((1 + 1048576 I) Infinity)/Sqrt[1099511627777]

In[16]:= ReleaseHold[%14/.{x->1-I*Infinity}]

Infinity::indet: Indeterminate expression -I Infinity + I Infinity encountered.

Out[16]= Indeterminate

However, it looks like I should restore the remark about corner cases. Not all of them are handled by this code (it has no chance to, because one component is computed as non-nan). Here is an example:

>>> complex('(1+infj)')/complex(2**1000, 2**-1000)
(nan+infj)

This is handled correctly by the C code for both gcc and clang, but it turns out they use a different algorithm for true division (not Smith's method, but rather something like Priest's algorithm; see the full _Cdivd() code before the handling of the nan+nanj case). Maybe it's worth adopting that, but it's probably out of the scope of this PR.

Edit:
An example where the current algorithm fails (bad imaginary part) is $(2^{1023}+i 2^{-1023})/(2^{677}+i 2^{-677})\approx 2^{346}-i 2^{-1008}$ (taken from M. Baudin, "A Robust Complex Division in Scilab"). gcc v12.2 (and probably clang), which follows the example in the C standard, produces the correct answer.

Example code and a "fixed" Smith's algorithm (which uses extended precision to compute the ratio):
#include <complex.h>
#include <stdio.h>
#include <math.h>

/* CPython's version */
double complex
tdiv_smith(double complex a, double complex b)
{
    const double abs_breal = fabs(creal(b));
    const double abs_bimag = fabs(cimag(b));
    if (abs_breal >= abs_bimag) {
        if (abs_breal != 0.0) {
            const double ratio = cimag(b) / creal(b);
            const double denom = creal(b) + cimag(b) * ratio;
            return CMPLX((creal(a) + cimag(a) * ratio) / denom,
                         (cimag(a) - creal(a) * ratio) / denom);
        }
    }
    else if (abs_bimag >= abs_breal) {
        const double ratio = creal(b) / cimag(b);
        const double denom = creal(b) * ratio + cimag(b);
        return CMPLX((creal(a) * ratio + cimag(a)) / denom,
                     (cimag(a) * ratio - creal(a)) / denom);
    }
    return CMPLX(NAN, NAN);
}

/* Smith's method again, but the ratio is computed in extended (long double) precision */
double complex
tdiv_smith2(double complex a, double complex b)
{
    const double abs_breal = fabs(creal(b));
    const double abs_bimag = fabs(cimag(b));
    if (abs_breal >= abs_bimag) {
        if (abs_breal != 0.0) {
            const long double ratio = (long double) cimag(b) / (long double) creal(b);
            const double denom = creal(b) + cimag(b) * ratio;
            return CMPLX((creal(a) + cimag(a) * ratio) / denom,
                         (cimag(a) - creal(a) * ratio) / denom);
        }
    }
    else if (abs_bimag >= abs_breal) {
        const long double ratio = (long double) creal(b) / (long double) cimag(b);
        const double denom = creal(b) * ratio + cimag(b);
        return CMPLX((creal(a) * ratio + cimag(a)) / denom,
                     (cimag(a) * ratio - creal(a)) / denom);
    }
    return CMPLX(NAN, NAN);
}

int
main()
{
    /* Baudin's example: (2^1023 + i*2^-1023) / (2^677 + i*2^-677) ~= 2^346 - i*2^-1008 */
    double complex a = CMPLX(0x1p+1023, 0x1p-1023);
    double complex b = CMPLX(0x1p677, 0x1p-677);
    double complex z;
    z = a / b;
    printf("%e%+ei\n", creal(z), cimag(z));
    z = tdiv_smith(a, b);
    printf("%e%+ei\n", creal(z), cimag(z));
    z = tdiv_smith2(a, b);
    printf("%e%+ei\n", creal(z), cimag(z));
    return 0;
}
$ cc -std=c11 a.c -lm && ./a.out
1.433437e+104-3.645561e-304i
1.433437e+104+0.000000e+00i
1.433437e+104-3.645561e-304i

Member

It's quite odd to see NaNs disappearing in an arithmetic operation on operands that include NaNs. However, that seems unavoidable if we want to follow the C standard Annex G semantics, since its rules imply that complex(inf, nan) should be regarded as a complex infinity, and then the division of a finite complex number by a complex infinity should be a complex zero (so can't have any NaN components).
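A minimal sketch of that consequence, assuming a compiler and libm that implement Annex G (e.g. recent gcc or clang with the usual complex-division runtime); the operands are chosen to match the case described above, and the expected output (a zero for both components) is an assumption about such an implementation, not something this PR tests:

#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* finite / (infinity with a nan component): under Annex G the
       denominator counts as a complex infinity, so the quotient is a zero */
    double complex z = CMPLX(1.0, 1.0);
    double complex w = CMPLX(INFINITY, NAN);
    double complex q = z / w;
    printf("%g%+gi\n", creal(q), cimag(q));
    return 0;
}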

Member Author

It's quite odd to see NaNs disappearing in an arithmetic operation on operands that include NaNs.

I think that could reasonably be interpreted only as shown above, i.e. a nan in a component means "anything from the extended real line".

since its rules imply that complex(inf, nan) should be regarded as a complex infinity

I don't think there is such a thing in the C standard. It has two types of infinities: one where one component is finite and nonzero and the other is infinite, and another (covering a few cases) that is a directed infinity (the phase is finite and the absolute value is infinite), i.e. (1+infj) and (inf+infj), respectively. That brings an interesting clash with CASes like Mathematica, which have no notion of the first type of infinity. E.g. ArcSin[Infinity] is just -I Infinity, c.f. cmath.asin(). See mpmath/mpmath#793

But CPython already adopts Annex G semantics in this respect, via cmath.isinf() and cmath.isfinite().

@mdickinson, what do you think about changing Smith's algorithm to something closer to the C standard?

Member

@skirpichev

I don't think there is such thing in the C standard.

Take a look at Annex G. Quoting from §G.3 "Conventions" in C99, paragraph 1:

A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a NaN).

The language is slightly changed in the most recent publicly available draft of C23 (n3096.pdf):

A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a quiet NaN).
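This classification is also visible through cabs(): Annex F specifies hypot(±inf, y) = +inf even when y is a NaN, so a value like complex(inf, nan) already measures as infinite. A small sketch, assuming an IEEE 754/Annex F conforming libm:

#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* at least one infinite part => the value is regarded as an infinity,
       even though the other part is a quiet NaN */
    double complex w = CMPLX(INFINITY, NAN);
    printf("%d\n", isinf(cabs(w)));               /* expected: nonzero (true) */
    printf("%d\n", isinf(hypot(INFINITY, NAN)));  /* expected: nonzero (true) */
    return 0;
}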

@mdickinson (Member) left a comment

Changes LGTM, and the new addition looks clean. This brings us in line with C's Annex G and the explicit requirements in §G.5.1p4 (n3096.pdf).

I'm a bit worried about backwards compatibility. This is admittedly a corner case, and I agree that the new semantics are better than the old ones, but it's also changing behaviour that's been there for decades.

I'm not too worried that there's "real" code out there whose mathematical correctness relies on the old division semantics - that seems unlikely. But it does seem likely that this change could break code that's relying on reproducing the Python semantics. The most obvious example would be the behaviour of the NumPy complex types - the NumPy folks probably have tests that ensure that complex128 behaves the same as complex in corner cases.

It may be worth checking the NumPy behaviour here - if complex128 currently has the same behaviour as Python's complex in these corner cases, then we could open an issue on the NumPy tracker alerting them to the upcoming change and giving them the chance to complain.

@picnixz (Contributor) commented May 28, 2024

My 2 cents:

the NumPy folks probably have tests that ensure that complex128 behaves the same as complex in corner cases.

Here are some relevant places where the complex128 tests take place:

https://github.com/numpy/numpy/blob/main/numpy/_core/tests/test_umath.py#L632

(plus the code above and below)

Most of the tests for basic operations are in https://github.com/numpy/numpy/blob/main/numpy/_core/tests/test_umath.py, but the corner cases tested there are not as exhaustive as here.

Technically, if the corner-case handling changes, then NumPy (by default) would just raise errors. However, with different error-handler flags, you indeed might have breaking changes where user code incorrectly checks that something is nan but does not check that something is inf (or vice versa). I think the more plausible breaking case is (1+1j)/(cmath.inf+cmath.infj), which currently gives nan+nanj instead of 0.

@skirpichev (Member Author)

@mdickinson, I'm not sure that the NumPy people are trying to reproduce CPython's complex behaviour; here is the code:
https://github.com/numpy/numpy/blob/012e63d8b04636bc8e0a7b05312e5a8cc985f727/numpy/_core/src/npymath/npy_math_complex.c.src#L92-L120
Yes, they are using Smith's algorithm, but that's all. In some "corner cases" the behaviour is already different:

>>> np.__version__
'1.24.4'
>>> np.complex128(1)/np.complex128(0)
<stdin>:1: RuntimeWarning: divide by zero encountered in scalar divide
<stdin>:1: RuntimeWarning: invalid value encountered in scalar divide
(inf+nanj)
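(For comparison, a sketch of what a C compiler implementing Annex G gives here: nonzero/zero is required to be an infinity, and gcc's runtime typically represents it as (inf+nanj), i.e. the same as NumPy rather than CPython's ZeroDivisionError. The expected output noted in the comment is an assumption about such an implementation:)

#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* nonzero / zero: Annex G requires an infinity; with gcc/glibc this
       typically prints inf for the real part and nan for the imaginary one,
       matching NumPy's (inf+nanj) rather than CPython's ZeroDivisionError */
    double complex z = CMPLX(1.0, 0.0);
    double complex w = CMPLX(0.0, 0.0);
    double complex q = z / w;
    printf("%g%+gi\n", creal(q), cimag(q));
    return 0;
}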

we could open an issue on the NumPy tracker alerting them to the upcoming change and giving them the chance to complain.

FYI: numpy/numpy#26560

@skirpichev (Member Author)

@serhiy-storchaka, it seems Mark is OK with this change. We have the NumPy issue for this PR; do we need anything else?

@serhiy-storchaka serhiy-storchaka merged commit 2cb84b1 into python:main Jun 29, 2024
36 checks passed
@skirpichev skirpichev deleted the fix-complex-tdiv-119372 branch June 29, 2024 08:34
mrahtz pushed a commit to mrahtz/cpython that referenced this pull request Jun 30, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024