[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

ravipra · 2021-06-17T17:50:49Z

Hi,

Thank you for the very useful package.

Been able to use it for the past one year successfully. But getting "invalid value encountered" runtime warnings today.

The errors are:

/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: invalid value encountered in multiply 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: invalid value encountered in add 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:256
: RuntimeWarning: overflow encountered in multiply 
  result += binom_val * lambda2_pow * nodes[:, [index]] 
/home/user/.local/lib/python3.9/site-packages/bezier/hazmat/curve_helpers.py:257
: RuntimeWarning: invalid value encountered in multiply 
  result *= lambda1

The code I am using is:

import bezier
import numpy as np
import sys

def bezier_values(vals):
  inds = list(range(len(vals)))
  nodes = np.asfortranarray([inds, vals])
  curve = bezier.Curve(nodes, len(vals)-1)

  def f(ind):
    s_val = ind/(len(inds)-1)
    return curve.evaluate(s_val).tolist()[1][0]

  return list(map(f, inds))

lines = []
for line in sys.stdin:
  lines.append(line)

vals = list(map(int, lines))
b_vals = bezier_values(vals)

Attaching the data file (data.txt) that I am using.

Could you please let me know what I am doing wrong?

Thank you.

Ravi
data.txt

The text was updated successfully, but these errors were encountered:

ravipra · 2021-06-17T17:54:08Z

I am using arch Linux and installed it as:

BEZIER_NO_EXTENSION=true python  -m pip install --upgrade bezier --no-binary=bezier
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: bezier in ./.local/lib/python3.9/site-packages (2021.2.12)
Requirement already satisfied: numpy>=1.20.1 in ./.local/lib/python3.9/site-packages (from bezier) (1.20.1)

dhermes · 2021-07-01T20:29:27Z

Thanks for filing! It is making me think about the trade-offs of using the (faster) VS Algorithm vs. using the (standard) de Casteljau algorithm. (A related issue cropped up in #156, fixed in 5768824 by using a float64 / c_double instead of an int32 / c_int. Using twice as many bits for a stack variable is essentially free.)

The primary issue with the implementation (the VS Algorithm) is that it is computing large powers of the input (s^k) and large binomial coefficients (n C k). For a very high degree (here the degree is 1021), this can cause underflow for s^k (when 0 < s < 1) and overflow for n C k.

I'm still digging into the specifics of how / where this fails, but I'll likely implement a fix that switches from the VS Algorithm to the de Casteljau algorithm if the degree exceeds a fixed number (probably 54).

The implementation computes binomial coefficients iteratively via:

binom_val = (binom_val * (degree - index + 1)) / index

However, this approach has 4 types of accuracy issues:

The intermediate computation binom_val * (degree - index + 1) cannot be computed exactly (i.e. requires round off)
The actual value n C k / binom_val cannot be represented exactly
The intermediate computation binom_val * (degree - index + 1) overflows
The actual value n C k / binom_val overflows

Using the script below, the following binomial coefficients are the earliest such instances of a problem for a 64-bit floating point value (double) and a 32-bit signed integer (int / int32):

Inexact computation (float64):        55 C 26 = 3560597348629860 != 3560597348629859.5
Inexact representation (float64):     57 C 25 = 9929472283517787 != 9929472283517788.0
Overflow in computation (int32):      30 C 15 = 155117520 != -131213634
Overflow in representation (int32):   34 C 16 = 2203961430 != -2091005866
Overflow in computation (float64):    1021 C 496
Overflow in representation (float64): 1030 C 500

Script to determine these values:

import fractions

import numpy as np


def computed_exactly(n):
    b_np = np.float64(1.0)
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_np = (b_np * (n + 1 - k)) / k
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def represented_exactly(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)
        b_np = np.float64(b_fr.numerator)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def computed_overflow_int32(n):
    b_np = np.int32(1)
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        try:
            b_np = (b_np * np.int32(n + 1 - k)) // np.int32(k)
        except FloatingPointError as exc:
            if exc.args != ("overflow encountered in int_scalars",):
                raise
            machine_limits_int32 = np.iinfo(np.int32)
            work_too_hard = np.int64(b_np) * np.int64(n + 1 - k)
            above = work_too_hard - machine_limits_int32.max - 1
            if not (0 < above < machine_limits_int32.max):
                raise ValueError("Overflow too large")
            overflow_numerator = above + machine_limits_int32.min
            overflow_value = overflow_numerator // np.int64(k)
            return n, k, overflow_value, b_fr.numerator

        if not isinstance(b_np, np.int32):
            raise TypeError("Non-int32", b_np)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def represented_overflow_int32(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        b_np = np.int32(b_fr.numerator)
        if b_np != b_fr:
            return n, k, b_np, b_fr.numerator

    return None


def computed_overflow(n):
    b_np = np.float64(1.0)
    for k in range(1, n):
        try:
            b_np = (b_np * (n + 1 - k)) / k
        except FloatingPointError as exc:
            if exc.args != ("overflow encountered in double_scalars",):
                raise
            return n, k

        if np.isinf(b_np):
            return n, k

    return None


def represented_overflow(n):
    b_fr = fractions.Fraction(1)
    for k in range(1, n):
        b_fr = (b_fr * (n + 1 - k)) / k
        if b_fr.denominator != 1:
            raise ValueError("Non-integer", b_fr)

        try:
            b_np = np.float64(b_fr.numerator)
        except OverflowError as exc:
            if exc.args != ("int too large to convert to float",):
                raise
            return n, k

        if np.isinf(b_np):
            return n, k

    return None


def main():
    np.seterr(all="raise")

    # Find the first instance where `n C k` cannot be computed exactly as a
    # float64.
    for n in range(1, 128):
        mismatch = computed_exactly(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Inexact computation (float64):        "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where `n C k` cannot be represented exactly as a
    # float64.
    for n in range(1, 128):
        mismatch = represented_exactly(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Inexact representation (float64):     "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where computing `n C k` overflows as an int32.
    for n in range(1, 128):
        mismatch = computed_overflow_int32(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Overflow in computation (int32):      "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where `n C k` overflows as an int32.
    for n in range(1, 128):
        mismatch = represented_overflow_int32(n)
        if mismatch is None:
            continue

        n, k, numerical, exact = mismatch
        print(
            "Overflow in representation (int32):   "
            f"{n} C {k} = {exact} != {numerical}"
        )
        break

    # Find the first instance where computing `n C k` overflows as a float64.
    for n in range(1, 2048):
        mismatch = computed_overflow(n)
        if mismatch is None:
            continue

        n, k = mismatch
        print(f"Overflow in computation (float64):    {n} C {k}")
        break

    # Find the first instance where `n C k` overflows as a float64.
    for n in range(1, 2048):
        mismatch = represented_overflow(n)
        if mismatch is None:
            continue

        n, k = mismatch
        print(f"Overflow in representation (float64): {n} C {k}")
        break


if __name__ == "__main__":
    main()

dhermes · 2021-07-02T16:42:26Z

As I dig a wee bit more, note that the 4 warnings you mentioned only show up on the FIRST instance of the warning, but if we convert all NumPy warnings to exceptions via np.seterr(all="raise"), then all 1022 inputs fail.

invalid value encountered in multiply occurs for index 0 (line 256)
underflow encountered in multiply occurs for indices 1, 2, ..., 248 (line 254)
invalid value encountered in add occurs for indices 249, 250, ..., 1018 (line 256)
overflow encountered in multiply occurs for indices 1019, 1020, 1021 (line 256)

UPDATES: (I'll post the instrumented code at some point).

For s = 0/1021, the invalid value occurs when doing binom_val * lambda2_pow (for index = 496)
For s = 1/1021 through 248/1021, the invalid value occurs when doing lambda2_pow *= lambda2 (for progressively larger index values as s increases; index = 103 up to 501)
For s = 249/1021 through 1018/1021, the invalid value occurs when doing the addition to update result += ..., i.e. the error happens when doing result + FOO on the RHS; 100% of these fail when index = 501
For s = 1019/1021 through 1021/1021, the invalid value occurs when doing the multiplication (...) * nodes[:, [index]] where the part elided is the product (with higher operator precedence) binom_val * lambda2_pow; these occur at index = 489, 485, 485

dhermes · 2021-07-06T19:30:38Z

@ravipra Another note (unrelated to my findings above), you can vectorize and massively speed up the computation. Instead of

  def f(ind):
    s_val = ind/(len(inds)-1)
    return curve.evaluate(s_val).tolist()[1][0]

  return list(map(f, inds))

create an array of the s-values and evaluate them all at once:

s_values = np.linspace(0.0, 1.0, len(inds))
evaluated = curve.evaluate_multi(s_values)

dhermes · 2021-07-07T16:13:50Z

This is the implementation I'll use. It ports quite nicely to Fortran and attempts to "optimize" the computation by using contiguous memory.

def de_casteljau(nodes, s_values):
    dimension, columns = nodes.shape
    (num_s,) = s_values.shape
    degree = columns - 1

    # TODO: Determine if it's possible to do broadcast multiply in Fortran
    #       order below to avoid allocating all of `lambda{1,2}`.
    lambda1 = np.empty((dimension, num_s, columns - 1), order="F")
    lambda2 = np.empty((dimension, num_s, columns - 1), order="F")
    workspace = np.empty((dimension, num_s, columns - 1), order="F")
    lambda1_1d = 1.0 - s_values  # Probably not worth allocating this
    for i in range(num_s):
        lambda1[:, i, :] = lambda1_1d[i]
        lambda2[:, i, :] = s_values[i]
        workspace[:, i, :] = (
            lambda1_1d[i] * nodes[:, :degree] + s_values[i] * nodes[:, 1:]
        )

    for i in range(degree - 1, 0, -1):
        workspace[:, :, :i] = (
            lambda1[:, :, :i] * workspace[:, :, :i]
            + lambda2[:, :, :i] * workspace[:, :, 1 : (i + 1)]
        )

    # NOTE: This returns an array with `evaluated.flags.owndata` false, though
    #       it is Fortran contiguous.
    return workspace[:, :, 0]

dhermes · 2021-07-08T21:32:42Z

@ravipra I have fixed this in #264 and hope to get a release out soon with the fix.

dhermes · 2021-07-08T21:52:39Z

@ravipra Here is the "fix" in action with your inputs:

In [1]: import bezier

In [2]: import numpy as np

In [3]: with open("./issue-263/data.txt", "r") as file_obj:
   ...:     raw_data = file_obj.read()
   ...: 

In [4]: data_lines = raw_data.strip().split("\n")

In [5]: values = list(map(int, data_lines))

In [6]: indicies = list(range(len(values)))

In [7]: nodes = np.asfortranarray([indicies, values])

In [8]: curve = bezier.Curve.from_nodes(nodes)

In [9]: curve
Out[9]: <Curve (degree=1021, dimension=2)>

In [10]: s_values = np.linspace(0.0, 1.0, len(indicies))

In [11]: evaluated = curve.evaluate_multi(s_values)

In [12]: evaluated.max()
Out[12]: 6650.24031866094

In [13]: evaluated.min()
Out[13]: -2374.7425879401285

In [14]: evaluated
Out[14]: 
array([[ 0.00000000e+00,  1.00000000e+00,  2.00000000e+00, ...,
         1.01900000e+03,  1.02000000e+03,  1.02100000e+03],
       [-3.00000000e+00,  3.72447859e+03,  5.40495424e+03, ...,
         4.41543898e+02,  4.33333153e+02,  4.32000000e+02]])

In [15]: bezier.__version__
Out[15]: '2021.2.13.dev1'

but also note that the de Casteljau algorithm is much less efficient here

In [16]: %time _ = bezier.hazmat.curve_helpers.evaluate_multi_vs(nodes, 1.0 - s_values, s_values)
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: overflow encountered in multiply
  result += binom_val * lambda2_pow * nodes[:, [index]]
.../site-packages/bezier/hazmat/curve_helpers.py:260: RuntimeWarning: invalid value encountered in multiply
  result *= lambda1
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: invalid value encountered in multiply
  result += binom_val * lambda2_pow * nodes[:, [index]]
.../site-packages/bezier/hazmat/curve_helpers.py:259: RuntimeWarning: invalid value encountered in add
  result += binom_val * lambda2_pow * nodes[:, [index]]
CPU times: user 37.7 ms, sys: 2.68 ms, total: 40.4 ms
Wall time: 38.7 ms

In [17]: %time _ = bezier.hazmat.curve_helpers.evaluate_multi_de_casteljau(nodes, 1.0 - s_values, s_values)
CPU times: user 5.17 s, sys: 1.51 s, total: 6.69 s
Wall time: 6.7 s

dhermes self-assigned this Jul 1, 2021

dhermes added numerical-stability type: investigating labels Jul 1, 2021

dhermes added type: enhancement and removed type: investigating labels Jul 6, 2021

dhermes changed the title ~~RuntimeWarning: invalid value encountered ...~~ [NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees Jul 6, 2021

dhermes mentioned this issue Jul 8, 2021

Fall back to de Casteljau algorithm for evaluation on high degree curves. #264

Merged

dhermes closed this as completed in #264 Jul 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

ravipra commented Jun 17, 2021

ravipra commented Jun 17, 2021

dhermes commented Jul 1, 2021 •

edited

Loading

dhermes commented Jul 2, 2021 •

edited

Loading

dhermes commented Jul 6, 2021

dhermes commented Jul 7, 2021 •

edited

Loading

dhermes commented Jul 8, 2021

dhermes commented Jul 8, 2021 •

edited

Loading

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

[NUMERICAL] VS Algorithm causes lots of overflow / underflow for high degrees #263

Comments

ravipra commented Jun 17, 2021

ravipra commented Jun 17, 2021

dhermes commented Jul 1, 2021 • edited Loading

dhermes commented Jul 2, 2021 • edited Loading

dhermes commented Jul 6, 2021

dhermes commented Jul 7, 2021 • edited Loading

dhermes commented Jul 8, 2021

dhermes commented Jul 8, 2021 • edited Loading

dhermes commented Jul 1, 2021 •

edited

Loading

dhermes commented Jul 2, 2021 •

edited

Loading

dhermes commented Jul 7, 2021 •

edited

Loading

dhermes commented Jul 8, 2021 •

edited

Loading