Three test_aggregation[int-method_median] tests failing #3779

Closed
ArchangeGabriel opened this issue Feb 18, 2020 · 2 comments · Fixed by #3787

Comments

@ArchangeGabriel
Contributor

Follow-up of #3777.

The three failing tests seem to fail because dask_array is None, which is likely the same kind of issue as #3777: dask-dependent tests run while dask is not available. The other printed error is strange to me, because numpy is at version 1.18.1 on this system.

_______________ TestVariable.test_aggregation[int-method_median] _______________

values = array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]), axis = None, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f3927bdbe50>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
>           return func(values, axis=axis, **kwargs)

xarray/core/duck_array_ops.py:307: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]),), kwargs = {'axis': None}
dispatch_args = (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]),)

    def f(*args, **kwargs):
        if list_of_args:
            dispatch_args = args[0]
        else:
            dispatch_args = args[array_args]
>       if any(isinstance(a, dask_array.Array) for a in dispatch_args):

xarray/core/duck_array_ops.py:40: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

.0 = <tuple_iterator object at 0x7f3927966f70>

>   if any(isinstance(a, dask_array.Array) for a in dispatch_args):
E   AttributeError: 'NoneType' object has no attribute 'Array'

xarray/core/duck_array_ops.py:40: AttributeError

During handling of the above exception, another exception occurred:

self = <xarray.tests.test_units.TestVariable object at 0x7f3927966640>
func = method_median, dtype = <class 'int'>

    @pytest.mark.parametrize(
        "func",
        (
            method("all"),
            method("any"),
            method("argmax"),
            method("argmin"),
            method("argsort"),
            method("cumprod"),
            method("cumsum"),
            method("max"),
            method("mean"),
            method("median"),
            method("min"),
            pytest.param(
                method("prod"),
                marks=pytest.mark.xfail(reason="not implemented by pint"),
            ),
            method("std"),
            method("sum"),
            method("var"),
        ),
        ids=repr,
    )
    def test_aggregation(self, func, dtype):
        array = np.linspace(0, 1, 10).astype(dtype) * (
            unit_registry.m if func.name != "cumprod" else unit_registry.dimensionless
        )
        variable = xr.Variable("x", array)
    
        units = extract_units(func(array))
>       expected = attach_units(func(strip_units(variable)), units)

xarray/tests/test_units.py:1389: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
xarray/tests/test_units.py:374: in __call__
    return func(*all_args, **all_kwargs)
xarray/core/common.py:46: in wrapped_func
    return self.reduce(func, dim, axis, skipna=skipna, **kwargs)
xarray/core/variable.py:1537: in reduce
    data = func(input_data, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

values = array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]), axis = None, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f3927bdbe50>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
            return func(values, axis=axis, **kwargs)
        except AttributeError:
            if isinstance(values, dask_array_type):
                try:  # dask/dask#3133 dask sometimes needs dtype argument
                    # if func does not accept dtype, then raises TypeError
                    return func(values, axis=axis, dtype=values.dtype, **kwargs)
                except (AttributeError, TypeError):
                    msg = "%s is not yet implemented on dask arrays" % name
            else:
                msg = (
                    "%s is not available with skipna=False with the "
                    "installed version of numpy; upgrade to numpy 1.12 "
                    "or newer to use skipna=True or skipna=None" % name
                )
>           raise NotImplementedError(msg)
E           NotImplementedError: median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None

xarray/core/duck_array_ops.py:321: NotImplementedError
______________ TestDataArray.test_aggregation[int-method_median] _______________

values = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), axis = None, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f39286dbc10>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
>           return func(values, axis=axis, **kwargs)

xarray/core/duck_array_ops.py:307: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),), kwargs = {'axis': None}
dispatch_args = (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)

    def f(*args, **kwargs):
        if list_of_args:
            dispatch_args = args[0]
        else:
            dispatch_args = args[array_args]
>       if any(isinstance(a, dask_array.Array) for a in dispatch_args):

xarray/core/duck_array_ops.py:40: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

.0 = <tuple_iterator object at 0x7f39288ad880>

>   if any(isinstance(a, dask_array.Array) for a in dispatch_args):
E   AttributeError: 'NoneType' object has no attribute 'Array'

xarray/core/duck_array_ops.py:40: AttributeError

During handling of the above exception, another exception occurred:

self = <xarray.tests.test_units.TestDataArray object at 0x7f39288ad190>
func = method_median, dtype = <class 'int'>

    @pytest.mark.parametrize(
        "func",
        (
            pytest.param(
                function("all"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            pytest.param(
                function("any"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            function("argmax"),
            function("argmin"),
            function("max"),
            function("mean"),
            pytest.param(
                function("median"),
                marks=pytest.mark.xfail(reason="not implemented by xarray"),
            ),
            function("min"),
            pytest.param(
                function("prod"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            function("sum"),
            function("std"),
            function("var"),
            function("cumsum"),
            pytest.param(
                function("cumprod"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            pytest.param(
                method("all"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            pytest.param(
                method("any"),
                marks=pytest.mark.xfail(reason="not implemented by pint yet"),
            ),
            method("argmax"),
            method("argmin"),
            method("max"),
            method("mean"),
            method("median"),
            method("min"),
            pytest.param(
                method("prod"),
                marks=pytest.mark.xfail(
                    reason="comparison of quantity with ndarrays in nanops not implemented"
                ),
            ),
            method("sum"),
            method("std"),
            method("var"),
            method("cumsum"),
            pytest.param(
                method("cumprod"),
                marks=pytest.mark.xfail(reason="pint does not implement cumprod yet"),
            ),
        ),
        ids=repr,
    )
    def test_aggregation(self, func, dtype):
        array = np.arange(10).astype(dtype) * (
            unit_registry.m if func.name != "cumprod" else unit_registry.dimensionless
        )
        data_array = xr.DataArray(data=array, dims="x")
    
        # units differ based on the applied function, so we need to
        # first compute the units
        units = extract_units(func(array))
>       expected = attach_units(func(strip_units(data_array)), units)

xarray/tests/test_units.py:2226: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
xarray/tests/test_units.py:374: in __call__
    return func(*all_args, **all_kwargs)
xarray/core/common.py:46: in wrapped_func
    return self.reduce(func, dim, axis, skipna=skipna, **kwargs)
xarray/core/dataarray.py:2235: in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, keepdims, **kwargs)
xarray/core/variable.py:1537: in reduce
    data = func(input_data, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

values = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), axis = None, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f39286dbc10>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
            return func(values, axis=axis, **kwargs)
        except AttributeError:
            if isinstance(values, dask_array_type):
                try:  # dask/dask#3133 dask sometimes needs dtype argument
                    # if func does not accept dtype, then raises TypeError
                    return func(values, axis=axis, dtype=values.dtype, **kwargs)
                except (AttributeError, TypeError):
                    msg = "%s is not yet implemented on dask arrays" % name
            else:
                msg = (
                    "%s is not available with skipna=False with the "
                    "installed version of numpy; upgrade to numpy 1.12 "
                    "or newer to use skipna=True or skipna=None" % name
                )
>           raise NotImplementedError(msg)
E           NotImplementedError: median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None

xarray/core/duck_array_ops.py:321: NotImplementedError
_______________ TestDataset.test_aggregation[int-method_median] ________________

values = <Quantity([0 0 0 0 0 0 0 0 0 1], 'pascal')>, axis = 0, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f392619b820>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
>           return func(values, axis=axis, **kwargs)

xarray/core/duck_array_ops.py:307: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<Quantity([0 0 0 0 0 0 0 0 0 1], 'pascal')>,), kwargs = {'axis': 0}
dispatch_args = (<Quantity([0 0 0 0 0 0 0 0 0 1], 'pascal')>,)

    def f(*args, **kwargs):
        if list_of_args:
            dispatch_args = args[0]
        else:
            dispatch_args = args[array_args]
>       if any(isinstance(a, dask_array.Array) for a in dispatch_args):

xarray/core/duck_array_ops.py:40: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

.0 = <tuple_iterator object at 0x7f39269995e0>

>   if any(isinstance(a, dask_array.Array) for a in dispatch_args):
E   AttributeError: 'NoneType' object has no attribute 'Array'

xarray/core/duck_array_ops.py:40: AttributeError

During handling of the above exception, another exception occurred:

self = <xarray.tests.test_units.TestDataset object at 0x7f3927adf880>
func = method_median, dtype = <class 'int'>

    @pytest.mark.parametrize(
        "func",
        (
            pytest.param(
                function("all"),
                marks=pytest.mark.xfail(reason="not implemented by pint"),
            ),
            pytest.param(
                function("any"),
                marks=pytest.mark.xfail(reason="not implemented by pint"),
            ),
            function("argmax"),
            function("argmin"),
            function("max"),
            function("min"),
            function("mean"),
            pytest.param(
                function("median"),
                marks=pytest.mark.xfail(
                    reason="np.median does not work with dataset yet"
                ),
            ),
            function("sum"),
            pytest.param(
                function("prod"),
                marks=pytest.mark.xfail(reason="not implemented by pint"),
            ),
            function("std"),
            function("var"),
            function("cumsum"),
            pytest.param(
                function("cumprod"),
                marks=pytest.mark.xfail(reason="fails within xarray"),
            ),
            pytest.param(
                method("all"), marks=pytest.mark.xfail(reason="not implemented by pint")
            ),
            pytest.param(
                method("any"), marks=pytest.mark.xfail(reason="not implemented by pint")
            ),
            method("argmax"),
            method("argmin"),
            method("max"),
            method("min"),
            method("mean"),
            method("median"),
            method("sum"),
            pytest.param(
                method("prod"),
                marks=pytest.mark.xfail(reason="not implemented by pint"),
            ),
            method("std"),
            method("var"),
            method("cumsum"),
            pytest.param(
                method("cumprod"), marks=pytest.mark.xfail(reason="fails within xarray")
            ),
        ),
        ids=repr,
    )
    def test_aggregation(self, func, dtype):
        unit_a = (
            unit_registry.Pa if func.name != "cumprod" else unit_registry.dimensionless
        )
        unit_b = (
            unit_registry.kg / unit_registry.m ** 3
            if func.name != "cumprod"
            else unit_registry.dimensionless
        )
        a = xr.DataArray(data=np.linspace(0, 1, 10).astype(dtype) * unit_a, dims="x")
        b = xr.DataArray(data=np.linspace(-1, 0, 10).astype(dtype) * unit_b, dims="x")
        x = xr.DataArray(data=np.arange(10).astype(dtype) * unit_registry.m, dims="x")
        y = xr.DataArray(
            data=np.arange(10, 20).astype(dtype) * unit_registry.s, dims="x"
        )
    
        ds = xr.Dataset(data_vars={"a": a, "b": b}, coords={"x": x, "y": y})
    
>       actual = func(ds)

xarray/tests/test_units.py:3733: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
xarray/tests/test_units.py:374: in __call__
    return func(*all_args, **all_kwargs)
xarray/core/common.py:83: in wrapped_func
    return self.reduce(
xarray/core/dataset.py:4230: in reduce
    variables[name] = var.reduce(
xarray/core/variable.py:1535: in reduce
    data = func(input_data, axis=axis, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

values = <Quantity([0 0 0 0 0 0 0 0 0 1], 'pascal')>, axis = 0, skipna = None
kwargs = {}, func = <function _dask_or_eager_func.<locals>.f at 0x7f392619b820>
msg = 'median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None'

    def f(values, axis=None, skipna=None, **kwargs):
        if kwargs.pop("out", None) is not None:
            raise TypeError(f"`out` is not valid for {name}")
    
        values = asarray(values)
    
        if coerce_strings and values.dtype.kind in "SU":
            values = values.astype(object)
    
        func = None
        if skipna or (skipna is None and values.dtype.kind in "cfO"):
            nanname = "nan" + name
            func = getattr(nanops, nanname)
        else:
            func = _dask_or_eager_func(name, dask_module=dask_module)
    
        try:
            return func(values, axis=axis, **kwargs)
        except AttributeError:
            if isinstance(values, dask_array_type):
                try:  # dask/dask#3133 dask sometimes needs dtype argument
                    # if func does not accept dtype, then raises TypeError
                    return func(values, axis=axis, dtype=values.dtype, **kwargs)
                except (AttributeError, TypeError):
                    msg = "%s is not yet implemented on dask arrays" % name
            else:
                msg = (
                    "%s is not available with skipna=False with the "
                    "installed version of numpy; upgrade to numpy 1.12 "
                    "or newer to use skipna=True or skipna=None" % name
                )
>           raise NotImplementedError(msg)
E           NotImplementedError: median is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None

xarray/core/duck_array_ops.py:321: NotImplementedError

However, I’m not very knowledgeable about all this, so I’ll defer to you to find the root cause.

@keewis
Collaborator

keewis commented Feb 19, 2020

Thanks for reporting. We don't seem to test much without dask installed.

These fail because _dask_or_eager_func uses dask_array.Array for instance checks instead of the already imported dask_array_type. Also, we should probably go through dask_array_compat and replace the references to da.Array (where da can be None) with dask_array_type.
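
To illustrate the difference, here is a minimal sketch rather than the actual xarray code. It assumes that without dask the module reference is None and dask_array_type falls back to an empty tuple. The helper names (has_dask_fragile, has_dask_safe, middle_with_broad_except) are hypothetical. It also shows how a broad `except AttributeError` can turn the unrelated failure into the misleading numpy message seen in the traceback.

```python
# Simulate an environment without dask (assumption: the optional module
# reference is None and the fallback type tuple is empty).
dask_array = None
dask_array_type = ()


def has_dask_fragile(dispatch_args):
    # Same shape as the failing check: attribute access on None raises
    # AttributeError before isinstance is ever evaluated.
    return any(isinstance(a, dask_array.Array) for a in dispatch_args)


def has_dask_safe(dispatch_args):
    # isinstance against an empty tuple is simply False, so nothing raises.
    return any(isinstance(a, dask_array_type) for a in dispatch_args)


def middle_with_broad_except(check, values):
    # Hypothetical stand-in for the wrapper in the traceback: the broad
    # `except AttributeError` hides where the error really came from.
    try:
        if check((values,)):
            raise RuntimeError("dask path is not simulated in this sketch")
        # Placeholder reduction: pick the element at the middle index.
        return sorted(values)[len(values) // 2]
    except AttributeError:
        raise NotImplementedError("misleading message about the numpy version")


values = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(middle_with_broad_except(has_dask_safe, values))
try:
    middle_with_broad_except(has_dask_fragile, values)
except NotImplementedError as err:
    print("masked:", err)
```

With the tuple-based check the call goes through. With the attribute-based check the AttributeError is swallowed and re-raised as NotImplementedError, which would explain why the numpy version hint shows up even with numpy 1.18.1.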

@ArchangeGabriel
Contributor Author

Well, the thing is we don’t package dask (yet!), so I tested with all we had (i.e. 7 of the 20 existing optdeps), which is a configuration quite different from your usual test matrix. ;)

I might package dask in the future, but there are too many missing dependencies for it in our repos for now, and I don’t currently have enough time to package each of them and their own dependencies, etc.
