RFC: `item()` to return scalar for arrays with exactly 1 element. #815

randolf-scholz · 2024-06-20T07:58:41Z

def item(self) -> Scalar:
    """If the array contains exactly one element, return it as a scalar; otherwise raise ValueError."""

Examples:

Demo:

import pytest
import torch
import xarray as xr
import pandas as pd
import polars as pl
import numpy as np

@pytest.mark.parametrize("data", [[], [1, 2, 3]])
@pytest.mark.parametrize(
    "array_type", [torch.tensor, np.array, pd.Series, pd.Index, pl.Series, xr.DataArray]
)
def test_item_valueerror(data, array_type):
    array = array_type(data)
    with pytest.raises(ValueError):
        array.item()


@pytest.mark.parametrize(
    "array_type", [torch.tensor, np.array, pd.Series, pd.Index, pl.Series, xr.DataArray]
)
def test_item(array_type):
    array = array_type([1])
    array.item()

Currently, only torch fails, because it raises RuntimeError instead of ValueError.
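One way to make the demo above pass uniformly would be a thin wrapper that normalizes the exception type. This is only a sketch: `item_or_value_error` is a hypothetical helper name, and `FakeTensor` is a made-up stand-in that mimics the RuntimeError behaviour described above so the example runs without torch installed.

```python
def item_or_value_error(array):
    """Call array.item(), re-raising RuntimeError as ValueError."""
    try:
        return array.item()
    except RuntimeError as exc:
        # Normalize to the exception type the other libraries raise.
        raise ValueError(str(exc)) from exc


class FakeTensor:
    """Hypothetical stand-in whose item() raises RuntimeError on size != 1."""

    def __init__(self, data):
        self._data = list(data)

    def item(self):
        if len(self._data) != 1:
            raise RuntimeError("expected exactly one element")
        return self._data[0]


print(item_or_value_error(FakeTensor([7])))
```

Catching RuntimeError this broadly could also hide unrelated failures, so a real wrapper would probably want a narrower check.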

vnmabus · 2024-06-20T08:16:20Z

This was discussed in #710, along with the more general to_list, which also works for N-D arrays.

randolf-scholz · 2024-06-20T08:51:24Z

item() is a bit different from to_list, and honestly I find it confusing that a method named to_list can return something that is not a list.

rgommers · 2024-06-21T13:11:25Z

.item() is more constrained than to_list indeed, and a bit cleaner. I checked other libraries - NumPy, PyTorch, JAX and CuPy implement .item(), Dask does not. (TF doesn't have it in the docs, so probably also not - but I can't check). CuPy/JAX do the transfer to CPU if the ndarray is on GPU.

This is a minor convenience method though, since float() & co work as well. They are clearer, since they are type-stable, and they also work for Dask. The only downside is that if you want some dtype-generic implementation to return a single element, you have to write a little utility for it to call int/float/complex/bool as appropriate. Something like:

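def as_pyscalar(x):
    # Look up the array API namespace of the input
    xp = x.__array_namespace__()
    # isdtype takes a dtype, not an array
    if xp.isdtype(x.dtype, 'real floating'):
        return float(x)
    elif xp.isdtype(x.dtype, 'complex floating'):
        return complex(x)
    elif xp.isdtype(x.dtype, 'integral'):
        return int(x)
    elif xp.isdtype(x.dtype, 'bool'):
        return bool(x)
    else:
        # raise error, or handle custom/non-standard dtypes if desired
        raise TypeError(f"unsupported dtype: {x.dtype}")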

Static typing of such a function, and of .item(), would also be a little annoying as it requires overloads.

asmeurer · 2024-06-25T16:14:22Z

item also works on arrays with multiple dimensions, whereas we decided to make it so float does not.

>>> np.array([1]).item()
1
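To illustrate the multi-dimensional case specifically, here is a short NumPy sketch (the array values are made up for illustration): `item()` only cares about the number of elements, not the number of dimensions.

```python
import numpy as np

# A 2-D and a 3-D array that each hold exactly one element.
a2 = np.array([[1]])
a3 = np.array([[[2.5]]])

print(a2.ndim, a2.item())
print(a3.ndim, a3.item())

# More than one element raises ValueError regardless of shape.
try:
    np.array([[1, 2]]).item()
except ValueError as exc:
    print("ValueError:", exc)
```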

rgommers · 2024-06-27T19:30:18Z

We discussed this in a call today, and concluded that this fell into a bucket of functionality that is useful, but also easy to implement on top of what's already in the standard. In addition, there are problems with trying to add this: an item() method is hard, because it's missing in some libraries and missing methods cannot be worked around in array-api-compat. If we did this, a function would be the way to go - but since that's not present in any libraries, it'd be new - hence more work, and likely to incur resistance from array library maintainers.

Outcome:

- Create the array-api-extra package where this kind of function can live, and add it there (probably as as_pyscalar or a similarly descriptive name, not as item).
- Only reconsider adding it to the standard itself in the future if most/all array libraries have already added that function.

randolf-scholz · 2024-06-28T11:26:39Z

On a very fundamental level, I believe .item() makes no sense on DataFrame-like objects (pandas.DataFrame, polars.DataFrame, pyarrow.Table, etc.) because these are designed to represent heterogeneous data types.

From a mathematical PoV, item() acts on array-like data with homogeneous type, as a representation of the natural isomorphism V → K, when V is a 1-dimensional vector space over K.

NeilGirdhar · 2024-08-13T19:40:51Z

Is this usage guaranteed?

If so, should it be added somewhere to the specification? I looked for it here.

FWIW I also like the item method since it's all I've ever needed and it's simpler than tolist. I wonder if it should be on the array namespace rather than the array: (def item(x: Array, /) -> complex | bool) since it can be implemented using the array's public interface. (This is a common test in OO design for what should be a method versus a bare function.)

asmeurer · 2024-08-13T22:34:57Z

Yes, __float__ and so on are guaranteed (modulo the "lazy" note). See https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__float__.html#array_api.array.__float__. Though Ralf's helper should also include an if x.ndim != 1 or x.size != 1: raise ValueError check.

rgommers added the "API extension" label (adds new functions or objects to the API) · Jun 21, 2024

kgryte added the "RFC" (request for comments; feature requests and proposed changes) and "Needs Discussion" (needs further discussion) labels · Jun 21, 2024

asmeurer mentioned this issue Sep 3, 2024

RFC: add materialize to materialize lazy arrays #839

Open
