Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: implement __array_function__ for ExtensionArray #35032

Closed

Conversation

simonjayhawkins
Copy link
Member

xref #26380

@simonjayhawkins simonjayhawkins added Compat pandas objects compatability with Numpy or Python functions ExtensionArray Extending pandas with custom dtypes or arrays. labels Jun 27, 2020
@simonjayhawkins
Copy link
Member Author

while adding dispatch from numpy to concat_compat, it was discovered that the axis argument is ignored for EAs. changing this fixes another bug, so the relevant change has been broken off into a separate PR #35038 to be reviewed first and reduce the diff here to just the changes required to get tests passing with __array_function__

on master

>>> arr = pd.array([1, 2, 3, None], dtype="Int64")
>>> arr
<IntegerArray>
[1, 2, 3, <NA>]
Length: 4, dtype: Int64
>>>
>>> from pandas.core.dtypes.concat import concat_compat
>>>
>>> concat_compat([arr, arr])
<IntegerArray>
[1, 2, 3, <NA>, 1, 2, 3, <NA>]
Length: 8, dtype: Int64
>>>
>>> concat_compat([arr, arr], axis=1)
<IntegerArray>
[1, 2, 3, <NA>, 1, 2, 3, <NA>]
Length: 8, dtype: Int64
>>>
>>> np.concatenate([arr, arr])
array([1, 2, 3, <NA>, 1, 2, 3, <NA>], dtype=object)
>>>
>>> np.concatenate([arr, arr], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 5, in concatenate
numpy.AxisError: axis 1 is out of bounds for array of dimension 1
>>>

@TomAugspurger
Copy link
Contributor

Thanks. I think this should probably be a mixin that arrays can opt into by inheriting from it.

@simonjayhawkins
Copy link
Member Author

for out internal EAs we could opt-in for all. It maybe that we don't need to opt-in for PandasArray and just opt-in for StringArray

@simonjayhawkins
Copy link
Member Author

It maybe that we don't need to opt-in for PandasArray and just opt-in for StringArray

opt-in for a subclass of a class that doesn't opt-in seems more trouble than it's worth since would need to override some of the inherited methods.

so will add tests back for PandasArray and add ArrayFunctionMixin shortly.

@simonjayhawkins simonjayhawkins marked this pull request as ready for review July 17, 2020 17:29
@simonjayhawkins
Copy link
Member Author

simonjayhawkins commented Sep 14, 2020

tests failing


        msg = "axis 1 is out of bounds for array of dimension 1"
        with pytest.raises(np.AxisError, match=msg):
>           np.concatenate([data] * 3, axis=1)
E           Failed: DID NOT RAISE <class 'numpy.AxisError'>

the changes to make this work was broken off in #35038 but reverted in #36115

so I either need to add this changes back in here or delete this test.

@simonjayhawkins
Copy link
Member Author

Thanks. I think this should probably be a mixin that arrays can opt into by inheriting from it.

after some more thought I not so sure this is a good idea, otherwise we won't be able to use the numpy api in the codebase in the future to avoid special casing EAs in the code which was my motivation for investigating this.

of course, this would be an additional burden on 3rd party EA authors so needs further consideration, xref #32586

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions ExtensionArray Extending pandas with custom dtypes or arrays.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants