Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] BUG: Empty slicing an array backwards beyond the start should be empty #40642

Closed
mroeschke opened this issue Mar 18, 2024 · 8 comments
Closed

Comments

@mroeschke
Copy link
Contributor

mroeschke commented Mar 18, 2024

Describe the bug, including details regarding any error messages, version, and platform.

I think post #39240, slicing with an empty slice under the same conditions returns the first element when the result should be empty

In [14]: import pyarrow as pa
    ...: print(pa.__version__)
    ...: array = pa.chunked_array([list("abcde")])
    ...: array[slice(-6, -6, -1)]
16.0.0.dev318
Out[14]: 
<pyarrow.lib.ChunkedArray object at 0x11a1ce4d0>
[
  [
    "a"
  ]
]

In [15]: list("abcde")[slice(-6, -6, -1)]
Out[15]: []

cc @LucasG0 @jorisvandenbossche

Component(s)

Python

@jorisvandenbossche
Copy link
Member

Thanks for the catch! Yet another annoying corner case ..

I was wondering if there isn't some existing functionality to "normalize" slice objects, and https://stackoverflow.com/questions/6246084/sanitize-python-slice pointed to https://docs.python.org/3/reference/datamodel.html#slice.indices

We could test if that helps for us. Although for the case above it still gives negative start/stop values .. :

In [3]: slice(-6, -6, -1).indices(5)
Out[3]: (-1, -1, -1)

@jorisvandenbossche
Copy link
Member

(and so this case actually worked before, so going to mark as critical for 16.0)

@LucasG0
Copy link
Contributor

LucasG0 commented Mar 20, 2024

Using slices.indices seems to be a good call, thanks @jorisvandenbossche. I do not think still having negative values is an issue here, I created #40682.

@kou kou changed the title BUG: Empty slicing an array backwards beyond the start should be empty [Python] BUG: Empty slicing an array backwards beyond the start should be empty Mar 20, 2024
@raulcd
Copy link
Member

raulcd commented Apr 9, 2024

I am moving this to 17.0.0. Let me know if this is required for the 16.0.0 release.

@raulcd raulcd modified the milestones: 16.0.0, 17.0.0 Apr 9, 2024
@jorisvandenbossche jorisvandenbossche modified the milestones: 17.0.0, 16.0.0 Apr 9, 2024
@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Apr 9, 2024

Moved this back to 16.0, because it is a regression that we caused ourselves that for now only lives on main, so we can prevent this gets released. Will look into updating/merging the PR.

jorisvandenbossche added a commit that referenced this issue Apr 9, 2024
…is now empty (#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: #40642
* GitHub Issue: #38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@jorisvandenbossche
Copy link
Member

Resolved by #40682

@jorisvandenbossche
Copy link
Member

@raulcd not fully familiar with the cherry-picking script, but the PR was pointing to an older already closed issues, and I only renamed it to point to this issue after merging with the merge script. Just mentioning in case that might affect it getting picked up by the backporting.

raulcd pushed a commit that referenced this issue Apr 10, 2024
…is now empty (#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: #40642
* GitHub Issue: #38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@raulcd
Copy link
Member

raulcd commented Apr 10, 2024

@jorisvandenbossche this has been cherry-picked to the 16.0.0 maintenance branch. The cherry-pick script picked it up without further action on my side.

verma-kartik pushed a commit to verma-kartik/arrow that referenced this issue Apr 11, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue Apr 15, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants