
fix: ListArray slicing on GPU #3248

Merged
merged 20 commits into from
Sep 26, 2024

Collaborator

@ianna ianna commented Sep 18, 2024

No description provided.

@ianna ianna linked an issue Sep 18, 2024 that may be closed by this pull request
@ianna ianna marked this pull request as draft September 18, 2024 18:37
Collaborator Author

@ianna ianna left a comment

@jpivarski - found the issue :-)
The numpy.ndarray head that is 0 is not passed to the GPU kernel!

in

elif is_integer_like(head):
    assert advanced is None
    nexthead, nexttail = ak._slicing.head_tail(tail)
    lenstarts = self._starts.length
    nextcarry = ak.index.Index64.empty(lenstarts, self._backend.index_nplike)
    assert (
        nextcarry.nplike is self._backend.index_nplike
        and self._starts.nplike is self._backend.index_nplike
        and self._stops.nplike is self._backend.index_nplike
    )
    self._maybe_index_error(
        self._backend[
            "awkward_ListArray_getitem_next_at",
            nextcarry.dtype.type,
            self._starts.dtype.type,
            self._stops.dtype.type,
        ](
            nextcarry.data,
            self._starts.data,
            self._stops.data,
            lenstarts,
            head,
        ),
        slicer=head,

In this case head can be an array. It can be regularized to the proper backend, but then the GPU kernel needs to be updated to handle a 'cp.array(0)'.
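
To illustrate the kind of mismatch described above, here is a minimal sketch that uses NumPy as a stand-in for CuPy (an assumption, so that it runs without a GPU). The helper name `normalize_at` is hypothetical and is not part of Awkward Array. It only shows that a zero-dimensional array holding 0 is an array object rather than a Python integer, and one way it could be coerced before being handed to a kernel that expects a scalar.

```python
import numpy as np


def normalize_at(head):
    """Return `head` as a plain Python int (hypothetical helper).

    Accepts Python ints, NumPy integer scalars, and zero-dimensional
    integer arrays such as np.array(0).
    """
    if isinstance(head, np.ndarray):
        if head.ndim != 0:
            raise TypeError("expected a scalar or zero-dimensional array")
        if not np.issubdtype(head.dtype, np.integer):
            raise TypeError("expected an integer dtype")
        return int(head.item())
    if isinstance(head, (int, np.integer)) and not isinstance(head, bool):
        return int(head)
    raise TypeError(f"unsupported index type: {type(head).__name__}")


head = np.array(0)
print(type(head).__name__, head.ndim)  # an ndarray with zero dimensions
print(normalize_at(head))              # a plain Python int
```

Coercing once at the Python level keeps the kernel signature unchanged, which may or may not be the approach taken in this PR.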
@ianna ianna marked this pull request as ready for review September 19, 2024 13:21
@ianna ianna marked this pull request as ready for review September 19, 2024 15:04
Collaborator Author

@ianna ianna left a comment

@martindurant - as discussed at today's awkward-uproot meeting, this is a temporary fix. I will rewrite the way we handle this CUDA kernel, wrapping it in a Python function and handling the type of 'at' correctly.
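
A rough idea of what "wrapping the kernel in a Python function" could look like is sketched below with NumPy instead of a CUDA kernel. The function name `getitem_next_at` and its logic are assumptions for illustration (wrap a negative index per list, check bounds, then add the list start); this is not the implementation planned for Awkward Array.

```python
import numpy as np


def getitem_next_at(starts, stops, at):
    """Pick element `at` from each list described by starts/stops.

    Hypothetical wrapper: `at` is coerced to a Python int up front, so a
    zero-dimensional array such as np.array(0) is handled like the integer 0.
    Returns the positions in the content buffer to carry.
    """
    at = int(at)  # works for ints, NumPy scalars, and 0-d integer arrays
    starts = np.asarray(starts, dtype=np.int64)
    stops = np.asarray(stops, dtype=np.int64)
    lengths = stops - starts
    # Negative indices count from the end of each list.
    regular_at = at + lengths if at < 0 else np.full_like(lengths, at)
    if np.any((regular_at < 0) | (regular_at >= lengths)):
        raise IndexError("index out of range")
    return starts + regular_at


starts = [0, 3, 5]
stops = [3, 5, 9]
print(getitem_next_at(starts, stops, np.array(0)))  # first element of each list
print(getitem_next_at(starts, stops, -1))           # last element of each list
```

Because the coercion happens before any array work, the same wrapper accepts a Python integer and a zero-dimensional array without special cases further down.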

@martindurant
Contributor

+1, please let me know when this is out.

@martindurant
Contributor

Any chance of a few more test cases like [:, 1:], [:, :1], [:, 1::2], [:, ::-1]?
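
As a reference for what such tests could assert, the sketch below computes the expected results for those four slices with plain Python lists. The sample data and the helper name `slice_inner` are hypothetical. A GPU test could compare the CUDA-backend result against these values.

```python
def slice_inner(rows, inner):
    """Apply the slice `inner` to every sublist, mimicking rows[:, inner]."""
    return [row[inner] for row in rows]


data = [[1, 2, 3], [], [4, 5], [6, 7, 8, 9]]  # hypothetical sample data

cases = {
    "[:, 1:]": slice(1, None),
    "[:, :1]": slice(None, 1),
    "[:, 1::2]": slice(1, None, 2),
    "[:, ::-1]": slice(None, None, -1),
}

for label, inner in cases.items():
    print(label, slice_inner(data, inner))
```

Including an empty sublist in the data exercises the zero-length case, which is easy to miss when only regular-looking inputs are tested.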

@ianna ianna marked this pull request as draft September 20, 2024 14:39
Collaborator Author

@ianna ianna left a comment

@jpivarski - Please check when you have time. I think using roles is appropriate in this case. I cannot find tests where the awkward_ListArray_getitem_jagged_expand is used yet. It looks like it should be invoked when a ListArray is sliced with a ListOffsetArray. That could be a different PR, though. Thanks!

dev/generate-kernel-signatures.py (resolved)
src/awkward/contents/regulararray.py (resolved)
@ianna ianna marked this pull request as ready for review September 20, 2024 15:35
@ianna ianna changed the title fix: slicing on GPU fix: ListArray slicing on GPU Sep 20, 2024
@jpivarski
Member

I cannot find tests where the awkward_ListArray_getitem_jagged_expand is used yet.

I used the trick of adding

        if index[0] == "awkward_ListArray_getitem_jagged_expand":
            raise Exception("HERE")

to

    def __getitem__(self, index: KernelKeyType) -> NumpyKernel:
        return NumpyKernel(awkward_cpp.cpu_kernels.kernel[index], index)

and found exactly 1 test where it's used: tests/test_0111_jagged_and_masked_getitem.py in test_double_jagged.

    def test_double_jagged():
        array = ak.highlevel.Array(
            [[[0, 1, 2, 3], [4, 5]], [[6, 7, 8], [9, 10, 11, 12, 13]]], check_valid=True
        ).layout
        array2 = ak.highlevel.Array(
            [[[2, 1, 0], [-1]], [[-1, -2, -3], [2, 1, 1, 3]]], check_valid=True
        ).layout
    
        assert to_list(array[array2]) == [
            [[2, 1, 0], [5]],
            [[8, 7, 6], [11, 10, 10, 12]],
        ]
        assert array.to_typetracer()[array2].form == array[array2].form
    
        content = ak.operations.from_iter(
            [[0, 1, 2, 3], [4, 5], [6, 7, 8], [9, 10, 11, 12, 13]], highlevel=False
        )
        regulararray = ak.contents.RegularArray(content, 2, zeros_length=0)
    
        array1 = ak.highlevel.Array([[2, 1, 0], [-1]], check_valid=True).layout
    
        assert to_list(regulararray[:, array1]) == [[[2, 1, 0], [5]], [[8, 7, 6], [13]]]
        assert regulararray.to_typetracer()[:, array1].form == regulararray[:, array1].form
        assert to_list(regulararray[1:, array1]) == [[[8, 7, 6], [13]]]
        assert (
            regulararray.to_typetracer()[1:, array1].form == regulararray[1:, array1].form
        )
    
        offsets = ak.index.Index64(np.array([0, 2, 4], dtype=np.int64))
        listoffsetarray = ak.contents.ListOffsetArray(offsets, content)
>       assert to_list(listoffsetarray[:, array1]) == [
            [[2, 1, 0], [5]],
            [[8, 7, 6], [13]],
        ]

array      = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte...2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
array1     = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>
array2     = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte... [ 2  1  0 -1 -1 -2 -3  2  1  1  3]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
content    = <ListOffsetArray len='4'>
    <offsets><Index dtype='int64' len='5'>
        [ 0  4  6  9 14]
    </Index></offsets>
 ...pe='int64' len='14'>
        [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13]
    </NumpyArray></content>
</ListOffsetArray>
listoffsetarray = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte...2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
offsets    = <Index dtype='int64' len='3'>
    [0 2 4]
</Index>
regulararray = <RegularArray size='2' len='2'>
    <content><ListOffsetArray len='4'>
        <offsets><Index dtype='int64' len='5'>
...1  2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</RegularArray>

tests/test_0111_jagged_and_masked_getitem.py:711: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/awkward/contents/content.py:512: in __getitem__
    return self._getitem(where)
        self       = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte...2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
        where      = (slice(None, None, None), <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>)
src/awkward/contents/content.py:557: in _getitem
    out = next._getitem_next(nextwhere[0], nextwhere[1:], None)
        backend    = <awkward._backends.numpy.NumpyBackend object at 0x783910736210>
        items      = [slice(None, None, None), <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>]
        next       = <RegularArray size='2' len='1'>
    <content><ListOffsetArray len='2'>
        <offsets><Index dtype='int64' len='3'>
...          </NumpyArray></content>
        </ListOffsetArray></content>
    </ListOffsetArray></content>
</RegularArray>
        nextwhere  = (slice(None, None, None), <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>)
        self       = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte...2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
        this       = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 2 4]
    </Index></offsets>
    <conte...2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListOffsetArray>
        where      = (slice(None, None, None), <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>)
src/awkward/contents/regulararray.py:519: in _getitem_next
    nextcontent._getitem_next(nexthead, nexttail, advanced),
        advanced   = None
        head       = slice(None, None, None)
        index_nplike = <awkward._nplikes.numpy.Numpy object at 0x783910703f50>
        nextcarry  = <Index dtype='int64' len='2'>[0 1]</Index>
        nextcontent = <ListArray len='2'>
    <starts><Index dtype='int64' len='2'>
        [0 2]
    </Index></starts>
    <stops><Index dt...0  1  2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListArray>
        nexthead   = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>
        nextsize   = 2
        nexttail   = ()
        self       = <RegularArray size='2' len='1'>
    <content><ListOffsetArray len='2'>
        <offsets><Index dtype='int64' len='3'>
...          </NumpyArray></content>
        </ListOffsetArray></content>
    </ListOffsetArray></content>
</RegularArray>
        start      = 0
        step       = 1
        stop       = 2
        tail       = (<ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>,)
src/awkward/contents/listarray.py:1026: in _getitem_next
    self._backend[
        advanced   = None
        head       = <ListOffsetArray len='2'>
    <offsets><Index dtype='int64' len='3'>
        [0 3 4]
    </Index></offsets>
    <content><NumpyArray dtype='int64' len='4'>[ 2  1  0 -1]</NumpyArray></content>
</ListOffsetArray>
        headlength = 2
        length     = 2
        multistarts = <Index dtype='int64' len='4'>[ 8  7  6 13]</Index>
        multistops = <Index dtype='int64' len='4'>[ 8  7  6 13]</Index>
        nextcarry  = <Index dtype='int64' len='4'>[3 4 3 4]</Index>
        self       = <ListArray len='2'>
    <starts><Index dtype='int64' len='2'>
        [0 2]
    </Index></starts>
    <stops><Index dt...0  1  2  3  4  5  6  7  8  9 10 11 12 13]
        </NumpyArray></content>
    </ListOffsetArray></content>
</ListArray>
        singleoffsets = <Index dtype='int64' len='3'>[0 3 4]</Index>
        tail       = ()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <awkward._backends.numpy.NumpyBackend object at 0x783910736210>
index = ('awkward_ListArray_getitem_jagged_expand', <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, ...)

    def __getitem__(self, index: KernelKeyType) -> NumpyKernel:
        if index[0] == "awkward_ListArray_getitem_jagged_expand":
>           raise Exception("HERE")
E           Exception: HERE

index      = ('awkward_ListArray_getitem_jagged_expand', <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>, ...)
self       = <awkward._backends.numpy.NumpyBackend object at 0x783910736210>

src/awkward/_backends/numpy.py:37: Exception
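
The trick used to produce the traceback above can be sketched in isolation. The class `TracingKernels` and the stand-in kernels below are hypothetical, not Awkward Array code; they only show the idea of raising inside the kernel lookup so that the test runner reports every call path that reaches a given kernel.

```python
class TracingKernels:
    """Dict-like kernel registry that raises when a watched kernel is looked up."""

    def __init__(self, kernels, watch):
        self._kernels = kernels  # maps (name, *types) tuples to callables
        self._watch = watch      # kernel name to trap

    def __getitem__(self, index):
        if index[0] == self._watch:
            # Raising here makes the test runner print the full call path.
            raise RuntimeError(f"kernel {self._watch!r} was requested")
        return self._kernels[index]


# Hypothetical registry with two stand-in kernels.
kernels = {
    ("kernel_a", int): lambda x: x + 1,
    ("kernel_b", int): lambda x: x * 2,
}
traced = TracingKernels(kernels, watch="kernel_b")

print(traced[("kernel_a", int)](1))
try:
    traced[("kernel_b", int)]
except RuntimeError as err:
    print("trapped:", err)
```

Running the full test suite with such a trap in place lists the tests that exercise the watched kernel, since each of them fails at the lookup.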

@ianna
Collaborator Author

ianna commented Sep 23, 2024

I cannot find tests where the awkward_ListArray_getitem_jagged_expand is used yet.

I used the trick of adding

        if index[0] == "awkward_ListArray_getitem_jagged_expand":
            raise Exception("HERE")

to

    def __getitem__(self, index: KernelKeyType) -> NumpyKernel:
        return NumpyKernel(awkward_cpp.cpu_kernels.kernel[index], index)

and found exactly 1 test where it's used: tests/test_0111_jagged_and_masked_getitem.py in test_double_jagged.

Thanks! Yes, this test is on

The CUDA tests are implemented for this and the results are equivalent to the ones produced on CPU.

@ianna ianna marked this pull request as draft September 23, 2024 14:27
@ianna ianna marked this pull request as ready for review September 24, 2024 09:52
@ianna ianna self-assigned this Sep 26, 2024
Member

@jpivarski jpivarski left a comment

We talked about this at length in our meeting, and I'm on board with making this kind of change.

We also said that we'll need an automated check across all CUDA kernels (implemented once, where the kernels are built) to discover these C type errors in the future: int vs int*. (This is a missing feature of cupy.RawKernel!)
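
One possible shape for such a check is sketched below. It is an illustration under stated assumptions, not the check the maintainers plan: the C signature string is a simplified, hypothetical stand-in, the function names `parse_params` and `check_args` are invented here, and NumPy arrays stand in for CuPy arrays so the example runs without a GPU. The idea is to parse which parameters are pointers and reject a call that passes an array for a scalar parameter or a scalar for a pointer parameter.

```python
import re

import numpy as np

# Simplified, hypothetical kernel signature used only for this example.
SIGNATURE = (
    "void getitem_next_at(int64_t* tocarry, const int64_t* fromstarts, "
    "const int64_t* fromstops, int64_t lenstarts, int64_t at)"
)


def parse_params(signature):
    """Return (name, is_pointer) pairs parsed from a C parameter list."""
    inside = signature[signature.index("(") + 1 : signature.rindex(")")]
    params = []
    for param in inside.split(","):
        match = re.fullmatch(r"\s*(?:const\s+)?\w+\s*(\*?)\s*(\w+)\s*", param)
        if match is None:
            raise ValueError(f"cannot parse parameter: {param!r}")
        params.append((match.group(2), match.group(1) == "*"))
    return params


def check_args(signature, args):
    """Raise TypeError if an array is passed for a scalar parameter or vice versa."""
    params = parse_params(signature)
    if len(params) != len(args):
        raise TypeError(f"expected {len(params)} arguments, got {len(args)}")
    for (name, is_pointer), arg in zip(params, args):
        is_array = isinstance(arg, np.ndarray)
        if is_pointer and not is_array:
            raise TypeError(f"{name}: pointer parameter needs an array")
        if not is_pointer and is_array:
            raise TypeError(f"{name}: scalar parameter got an array")


buffers = [np.zeros(2, dtype=np.int64) for _ in range(3)]
check_args(SIGNATURE, [*buffers, 2, 0])  # passes silently
try:
    check_args(SIGNATURE, [*buffers, 2, np.array(0)])
except TypeError as err:
    print("rejected:", err)
```

A check like this would run once where the kernels are built or dispatched, so a zero-dimensional array passed for `at` fails loudly instead of being misread by the kernel.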

@ianna ianna merged commit e946646 into main Sep 26, 2024
44 checks passed
@ianna ianna deleted the 3213-bad-cuda-kernel-during-slice branch September 26, 2024 16:01
@martindurant
Contributor

I am still getting a breakage with selecting by integer, as opposed to slice (and hereafter, results on the GPU are garbage). I haven't managed to make a reproducer yet!

@ianna
Collaborator Author

ianna commented Sep 30, 2024

I am still getting a breakage with selecting by integer, as opposed to slice (and hereafter, results on the GPU are garbage). I haven't managed to make a reproducer yet!

Hmm... I think what we discussed with @jpivarski might be happening.

Successfully merging this pull request may close these issues.

Bad CUDA kernel during slice