-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++ refactoring: starting _getitem_next. #1031
Conversation
@ioanaif As a hint for how to go about translating and debugging, I showed some examples of my own debugging: 86377cc adds debugging code to compare the Python with the original C++ and 3878e10 is the subsequent commit that removes that debugging code (not sure which would be easier to read). Once the C++ codebase is compiled, adding |
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
I should have done this when you first started! @all-contributors please add @ioanaif for code, test |
I've put up a pull request to add @ioanaif! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great! I just made a few tweaks: "ISOPTION" was from when IndexedArray and IndexedOptionArray were templated versions of the same class; now they're distinct classes. The "nextcarry_outindex" function is only used for handling the option-type, so IndexedArray doesn't need it (and anyway, it's exactly the same function).
* Starting to implement the type tracer for Awkward-Dask. * Actually testing it. * It predicts Forms and Types. * Convenience methods for making typetracers. * TypeTracer.instance(). * TypeTracerArray.__getitem__(slice). * Black. * Through RecordArray. * Through BitMaskedArray. * Added checks throughout 0896. * Rename test-data.yml to kernel-test-data.yml. * Instrumented tests from #959. * Instrumented tests from #1031. * numpy.broadcast_shapes is too new. * Through 0011_v2-listarray in the v2 tests. * Python 2.7 needs Ellipsis to be spelled out. * Through 0020_v2-support-unsigned-indexes in the v2 tests. * Through 0023_v2-regular-array in the v2 tests. * Through all slicing in the v2 tests; next is 0070_v2-argmin-and-argmax. * Through test_0070_v2-argmin-and-argmax.py in the v2 tests; ak.to_list was wrong, and subsequently the reducer implementation. * First use of 'fill' (had to split into 'fill_zero', 'fill_other'). * Through 0111_v2-jagged-and-masked-getitem in the v2 tests; added nplike.known_shape. * Through 0115_v2-generic-reducer-operation in the v2 tests. * Through 1059_v2-localindex in the v2 tests. * Added typetracer to all v2 tests (where possible). * Fixed pre-commit. * Fixed merge.
This PR implements all types of slices allowed by NumPy on Awkward Arrays. Awkward's extensions of NumPy's slicing rules will be covered in a future PR.
In principle, there are 12 classes to cover: BitMaskedArray, ByteMaskedArray, EmptyArray, IndexedArray, IndexedOptionArray, ListArray, ListOffsetArray, NumpyArray, RecordArray, RegularArray, UnionArray, and UnmaskedArray with 10 types of slice items: no-slice (
head == ()
), integer (int
), range-slice (slice
), field (str
), fields (list(str)
),np.newaxis
(which isNone
),Ellipsis
(...
), flat array (Index64
), jagged array (ListOffsetArray
), and option-type array (IndexedOptionArray
). 120 cases.However, putting off Awkward's extensions reduces this to only 8 types of slice by dropping jagged arrays and option-type arrays. (They can be left as
NotImplementedError
.) 96 cases.Also, I did the "weird" classes, so that what remains would be more straightforward: all of EmptyArray (because it's a leaf), NumpyArray (because it's a leaf and very different—much simpler—in Python), RecordArray (because fields are fundamentally different from other slices), and RegularArray (because fake RegularArrays are used to structure the getitem procedure). 64 cases.
Also, I did the "weird" slice types, so that what remains would be more straightforward: no-slice, field, fields,
np.newaxis
, andEllipsis
. 24 cases.Here's what remains, for you to finish off this PR:
As you finish implementing each one, uncomment the corresponding tests in test_1031-start-getitem_next.py. When the PR is done, all of the code should be uncommented. test_1031b-start-getitem_next-specialized.py is for any additional tests you need to add while working things out.