Add a drop_none() #832

Closed
masonproffitt opened this issue Apr 15, 2021 · 8 comments
Labels
feature New feature or request

Comments

@masonproffitt

Originally mentioned in #490 (comment). All I'm really looking for is a nice shortcut for array[~ak.is_none(array, axis=-1)] (I think this would cover all of my own use cases), although the default behavior (axis=None?) for a drop_none function should probably be to do this on every axis. A particularly important case for this function is that functions like np.histogram() and plt.hist() do not handle masked arrays properly (numpy/numpy#10019).
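
To make the requested semantics concrete, here is a minimal pure-Python sketch that works on nested lists rather than Awkward Arrays. The name `drop_none_lists` and its `axis` handling are assumptions made for illustration only, not the library's implementation: `axis=None` drops `None` at every depth, and a non-negative integer drops it only at that depth.

```python
def drop_none_lists(data, axis=None, _depth=0):
    """Hypothetical sketch: remove None entries from nested Python lists.

    axis=None removes None at every depth; a non-negative integer
    removes None only from lists at that depth (0 is the outermost).
    """
    if not isinstance(data, list):
        # Leaf value (or a None that was kept): return unchanged.
        return data
    out = []
    for item in data:
        if item is None and (axis is None or axis == _depth):
            continue  # drop this None
        out.append(drop_none_lists(item, axis, _depth + 1))
    return out


nested = [[[0]], None, [[1], None], [[2, None]]]
everywhere = drop_none_lists(nested)          # every depth
outer_only = drop_none_lists(nested, axis=0)  # outermost list only
```

With a flat result in hand (for example after also removing the list structure), the values can be passed to a histogramming function without any masked entries.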

@masonproffitt masonproffitt added the feature New feature or request label Apr 15, 2021
@jpivarski jpivarski linked a pull request Apr 16, 2021 that will close this issue
@jpivarski
Member

This is harder than I thought it might be. I'll have to get back to it.

@jpivarski
Member

You know, if your goal is plotting, you can ak.flatten the array. That will get rid of the missing values in addition to getting rid of the lists, which you'd have to do for a histogramming function, anyway.

@masonproffitt
Author

ak.flatten() doesn't remove None (I wouldn't have run into this issue if it did):

>>> ak.flatten(ak.Array([[1], [2, None]]))
<Array [1, 2, None] type='3 * ?int64'>

@masonproffitt
Author

masonproffitt commented Apr 17, 2021

Ah, so ak.flatten seems to remove only the Nones one level above the flattened axis (at depth axis - 1):

>>> ak.flatten(ak.Array([[[0]], None, [[1], None], [[2, None]]]), axis=1)
<Array [[0], [1], None, [2, None]] type='4 * option[var * ?int64]'>
>>> ak.flatten(ak.Array([[[0]], None, [[1], None], [[2, None]]]), axis=2)
<Array [[0], None, [1], [2, None]] type='4 * option[var * ?int64]'>

The exception is axis=0, which removes the Nones on axis 0 itself:

>>> ak.flatten(ak.Array([[[0]], None, [[1], None], [[2, None]]]), axis=0)
<Array [[[0]], [[1], None], [[2, None]]] type='3 * var * option[var * ?int64]'>

It looks like axis=None does what I want:

>>> ak.flatten(ak.Array([[[0]], None, [[1], None], [[2, None]]]), axis=None)
<Array [0, 1, 2] type='3 * int64'>

I was actually assuming that axis=None was the default like the reducers sum, min, max, etc. Wouldn't axis=None for ak.flatten be more consistent with those and with np.ndarray.flatten? My guess would be that the most common use of ak.flatten is for histogramming anyway (it certainly is for me--I don't think I've ever used it for anything else).
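
The difference between the two behaviors above can be modelled on plain nested lists. This is only a sketch under stated assumptions: `flatten_axis1` and `flatten_all` are hypothetical helper names, and they imitate the outputs shown in this comment rather than reproduce how Awkward Array implements ak.flatten.

```python
def flatten_axis1(data):
    """Hypothetical model of flattening at axis=1: merge the two outermost
    levels, skipping None entries of the outer list but keeping any None
    found inside the sublists."""
    out = []
    for sub in data:
        if sub is None:
            continue  # a missing sublist contributes nothing
        out.extend(sub)
    return out


def flatten_all(data):
    """Hypothetical model of axis=None: remove all nesting and every None."""
    out = []
    for item in data:
        if item is None:
            continue
        if isinstance(item, list):
            out.extend(flatten_all(item))
        else:
            out.append(item)
    return out


nested = [[[0]], None, [[1], None], [[2, None]]]
one_level = flatten_axis1(nested)
fully_flat = flatten_all(nested)
```

In this model, only the complete flattening yields a list that is free of None, which is the form a histogramming function needs.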

@jpivarski
Member

I'm rethinking it because you're not the first person to say they expected the default axis of ak.flatten to be None. I would expect it to be 1 for consistency with functional programming, but then, I'd want the axis of the reducers like ak.sum to be 1 also but they're constrained by NumPy's behavior.

It would be hard to change—a parameter default isn't the sort of thing that can go through a deprecation cycle, unless we change the name of "flatten" (and that's already a good name). This would be a good thing to ask about as a Discussion, under the "deprecation" category: whether we should change the default axis of ak.flatten, to get some feedback.

@jpivarski
Member

Actually, there was an idea about that: the name ak.ravel, which means "completely flatten" in NumPy, too.

@agoose77
Collaborator

agoose77 commented Apr 19, 2021

@jpivarski weighing in here - when reading jagged branches of a tree that contain 0 or more entries, I noticed that I was always flattening / fill_noneing along the axis=1 direction. I can now see why most people may well prefer having this as the default. It's only when I want to remove all jaggedness that I find myself preferring axis=None as a default.

@jpivarski
Member

Completed by #1904.

4 participants