Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate dims methods for iterator forms? #35292

Open
ChrisRackauckas opened this issue Mar 28, 2020 · 4 comments
Open

Deprecate dims methods for iterator forms? #35292

ChrisRackauckas opened this issue Mar 28, 2020 · 4 comments

Comments

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Mar 28, 2020

It seems like the dims-based methods could be from another time. It seems that we now have a way to do this using iterators:

A = rand(4,4)
sum(A,dims=2)
sum(eachcol(A))

Could we make our APIs simpler by dropping the dims-based versions and instead just requiring that the user uses an appropriate iterator before the call? One thing that would have to be true when doing this is that it should be just as efficient, so I wonder if @maleadt could comment on whether there's any extra optimizations on things like GPUs that are available in sum(A,dims=2) vs sum(eachcol(A)). Otherwise, it might be interesting to drop in a Julia 2.0?

(One note is that we'd need iterators on higher dimensions as well, though that could be done)

@tkf
Copy link
Member

tkf commented Mar 29, 2020

I think using eachcol/eachrow/eachslice is problematic in the sense that:

  1. The reduced dimensions are dropped. For example, sum(A,dims=2) :: Matrix while sum(eachcol(A)) :: Vector when A :: Matrix.
  2. It works only when the reducing function (e.g., +) is implicitly "broadcasted." For example, prod(eachcol(A :: Matrix)) throws ATM.

I proposed to add yet another lazy object type to solve these problems: #16606 (comment). See also #33130.

@mcabbott
Copy link
Contributor

Bullet 2 points towards prod.(eachrow(A :: Matrix)) instead. Should this then fuse with other operations?

We could have variants which don't drop dimensions. StarSlice.jl is a sketch of this, writing A[:, *] for the getindex equivalent of eachcol, and A[!, *] for a variant with a 1×4 container, thus sum.(A[!, *]) == sum(A, dims=1). Plus another variant with 2×1 slices, sum(A[:, &]) == sum(A, dims=2).

Also ref EachSlice types of #32310, where I thought the point was to pretend you were passing in slices, while e.g. Distances.jl can still compute efficiently on the whole array.

@tkf
Copy link
Member

tkf commented Apr 1, 2020

Bullet 2 points towards prod.(eachrow(A :: Matrix)) instead.

We could have variants which don't drop dimensions. StarSlice.jl is a sketch of this

Thanks, that's a good point (my understating is that we can still keep array-of-arrays interface). I guess we can also make something like dropdims(A .- mean.(A[!, *])) work as well? I'm thinking to propagate "dropped dims" information by embedding in the wrapper type and returning it from the StarSlice getindex.

Should this then fuse with other operations?

I don't think this is required to be implemented from the get-go but I think (eventually) making things like dest .= sum.(eachrow(A)) non-allocating and locality-aware would be nice.

@simonbyrne
Copy link
Contributor

simonbyrne commented Feb 15, 2021

  • It works only when the reducing function (e.g., +) is implicitly "broadcasted." For example, prod(eachcol(A :: Matrix)) throws ATM.

On the other hand it is equivalent to map(sum, eachrow(X)) (or mapslices(sum, X, dims=2)).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants