Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make region a keyword argument for dimension reducing methods #25501

Closed
ararslan opened this issue Jan 11, 2018 · 35 comments · Fixed by #25989
Closed

Make region a keyword argument for dimension reducing methods #25501

ararslan opened this issue Jan 11, 2018 · 35 comments · Fixed by #25989
Assignees
Labels
collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments)
Milestone

Comments

@ararslan
Copy link
Member

ararslan commented Jan 11, 2018

If possible, I think we should make the region argument in the dimension reduction methods of any and all a keyword argument. map has an incredibly handy method for multiple iterators, e.g. map(<, a, b). Currently any(<, a, b) will use the method any(f, itr, region), but it would be great to be able to reclaim any(<, a, b) to do what it looks like it should do. That would require making region a keyword argument in 0.7 with a deprecation, and introducing the new method in 1.0.

The downside to this is that we have to come up with a default value for region in order to use it as a keyword argument, and when region isn't specified then we want it to dispatch to the simpler, short-circuiting method not based on mapreducedim.

A similar argument could be made for other functions with mapreducedim-based methods, such as sum(f, A, region), but I mostly care about any/all for now. We're doin' 'em all.

@ararslan ararslan added collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments) triage This should be discussed on a triage call labels Jan 11, 2018
@nalimilan
Copy link
Member

We could use region=nothing as a default.

I'm fine with doing this, but then let's use keyword arguments for all similar functions (sum, etc.). Are we also sure region is the best term? It could also be e.g. dim/dims?

@JeffBezanson
Copy link
Sponsor Member

Are we also sure region is the best term? It could also be e.g. dim/dims?

💯 "Region" to me means something like A[m:n, m:n]. dims would be fine. We should try to standardize on dims for "which dimensions" and size or sizes for "the sizes of dimensions".

@JeffBezanson
Copy link
Sponsor Member

Triage says 👍 . Should be applied to all relevant functions though. Issue of what name to use still somewhat up in the air.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Jan 11, 2018
@JeffBezanson JeffBezanson added this to the 1.0 milestone Jan 11, 2018
@ararslan
Copy link
Member Author

The steps are:

  1. Figure out which functions have this kind of API
  2. Pick a name for the keyword argument
  3. Apply the change to all of them, not just any and all

@ararslan ararslan changed the title Make region a keyword argument for any and all Make region a keyword argument for dimension reducing methods Jan 11, 2018
@malmaud
Copy link
Contributor

malmaud commented Jan 14, 2018

TensorFlow uses axis, as another name candidate.

@ararslan
Copy link
Member Author

Methods which perform dimension reduction using this kind of API:

  • mapreducedim(f, op, A, region, v0)
  • reducedim(op, A, region, v0)
  • sum(A, dims) / sum(f, A, region)
  • prod(A, dims) / prod(f, A, region)
  • maximum(A, dims) / maximum(f, A, region)
  • minimum(A, dims) / minimum(f, A, region)
  • all(A, dims) / all(f, A, region)
  • any(A, dims) / any(f, A, region)

Interestingly, we seem to opt for the "region" terminology only for methods which map prior to reduction, and for those that don't we use "dims." (reducedim is the exception here; it uses "region" and does not map.)

To me, "dims" seems like the most obvious and immediately descriptive term, and it's consistent both with the names of the functions used underneath, i.e. (map)reducedim, as well as with our Dims type alias. With that in mind, here is what I would propose: for each reducing function r in the above list, deprecate the methods to r(A; dims=Colon()) and r(f, A; dims=Colon()). Using : seems like the most natural way to request the current behavior when no dimension is specified, in my opinion.

I'm indifferent as to whether (map)reducedim should be folded into (map)reduce with dims and v0 as keyword arguments, but that would seem reasonable to me.

@nalimilan
Copy link
Member

FWIW, Numpy also uses axis (see e.g. here). So if we decide to keep the axes function, we could as well adopt the axis terminology everywhere.

(Though I also like dims.)

@StefanKarpinski
Copy link
Sponsor Member

On the one hand, we now use axes to get a description of the axes/dimensions of an array, which suggests axis. On the other hand, we still have the following dim names:

  • flipdim
  • ndims
  • permutedims
  • permutedims!
  • reducedim
  • slicedim
  • mapreducedim

So we're a bit internally split on this terminology. Are these reduction region arguments more similar to what axes tells you about or what ndims, etc. tells you about?

@JeffBezanson
Copy link
Sponsor Member

Given what axes does, for us an "axis" is something like -10:10, telling you that the index values for a dimension (see what I did there) range from -10 to 10. So I think dims is a good choice.

@ararslan ararslan self-assigned this Jan 20, 2018
@bjarthur
Copy link
Contributor

bjarthur commented Jan 30, 2018

i thought i was more or less decided here that we didn't want to support a variable number of iterators to any, all, etc. that being said, i'm all in favor of making region a keyword argument, and making the dim/region/axes nomenclature consistent. cc'ing @mbauman @Keno as they were the main proponents there.

@Keno
Copy link
Member

Keno commented Jan 30, 2018

I still think we should use a keyword argument for any/all, even if I don't particularly like the multi-arg meaning.

@ararslan
Copy link
Member Author

Well folks, we have a problem. I've implemented the change to reducing functions. Using sum as one example, I have sum(x, dims=d) set up to call _sum(x, dims), which dispatches appropriately for the given dimensions (using ::Colon for the fallback). The issue now is that the return type is no longer inferred to be concrete when a dimension is specified:

julia> @inferred(sum([1 2; 3 4], dims=1))
ERROR: return type Array{Int64,2} does not match inferred return type Union{Int64, Array{Int64,2}}
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

This may be because when a dimension is specified, it goes through the more general method _sum(::Any, ::Any) rather than _sum(::Any, ::Colon), which in turn goes through a bunch of magic to determine whether the given dimensions are "dimension-like," rather than dispatching to some specific method, say _sum(::Any, ::Vector{Int}).

Any ideas? Is this still feasible? I can put up a WIP PR with my current implementation, if that would be helpful for diagnostics.

@mbauman
Copy link
Sponsor Member

mbauman commented Jan 31, 2018

I've still catching up with all the keyword and compiler improvements, but I think you can get around this by immediately re-dispatching to a positional argument:

julia> f(;x=nothing) = g(x)
       g(::Nothing) = 1
       g(x) = Array{Int}(uninitialized, x)

julia> h() = f(x=1)
h (generic function with 1 method)

julia> @code_warntype h()
…
  end::Array{Int64,1}

julia> h(y) = f(x=y)
h (generic function with 2 methods)

julia> @code_warntype h(1)
…
  end::Array{Int64,1}

julia> @code_warntype h((1,2))
…
  end::Array{Int64,2}

It's just a little deceiving since @code_warntype f(x=1) shows red flags, but that's just because code_warntype (I think) emulates calls from a top-level scope. This kind of thing will typically be in a compiled function, at which point inlining and IPO can do their thing.

@ararslan
Copy link
Member Author

Right, that is what I'm doing: sum(x, dims=d) calls the positional method _sum(x, d), but I'm still getting

julia> using Test

julia> X = [1 2; 3 4];

julia> @inferred Base._sum(X, 1)
1×2 Array{Int64,2}:
 4  6

julia> @inferred Base._sum(X, :)
10

julia> @inferred sum(X) # calls _sum(X, :)
10

julia> @inferred sum(X, dims=1) # calls _sum(X, 1)
ERROR: return type Array{Int64,2} does not match inferred return type Union{Int64, Array{Int64,2}}
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

@mbauman
Copy link
Sponsor Member

mbauman commented Jan 31, 2018

This'll only work if _sum inlines into sum(;dims=). I think you'll have to have a second intermediary function:

sum(…, ;dims=…) = _sum(…, dims)
@inline _sum(…, dims) = __sum(…, dims)
__sum(…, dims) = # real work

(edit: or just inline your _sum as it is, but I'm guessing that it has multiple methods) Edit: this is wrong.

@martinholters
Copy link
Member

martinholters commented Feb 1, 2018

Does @inferred (x -> sum(x, dims=1))(X) work?

EDIT: Ah, that's what #25835 is about. But is that the problem here or does that construct still fail?

@ararslan
Copy link
Member Author

ararslan commented Feb 1, 2018

It does indeed solve the problem.

julia> using Test

julia> X = [1 2; 3 4];

julia> @inferred (x->sum(x, dims=1))(X)
1×2 Array{Int64,2}:
 4  6

julia> @inferred sum(X, dims=1)
ERROR: return type Array{Int64,2} does not match inferred return type AbstractArray
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

(Though interestingly, the inferred return type is no longer a Union but rather an abstract type...)

@ararslan
Copy link
Member Author

ararslan commented Feb 1, 2018

Having been going through and implementing this, I'm actually not so sure I like it. It could be that I'm just used to writing things like sum(X, 1), but that does have a nice symmetry with size(X, 1), and I think size(X, dims=1) to maintain symmetry would be kind of unfortunate.

Really multi-arg any and all was the reason I wanted this, and if we aren't going to do that then I'm not so sure I see an overall benefit here. Is anyone a strong supporter of this change?

@JeffBezanson
Copy link
Sponsor Member

I think the issue is that in sum(X, 1) it's not clear what the 1 refers to. sum or sort might support other kinds of arguments; for example sort(X, order) was discussed today.

@ararslan
Copy link
Member Author

ararslan commented Feb 1, 2018

I agree that it does clarify the intent somewhat, but has there been any confusion about it in the past? Regarding "might support other kinds of arguments," isn't that just taken care of by multiple methods and dispatch?

@JeffBezanson
Copy link
Sponsor Member

Not necessarily; it's not ideal to re-use the same argument slot for a totally different meaning.

@ararslan
Copy link
Member Author

ararslan commented Feb 2, 2018

Losing the ability to concretely infer the return type for these pretty essential functions in global scope is pretty unfortunate. Though perhaps there's a way to get around that that I don't know how to do.

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Feb 2, 2018

It should be fairly easy to get that to infer; I believe all we need is a constant Bool from haskey.

EDIT: This already works.

@ararslan
Copy link
Member Author

ararslan commented Feb 3, 2018

What if instead of deprecating a positional dimension argument to a keyword argument, we go all-in on mapslices? That is, instead of sum(X, 1) -> sum(X, dims=1), we do mapslices(sum, X, 1). Then we have a single convenient function for this kind of reduction that can apply to any function rather than going through and whitelisting all of them. I quite like that consolidation.

@StefanKarpinski
Copy link
Sponsor Member

While I agree that would be a nice "orthogonalization" of APIs, I think this is far too common an operation on matrices to be made that much more clunky. Array reductions are kind of bread and butter when doing any numerical work.

@JeffBezanson
Copy link
Sponsor Member

Agreed --- it would be great for mapslices to be fast enough to do that, but getting it to that point is probably a significant project, and the resulting API is too verbose to fully replace the old one.

@timholy
Copy link
Sponsor Member

timholy commented Feb 4, 2018

Last I looked the one in JuliennedArrays is about as fast as it'll get (EDIT: short of being able to elide the SubArray creation or store the wrapper on the stack). But it will still lose compared to sum(A, 2) simply for reasons of cache-efficiency, unless we dispatch specifically on sum in mapslices: sum(A, 2) can compute its result while traversing A in cache-order, whereas mapslices does not.

@ararslan
Copy link
Member Author

ararslan commented Feb 4, 2018

Yes, I figured if we did go the mapslices route then we'd specialize on ::typeof(sum) and others to use the existing implementations under the hood.

@ararslan ararslan added the triage This should be discussed on a triage call label Feb 8, 2018
@ararslan
Copy link
Member Author

ararslan commented Feb 8, 2018

Re-marking for triage since I no longer think we should do this after partially implementing it if we aren't going to use multi-arg any/all. sum(A, dims=1) just seems annoying compared to sum(A, 1).

@JeffBezanson
Copy link
Sponsor Member

Triage still mostly likes the keyword argument.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Feb 8, 2018
@ararslan ararslan assigned JeffBezanson and unassigned ararslan Feb 9, 2018
@mbauman
Copy link
Sponsor Member

mbauman commented Feb 12, 2018

I had been very hesitant to support this, but I'm now fully on board. Two points crystallized my opinion — documenting them here for posterity:

  • What does any(==, a, b) do? (It errors — you likely wanted any(a .== b), but that's very different)
  • We may be able to fold max and maximum back together in the future with this change.

@StefanKarpinski
Copy link
Sponsor Member

If we are thinking of merging max and maximum we may want to do some thinking about it now since I believe we would have to fold max into maximum rather than the other way around because of this behavior:

julia> max([1,2,3], [2,1,3])
3-element Array{Int64,1}:
 2
 1
 3

The behavior we would presumably want for a combined max(args..., dims=dims) function would be semantically to do broadcasting across all args and reduction of dims in the resulting broadcast value, although, of course, the implementation should be more efficient than actually constructing the broadcast result. This is compatible with maximum but not with max, which is why it requires consideration since max is the name we probably want to keep.

@JeffBezanson
Copy link
Sponsor Member

I agree that having both the names max and maximum around is not ideal, since it's not obvious what the difference is. However I feel they should remain separate functions, since they're just not the same operation. It would be nice if we could find a better name for maximum, but we can't simply combine them any more than we can combine + and sum.

@piever
Copy link
Contributor

piever commented Feb 13, 2018

Would this apply also to method that shift data along some dimensions, rather than reducing it?

For example, is the plan to have:

circshift(v, 1, dims = 2) == circshift(v, (0, 1)) # if v is a 2-dimensional array

Or the two issues are unrelated and circshift will stay as is?

@JeffBezanson
Copy link
Sponsor Member

I think they're unrelated, since circshift accepts the amount to shift in each dimension, not a list of which dimensions to operate on.

mbauman added a commit that referenced this issue Mar 29, 2018
This is a follow-up to issue #25501 and seems to make sense in that same vein.  This also adds NEWS for squeeze and all the other related functions (#25501 was missing NEWS).
mbauman added a commit that referenced this issue Apr 3, 2018
* Deprecate squeeze(A, dims) to use keyword arguments

This is a follow-up to issue #25501 and seems to make sense in that same vein.  This also adds NEWS for squeeze and all the other related functions (#25501 was missing NEWS).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments)
Projects
None yet
Development

Successfully merging a pull request may close this issue.