Make region a keyword argument for dimension reducing methods #25501

ararslan · 2018-01-11T00:06:41Z

If possible, I think we should make the region argument in the dimension reduction methods of any and all a keyword argument. map has an incredibly handy method for multiple iterators, e.g. map(<, a, b). Currently any(<, a, b) will use the method any(f, itr, region), but it would be great to be able to reclaim any(<, a, b) to do what it looks like it should do. That would require making region a keyword argument in 0.7 with a deprecation, and introducing the new method in 1.0.

The downside to this is that we have to come up with a default value for region in order to use it as a keyword argument, and when region isn't specified then we want it to dispatch to the simpler, short-circuiting method not based on mapreducedim.

~~A similar argument could be made for other functions with mapreducedim-based methods, such as sum(f, A, region), but I mostly care about any/all for now.~~ We're doin' 'em all.

The text was updated successfully, but these errors were encountered:

nalimilan · 2018-01-11T07:25:16Z

We could use region=nothing as a default.

I'm fine with doing this, but then let's use keyword arguments for all similar functions (sum, etc.). Are we also sure region is the best term? It could also be e.g. dim/dims?

JeffBezanson · 2018-01-11T18:15:55Z

Are we also sure region is the best term? It could also be e.g. dim/dims?

💯 "Region" to me means something like A[m:n, m:n]. dims would be fine. We should try to standardize on dims for "which dimensions" and size or sizes for "the sizes of dimensions".

JeffBezanson · 2018-01-11T19:22:05Z

Triage says 👍 . Should be applied to all relevant functions though. Issue of what name to use still somewhat up in the air.

ararslan · 2018-01-11T19:23:22Z

The steps are:

Figure out which functions have this kind of API
Pick a name for the keyword argument
Apply the change to all of them, not just any and all

malmaud · 2018-01-14T01:56:24Z

TensorFlow uses axis, as another name candidate.

ararslan · 2018-01-16T21:28:31Z

Methods which perform dimension reduction using this kind of API:

mapreducedim(f, op, A, region, v0)
reducedim(op, A, region, v0)
sum(A, dims) / sum(f, A, region)
prod(A, dims) / prod(f, A, region)
maximum(A, dims) / maximum(f, A, region)
minimum(A, dims) / minimum(f, A, region)
all(A, dims) / all(f, A, region)
any(A, dims) / any(f, A, region)

Interestingly, we seem to opt for the "region" terminology only for methods which map prior to reduction, and for those that don't we use "dims." (reducedim is the exception here; it uses "region" and does not map.)

To me, "dims" seems like the most obvious and immediately descriptive term, and it's consistent both with the names of the functions used underneath, i.e. (map)reducedim, as well as with our Dims type alias. With that in mind, here is what I would propose: for each reducing function r in the above list, deprecate the methods to r(A; dims=Colon()) and r(f, A; dims=Colon()). Using : seems like the most natural way to request the current behavior when no dimension is specified, in my opinion.

I'm indifferent as to whether (map)reducedim should be folded into (map)reduce with dims and v0 as keyword arguments, but that would seem reasonable to me.

nalimilan · 2018-01-20T15:01:56Z

FWIW, Numpy also uses axis (see e.g. here). So if we decide to keep the axes function, we could as well adopt the axis terminology everywhere.

(Though I also like dims.)

StefanKarpinski · 2018-01-20T16:46:28Z

On the one hand, we now use axes to get a description of the axes/dimensions of an array, which suggests axis. On the other hand, we still have the following dim names:

flipdim
ndims
permutedims
permutedims!
reducedim
slicedim
mapreducedim

So we're a bit internally split on this terminology. Are these reduction region arguments more similar to what axes tells you about or what ndims, etc. tells you about?

JeffBezanson · 2018-01-20T17:13:52Z

Given what axes does, for us an "axis" is something like -10:10, telling you that the index values for a dimension (see what I did there) range from -10 to 10. So I think dims is a good choice.

bjarthur · 2018-01-30T14:51:43Z

i thought i was more or less decided here that we didn't want to support a variable number of iterators to any, all, etc. that being said, i'm all in favor of making region a keyword argument, and making the dim/region/axes nomenclature consistent. cc'ing @mbauman @Keno as they were the main proponents there.

Keno · 2018-01-30T15:08:17Z

I still think we should use a keyword argument for any/all, even if I don't particularly like the multi-arg meaning.

ararslan · 2018-01-31T19:09:02Z

Well folks, we have a problem. I've implemented the change to reducing functions. Using sum as one example, I have sum(x, dims=d) set up to call _sum(x, dims), which dispatches appropriately for the given dimensions (using ::Colon for the fallback). The issue now is that the return type is no longer inferred to be concrete when a dimension is specified:

julia> @inferred(sum([1 2; 3 4], dims=1))
ERROR: return type Array{Int64,2} does not match inferred return type Union{Int64, Array{Int64,2}}
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

This may be because when a dimension is specified, it goes through the more general method _sum(::Any, ::Any) rather than _sum(::Any, ::Colon), which in turn goes through a bunch of magic to determine whether the given dimensions are "dimension-like," rather than dispatching to some specific method, say _sum(::Any, ::Vector{Int}).

Any ideas? Is this still feasible? I can put up a WIP PR with my current implementation, if that would be helpful for diagnostics.

mbauman · 2018-01-31T19:24:48Z

I've still catching up with all the keyword and compiler improvements, but I think you can get around this by immediately re-dispatching to a positional argument:

julia> f(;x=nothing) = g(x)
       g(::Nothing) = 1
       g(x) = Array{Int}(uninitialized, x)

julia> h() = f(x=1)
h (generic function with 1 method)

julia> @code_warntype h()
…
  end::Array{Int64,1}

julia> h(y) = f(x=y)
h (generic function with 2 methods)

julia> @code_warntype h(1)
…
  end::Array{Int64,1}

julia> @code_warntype h((1,2))
…
  end::Array{Int64,2}

It's just a little deceiving since @code_warntype f(x=1) shows red flags, but that's just because code_warntype (I think) emulates calls from a top-level scope. This kind of thing will typically be in a compiled function, at which point inlining and IPO can do their thing.

ararslan · 2018-01-31T19:34:36Z

Right, that is what I'm doing: sum(x, dims=d) calls the positional method _sum(x, d), but I'm still getting

julia> using Test

julia> X = [1 2; 3 4];

julia> @inferred Base._sum(X, 1)
1×2 Array{Int64,2}:
 4  6

julia> @inferred Base._sum(X, :)
10

julia> @inferred sum(X) # calls _sum(X, :)
10

julia> @inferred sum(X, dims=1) # calls _sum(X, 1)
ERROR: return type Array{Int64,2} does not match inferred return type Union{Int64, Array{Int64,2}}
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

mbauman · 2018-01-31T19:37:42Z

~~This'll only work if _sum inlines into sum(;dims=). I think you'll have to have a second intermediary function:~~

sum(…, ;dims=…) = _sum(…, dims)
@inline _sum(…, dims) = __sum(…, dims)
__sum(…, dims) = # real work

~~(edit: or just inline your _sum as it is, but I'm guessing that it has multiple methods)~~ Edit: this is wrong.

martinholters · 2018-02-01T12:05:54Z

Does @inferred (x -> sum(x, dims=1))(X) work?

EDIT: Ah, that's what #25835 is about. But is that the problem here or does that construct still fail?

ararslan · 2018-02-01T19:44:25Z

It does indeed solve the problem.

julia> using Test

julia> X = [1 2; 3 4];

julia> @inferred (x->sum(x, dims=1))(X)
1×2 Array{Int64,2}:
 4  6

julia> @inferred sum(X, dims=1)
ERROR: return type Array{Int64,2} does not match inferred return type AbstractArray
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

(Though interestingly, the inferred return type is no longer a Union but rather an abstract type...)

ararslan · 2018-02-01T22:19:38Z

Having been going through and implementing this, I'm actually not so sure I like it. It could be that I'm just used to writing things like sum(X, 1), but that does have a nice symmetry with size(X, 1), and I think size(X, dims=1) to maintain symmetry would be kind of unfortunate.

Really multi-arg any and all was the reason I wanted this, and if we aren't going to do that then I'm not so sure I see an overall benefit here. Is anyone a strong supporter of this change?

JeffBezanson · 2018-02-01T22:33:57Z

I think the issue is that in sum(X, 1) it's not clear what the 1 refers to. sum or sort might support other kinds of arguments; for example sort(X, order) was discussed today.

ararslan · 2018-02-01T23:14:47Z

I agree that it does clarify the intent somewhat, but has there been any confusion about it in the past? Regarding "might support other kinds of arguments," isn't that just taken care of by multiple methods and dispatch?

JeffBezanson · 2018-02-01T23:34:34Z

Not necessarily; it's not ideal to re-use the same argument slot for a totally different meaning.

ararslan · 2018-02-02T22:02:43Z

Losing the ability to concretely infer the return type for these pretty essential functions in global scope is pretty unfortunate. Though perhaps there's a way to get around that that I don't know how to do.

JeffBezanson · 2018-02-02T22:12:23Z

It should be fairly easy to get that to infer; I believe all we need is a constant Bool from haskey.

EDIT: This already works.

ararslan · 2018-02-03T03:39:18Z

What if instead of deprecating a positional dimension argument to a keyword argument, we go all-in on mapslices? That is, instead of sum(X, 1) -> sum(X, dims=1), we do mapslices(sum, X, 1). Then we have a single convenient function for this kind of reduction that can apply to any function rather than going through and whitelisting all of them. I quite like that consolidation.

StefanKarpinski · 2018-02-03T14:04:35Z

While I agree that would be a nice "orthogonalization" of APIs, I think this is far too common an operation on matrices to be made that much more clunky. Array reductions are kind of bread and butter when doing any numerical work.

JeffBezanson · 2018-02-04T18:09:24Z

Agreed --- it would be great for mapslices to be fast enough to do that, but getting it to that point is probably a significant project, and the resulting API is too verbose to fully replace the old one.

timholy · 2018-02-04T21:18:25Z

Last I looked the one in JuliennedArrays is about as fast as it'll get (EDIT: short of being able to elide the SubArray creation or store the wrapper on the stack). But it will still lose compared to sum(A, 2) simply for reasons of cache-efficiency, unless we dispatch specifically on sum in mapslices: sum(A, 2) can compute its result while traversing A in cache-order, whereas mapslices does not.

ararslan · 2018-02-04T21:24:03Z

Yes, I figured if we did go the mapslices route then we'd specialize on ::typeof(sum) and others to use the existing implementations under the hood.

ararslan · 2018-02-08T21:03:56Z

Re-marking for triage since I no longer think we should do this after partially implementing it if we aren't going to use multi-arg any/all. sum(A, dims=1) just seems annoying compared to sum(A, 1).

JeffBezanson · 2018-02-08T22:35:15Z

Triage still mostly likes the keyword argument.

mbauman · 2018-02-12T17:14:19Z

I had been very hesitant to support this, but I'm now fully on board. Two points crystallized my opinion — documenting them here for posterity:

What does any(==, a, b) do? (It errors — you likely wanted any(a .== b), but that's very different)
We may be able to fold max and maximum back together in the future with this change.

StefanKarpinski · 2018-02-12T19:00:21Z

If we are thinking of merging max and maximum we may want to do some thinking about it now since I believe we would have to fold max into maximum rather than the other way around because of this behavior:

julia> max([1,2,3], [2,1,3])
3-element Array{Int64,1}:
 2
 1
 3

The behavior we would presumably want for a combined max(args..., dims=dims) function would be semantically to do broadcasting across all args and reduction of dims in the resulting broadcast value, although, of course, the implementation should be more efficient than actually constructing the broadcast result. This is compatible with maximum but not with max, which is why it requires consideration since max is the name we probably want to keep.

JeffBezanson · 2018-02-13T04:28:51Z

I agree that having both the names max and maximum around is not ideal, since it's not obvious what the difference is. However I feel they should remain separate functions, since they're just not the same operation. It would be nice if we could find a better name for maximum, but we can't simply combine them any more than we can combine + and sum.

piever · 2018-02-13T14:01:37Z

Would this apply also to method that shift data along some dimensions, rather than reducing it?

For example, is the plan to have:

circshift(v, 1, dims = 2) == circshift(v, (0, 1)) # if v is a 2-dimensional array

Or the two issues are unrelated and circshift will stay as is?

JeffBezanson · 2018-02-15T22:04:00Z

I think they're unrelated, since circshift accepts the amount to shift in each dimension, not a list of which dimensions to operate on.

This is a follow-up to issue #25501 and seems to make sense in that same vein. This also adds NEWS for squeeze and all the other related functions (#25501 was missing NEWS).

* Deprecate squeeze(A, dims) to use keyword arguments This is a follow-up to issue #25501 and seems to make sense in that same vein. This also adds NEWS for squeeze and all the other related functions (#25501 was missing NEWS).

ararslan added collections Data structures holding multiple items, e.g. sets keyword arguments f(x; keyword=arguments) triage This should be discussed on a triage call labels Jan 11, 2018

JeffBezanson removed the triage This should be discussed on a triage call label Jan 11, 2018

JeffBezanson added this to the 1.0 milestone Jan 11, 2018

ararslan changed the title ~~Make region a keyword argument for any and all~~ Make region a keyword argument for dimension reducing methods Jan 11, 2018

ararslan self-assigned this Jan 20, 2018

JeffBezanson mentioned this issue Jan 29, 2018

API consistency review #20402

Closed

19 tasks

ararslan mentioned this issue Jan 31, 2018

WIP/RFH: Make dims a keyword argument to reducing functions #25831

Closed

5 tasks

ararslan added the triage This should be discussed on a triage call label Feb 8, 2018

JeffBezanson removed the triage This should be discussed on a triage call label Feb 8, 2018

ararslan assigned JeffBezanson and unassigned ararslan Feb 9, 2018

JeffBezanson mentioned this issue Feb 10, 2018

make dims (previously sometimes called region) a keyword argument #25989

Merged

JeffBezanson closed this as completed in #25989 Feb 23, 2018

mbauman mentioned this issue Mar 29, 2018

Deprecate squeeze(A, dims) to use keyword arguments #26660

Merged

Make region a keyword argument for dimension reducing methods #25501

Make region a keyword argument for dimension reducing methods #25501

Comments

ararslan commented Jan 11, 2018 • edited Loading

nalimilan commented Jan 11, 2018

JeffBezanson commented Jan 11, 2018

JeffBezanson commented Jan 11, 2018

ararslan commented Jan 11, 2018

malmaud commented Jan 14, 2018

ararslan commented Jan 16, 2018

nalimilan commented Jan 20, 2018

StefanKarpinski commented Jan 20, 2018

JeffBezanson commented Jan 20, 2018

bjarthur commented Jan 30, 2018 • edited Loading

Keno commented Jan 30, 2018

ararslan commented Jan 31, 2018

mbauman commented Jan 31, 2018 • edited Loading

ararslan commented Jan 31, 2018

mbauman commented Jan 31, 2018 • edited Loading

martinholters commented Feb 1, 2018 • edited Loading

ararslan commented Feb 1, 2018

ararslan commented Feb 1, 2018

JeffBezanson commented Feb 1, 2018

ararslan commented Feb 1, 2018

JeffBezanson commented Feb 1, 2018

ararslan commented Feb 2, 2018

JeffBezanson commented Feb 2, 2018 • edited Loading

ararslan commented Feb 3, 2018

StefanKarpinski commented Feb 3, 2018

JeffBezanson commented Feb 4, 2018

timholy commented Feb 4, 2018 • edited Loading

ararslan commented Feb 4, 2018

ararslan commented Feb 8, 2018

JeffBezanson commented Feb 8, 2018

mbauman commented Feb 12, 2018

StefanKarpinski commented Feb 12, 2018

JeffBezanson commented Feb 13, 2018

piever commented Feb 13, 2018

JeffBezanson commented Feb 15, 2018

ararslan commented Jan 11, 2018 •

edited

Loading

bjarthur commented Jan 30, 2018 •

edited

Loading

mbauman commented Jan 31, 2018 •

edited

Loading

mbauman commented Jan 31, 2018 •

edited

Loading

martinholters commented Feb 1, 2018 •

edited

Loading

JeffBezanson commented Feb 2, 2018 •

edited

Loading

timholy commented Feb 4, 2018 •

edited

Loading