reducedim much slower for user-defined functions #3426

simonster · 2013-06-17T17:39:07Z

using Benchmark
const A = rand(1000, 1000);
f1() = reducedim(+, A, 2, 0.0)
plus(x, y) = x + y
f2() = reducedim(plus, A, 2, 0.0)
compare([f1, f2], 100)

gives:

        Function  Elapsed Relative Replications
[1,]        "f1" 0.593566      1.0          100
[2,]        "f2"  3.70654  6.24453          100

This seems to be due to lack of inlining. I don't see an easy solution that can be implemented in reducedim() itself, although maybe others do. Alternatively, it would be nice to move gen_reducedim_func outside of the let block in array.jl so that other code can use it.

See also my comment at https://groups.google.com/forum/?fromgroups=#!topic/julia-dev/ipbrvflVY1k

The text was updated successfully, but these errors were encountered:

lindahua · 2013-06-17T18:10:44Z

Failure to inline user defined functions is one issue that leads to suboptimal performance of reducedim. However, this is not the only problem.

Issue #2325 shows that sum(x, 2) is about 5x slower than a carefully crafted loop because the order that it scans the array is not cache friendly. I am considering to write a generic yet efficient reduction function that addresses both problems.

lindahua · 2013-06-17T18:28:49Z

Here is what I am considering to address the inlining problem -- use typed functor instead Julia functions.

abstract BinaryOp

type Add <: BinaryOp end

evaluate(op::Add, x, y) = x + y

function fast_reduce{R<:Number}(op::BinaryOp, initvalue::R, a::AbstractArray)
    v::R = initvalue
    for i in 1 : length(a)
        v = evaluate(op, v, a[i])
    end
    v
end

plus(x, y) = x + y

a = rand(10^6)

# warming
sum(a)
fast_reduce(Add(), 0., a)
reduce(plus, 0., a)

# benchmark
@time for i in 1 : 500 sum(a) end                      # 0.3955 sec
@time for i in 1 : 500 fast_reduce(Add(), 0., a) end   # 0.3938 sec
@time for i in 1 : 500 reduce(plus, 0., a) end         # 21.62 sec

Note here that fast_reduce is comparable to sum which is specialized, while reduce with a user function is 55x slower!

I believe this addresses the performance issue without losing generality.

simonster · 2013-06-17T19:00:02Z

Yes, both inlining and cache-unfriendliness are big issues here. Even along the first dimension, where cache isn't a problem, things are 1 OOM slower than they could be. Along the second dimension it's more like 2 OOM.

I think the best solution to the inlining problem is to specialize code generation on function arguments, which I think has been mentioned in the past, although I can't find the issue for it. As a stopgap, I'd be happy with a type-based approach like yours above. If the order of iteration were fixed in gen_reducedim_func and if it were moved out of the let block, it should also be able to achieve the same performance.

lindahua · 2013-06-17T19:08:15Z

Yes, specialized code generation depending on function arguments is the ideal way, which, however, probably takes much longer as it may need some changes in the core.

At this moment, I am going to work on an improved version of reduction based on typed functors.

JeffBezanson · 2013-10-14T03:47:28Z

Special case of #210.

JeffBezanson mentioned this issue Jun 18, 2013

compiler optimization tracker #3440

Closed

65 tasks

timholy mentioned this issue Jul 31, 2013

later-stage macros? #3706

Closed

JeffBezanson closed this as completed Oct 14, 2013

tkf mentioned this issue Jan 4, 2022

RFC/POC: Multi-scheduler for and in Julia #43649

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reducedim much slower for user-defined functions #3426

reducedim much slower for user-defined functions #3426

simonster commented Jun 17, 2013

lindahua commented Jun 17, 2013

lindahua commented Jun 17, 2013

simonster commented Jun 17, 2013

lindahua commented Jun 17, 2013

JeffBezanson commented Oct 14, 2013

reducedim much slower for user-defined functions #3426

reducedim much slower for user-defined functions #3426

Comments

simonster commented Jun 17, 2013

lindahua commented Jun 17, 2013

lindahua commented Jun 17, 2013

simonster commented Jun 17, 2013

lindahua commented Jun 17, 2013

JeffBezanson commented Oct 14, 2013