LICM for pure functions #29285

SimonDanisch · 2018-09-20T12:58:29Z

Julia 1.0:

I just realized that this is a gotcha one easily runs into, especially when using the @. macro:

using BenchmarkTools
a = rand(1000, 1000)
c = 1.0
out = similar(a)
julia> @btime $(out) .= $a .- sin.($c);
  5.695 ms (0 allocations: 0 bytes)
julia> @btime $(out) .= $a .- sin($c);
  495.669 μs (0 allocations: 0 bytes)

This likely happens because the compiler can't infer that sin is pure, so sin(c) is recomputed for every element instead of once.
I realize that, with access to the call tree in the new lazy broadcast, we could solve this for a predefined set of functions.

A first trick could be to just overload broadcasted for known signatures:

Base.Broadcast.broadcasted(::typeof(sin), x::Number) = sin(x)
@btime $out .= $a .+ sin.($c)

This solves the problem for a chosen set of functions.
We could also consider introducing a purity trait to make this easier for multi-argument functions:

broadcasted(f, args...) = broadcasted(IsPure(f), f, args...)
broadcasted(::Pure{true}, f, args...) = f(args...) # should probably not get applied to arrays
broadcasted(::Pure{false}, f, args...) = Broadcasted(f, args...)
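To make the trait idea concrete, here is a minimal self-contained sketch. It deliberately does not touch Base.Broadcast: the names Purity, purity, LazyCall, lazy_call and _lazy_call are all hypothetical stand-ins, and the list of "pure" functions is just an assumption for illustration. Eager evaluation is restricted to Number arguments, in line with the note above that it should probably not be applied to arrays.

```julia
# Hypothetical purity trait; nothing here overloads Base or Base.Broadcast.
struct Purity{B} end

purity(f) = Purity{false}()             # conservative default: assume impure
purity(::typeof(sin)) = Purity{true}()  # opt-in list, chosen for illustration
purity(::typeof(cos)) = Purity{true}()

# Stand-in for a lazy Broadcasted node: just records the call.
struct LazyCall{F,A<:Tuple}
    f::F
    args::A
end

lazy_call(f, args...) = _lazy_call(purity(f), f, args...)

# Pure function applied only to scalars: evaluate once, eagerly.
_lazy_call(::Purity{true}, f, args::Number...) = f(args...)

# Everything else (impure functions, or any array argument) stays lazy.
_lazy_call(::Purity, f, args...) = LazyCall(f, args)

eager  = lazy_call(sin, 1.0)    # evaluated immediately
lazy_a = lazy_call(sin, [1.0])  # array argument, stays lazy
lazy_r = lazy_call(rand)        # not marked pure, stays lazy
```

With this shape, a function like rand keeps its per-element semantics because it is never marked pure.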

I guess this has been discussed before, but I couldn't really find an issue about it...

mbauman · 2018-09-20T13:58:04Z

Sure, both of those workarounds could work, but I'd very much rather directly tell the compiler about a reasonable level of purity than push an orthogonal concern into the already-complicated broadcasting API.

I must say, though: it is doing exactly what you requested and performing within 5% of the equivalent for loop. Folks rely on this behavior for non-pure functions like A .= rand.().
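For reference, the "equivalent for loop" comparison can be sketched roughly as follows. The function names are hypothetical, and this is only an approximation of what the two broadcast expressions ask for, not the code broadcast actually generates.

```julia
# Roughly what `out .= a .- sin.(c)` requests: sin is called per element.
function sub_sin_fused!(out, a, c)
    @inbounds for i in eachindex(out, a)
        out[i] = a[i] - sin(c)
    end
    return out
end

# Roughly what `out .= a .- sin(c)` requests: sin is called once, up front.
function sub_sin_hoisted!(out, a, c)
    s = sin(c)
    @inbounds for i in eachindex(out, a)
        out[i] = a[i] - s
    end
    return out
end

a = rand(1000, 1000)
c = 1.0
out1 = sub_sin_fused!(similar(a), a, c)
out2 = sub_sin_hoisted!(similar(a), a, c)
```

Both produce the same values; the difference is only how often sin runs, which is what LICM would have to prove is safe to change.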

yuyichao · 2018-09-20T14:07:45Z

Dup of #20875

SimonDanisch · 2018-09-20T14:08:22Z

Yeah I know ;)
I'd totally understand the consensus that we don't ever want to fix it, because it's what the user specifies and this optimization can only ever work for a small subset of predefined functions. Making the user comfortable by solving the problem for only that subset could backfire: they would then expect it to work everywhere.

So it might be much better to educate the user - I wouldn't mind if this issue "derails" into how to better inform the user of such gotchas!
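As one possible thing to show users, here are two ways to hoist the scalar call by hand. The second relies on the documented behavior of the @. macro, where a function call prefixed with $ is not dotted. Variable names follow the benchmark at the top of the issue; out2 is added here only to compare the results.

```julia
a = rand(1000, 1000)
c = 1.0
out = similar(a)

# Option 1: compute the scalar once, then broadcast over it.
s = sin(c)
out .= a .- s

# Option 2: inside `@.`, prefix the call with `$` so it is not dotted.
# This expands to `out2 .= a .- sin(c)`.
out2 = similar(a)
@. out2 = a - $sin(c)
```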

SimonDanisch · 2018-09-21T09:17:42Z

From the title change, I assume that @vtjnash is already on the right track for my next boiled-down example:

using BenchmarkTools
Base.@pure @noinline sopure(x) = x / 2

function test(x, y)
    # c = sopure(y)  # uncomment for the manually hoisted version
    for i = 1:length(x)
        @inbounds x[i] = sopure(y) # replace sopure(y) with c for the hoisted version
    end
    x
end

@btime test($(rand(10)), $(Ref(1.0))[]) # 14 ns as written; 5 ns with the hoisted c

This doesn't hoist, and if I'm not mistaken it should be the most straightforward example for LICM?

SimonDanisch · 2018-09-21T09:22:24Z

Now that the issue is pretty clearly outlined, my main question is:
Is fixing this just a matter of turning on some flags, will it require a new pass in the Julia optimizer, or does it need a refactor to forward the correct information to LLVM?

SimonDanisch · 2018-09-24T14:57:34Z

Would it be possible to enlighten me a bit about the feasibility and some background about this problem? Just a short statement that I can pass on to people wondering why Julia's compiler has problems with this ;)

I feel like I've seen a discussion about this before, but when I search for julia LICM, this issue turns up as the most relevant result :D

KristofferC · 2018-09-24T14:59:32Z

#9942 maybe relevant?

SimonDanisch · 2018-09-24T15:14:55Z

I guess ;)
So gcc etc. can likely hoist compiler intrinsics marked as pure out of the loop, while Julia's trigonometric functions are implemented in pure Julia, which makes it harder to detect their purity...
But an MWE with a "julia pure" function only benefits from constant propagation, not from LICM. I guess that would need a Julia LICM pass, or forwarding purity metadata to LLVM?

mbauman added the broadcast and performance labels Sep 20, 2018

vtjnash mentioned this issue Sep 20, 2018

Slow loop fusion when multiplying a column vector with a row vector #20875

Closed

vtjnash added the compiler:optimizer label and removed the broadcast and performance labels Sep 20, 2018

vtjnash changed the title from "broadcast performance of function applied to scalars" to "LICM for pure functions" Sep 20, 2018

vtjnash added the feature label Sep 20, 2018

SimonDanisch mentioned this issue Sep 21, 2018

Baseline nim solution SimonDanisch/julia-challenge#3

Merged

vtjnash mentioned this issue Oct 11, 2018

Performance issue with iteration over StepRange{<:TimeType} #29403

Closed

stevengj mentioned this issue Jul 8, 2019

codegen: propagate at-pure macro to llvm #32368

Closed

vchuravy mentioned this issue Jul 28, 2020

add LoopInfo + LICM #36832

Open

simonbyrne mentioned this issue Jun 30, 2021

inference: enable constant propagation for invoked calls, fixes #41024 #41383

Merged
