Using keyword argument prevents specialization #45162

Closed · moble opened this issue May 3, 2022 · 0 comments · Fixed by #45198
Assignees: JeffBezanson
Labels: bug (Indicates an unexpected problem or unintended behavior), compiler:lowering (Syntax lowering (compiler front end, 2nd stage)), performance (Must go faster)

Comments

moble (Contributor) commented May 3, 2022

I raised this issue on discourse, where the consensus seems to be that this is at least very confusing, and possibly a bug.

If I pass a function as an argument to another function and then call it there with a kwarg, Julia does not specialize the receiving function on it. If I call that very same function without the kwarg, it does specialize. Below, I'll include a less trivial example that's closer to my actual use case. But schematically, the idea is this:

f1(a; b=10) = a+b
f2(c, f) = c + f(20)
f3(d, f) = d + f(30; b=40)

Calling, for example, f2(5, f1) will result in a specialized f2; calling f3(5, f1) will not result in specialization. I imagine Julia is clever enough to optimize the problem away in this schematic. But in my code, I was seeing slowdowns of ~100x and allocations of multiple GiBs on each call to my core computation function.

As pointed out in discourse, it's possible to manually trigger specialization by adding a type parameter (a sketch of that workaround appears below). But the failure to automatically specialize is a problem for a few reasons:

  1. It's surprising. The performance tip on specialization says "Julia will always specialize when the argument is used within the method, but not if the argument is just passed through to another function." In this case, I did use the argument (the function with the kwarg). From the discussion on discourse, it looks like the problem is that Julia immediately lowers the kwarg call so that the function is just passed through to Core.kwfunc (see the sketch just after this list). So technically the argument "is just passed through to another function" — but not by the programmer. (Gotta love passive voice!)
  2. It's very hard to diagnose. None of the usual tools — profiling, allocation tracking, @code_warntype, JET, Traceur — pointed out any problem with the use of kwargs. In fact, profiling and allocation tracking actively focused my attention on other parts of the code that were not at all the source of the problem. Even the (@which f(...)).specializations trick from that section of the performance tips seemed to say the function was being specialized for my arguments. (See below.)
  3. It seems to contradict the docs. If the goal when designing this heuristic is to detect when a function is "just passed through" so that it will "usually [have] no performance impact at runtime", surely the decision of how to arrange parameters in a function definition should not affect the result.
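
To make point 1 concrete, the pass-through can be seen in the lowered code. This is just a sketch, and the exact lowering differs across Julia versions (1.7 routes the call through Core.kwfunc, while newer versions use Core.kwcall), but either way the function ends up handed to the keyword machinery as a plain argument:

# Inspect how the keyword call is lowered (purely syntactic, so f need not be
# defined). On Julia 1.7 the result builds a NamedTuple for the kwargs and
# passes f through to the keyword sorter returned by Core.kwfunc(f).
Meta.@lower f(30; b=40)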

So, at the very least, I would think this is a documentation bug, because the kwarg wrinkle should be noted in that performance tip — rather than requiring the user to mentally combine disparate arcana from the most cryptic parts of the docs. It would also be nice if some standard tools could point toward the source of the problem. But maybe this is truly a bug in Julia, which should actually specialize even when a kwarg is used?
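
For completeness, here is the type-parameter workaround from discourse as a minimal sketch against the schematic f1/f3 above (the name f3_forced is mine, purely for illustration):

f1(a; b=10) = a + b

# An explicit type parameter on the function argument forces specialization on
# typeof(f1), even though the kwarg call still lowers to a pass-through of f.
f3_forced(d, f::F) where {F} = d + f(30; b=40)

f3_forced(5, f1)  # compiled with F === typeof(f1)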


For reference, here's a working example that's complicated enough that Julia doesn't just optimize the problem away, while still being a greatly simplified version of my actual use case:

using Profile

function index(n, mp, m; n_max=n)
    n + mp + m + n_max
end

function inplace!(a, n_max, index_func)
    i1 = index_func(1, 2, 3; n_max=n_max)  # Using this version leads to allocations below
    # i1 = index_func(1, 2, 3)             # Using this version leads to 0 allocations
    i2 = size(a, 1) - 2i1
    for i in 1:i2                      # Allocates 3182688 B if using kwarg above
        a[i + i1] = a[i + i1 - 1]      # Allocates 9573120 B if using kwarg above
    end
    for i in 3:i2-4                    # Allocates 3182576 B if using kwarg above
        a[i + i1] -= a[i + i1 - 2]     # Allocates 12771408 B if using kwarg above
    end
end

function compute_a(n_max::Int64)
    a = randn(Float64, 100_000)
    inplace!(a, n_max, index)
    Profile.clear_malloc_data()
    inplace!(a, n_max, index)
end


compute_a(10)

And yes, there are plenty of ways to improve the performance of this simplified code with function barriers and such. But my actual code is too complicated for that, with the kwarg func being used multiple times inside some loops.
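
For what it's worth, here is a minimal sketch of the kind of function barrier I mean (names are mine, and this is not my real code): resolve the kwarg call once, outside the hot loops, and pass the resulting index across a barrier so the loop body is compiled for a concrete Int.

function inplace_core!(a, i1::Int)
    # The hot loops now only ever see a concrete Int index.
    i2 = size(a, 1) - 2i1
    for i in 1:i2
        a[i + i1] = a[i + i1 - 1]
    end
    for i in 3:i2-4
        a[i + i1] -= a[i + i1 - 2]
    end
end

# The kwarg call happens once here; dynamic dispatch at the barrier then selects
# the inplace_core! specialization for Int.
inplace_barrier!(a, n_max, index_func) = inplace_core!(a, index_func(1, 2, 3; n_max=n_max))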

If I look at the specializations of inplace!(a, n_max, index), I get

svec(MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::Function), MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::typeof(index)), nothing, nothing, nothing, nothing, nothing, nothing)
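
For reference, that's just the trick from point 2 applied here, roughly:

# The query above, shown for reference; a and n_max as in the reproducer
# (e.g. a = randn(Float64, 100_000), n_max = 10).
(@which inplace!(a, n_max, index)).specializations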

That second element really looks to me like it specialized for my particular index function.

Here's my full versioninfo:
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
KristofferC added the performance label May 3, 2022
JeffBezanson self-assigned this May 5, 2022
JeffBezanson added the bug and compiler:lowering labels May 5, 2022