concrete-eval: Use concrete eval information even if the result is big #47283

Keno · 2022-10-22T04:37:04Z

When we originally implemented concrete-eval, we decided not to create Const lattice elements for constants that are too large, on the fear that these would end up in the IR and blowing up the size. Now that we have some experience with this, I think that decision was incorrect for several reasons:

We've already performed the concrete evaluation (and allocated the big object), so we're just throwing away precision here that we could have otherwise used (Although if the call result is unused, we probably shouldn't do concrete eval at all - see Don't attempt concrete eval if there's no information to be gained #46774).
There's a number of other places in the compiler where we read large values into Const. Unless we add these kinds of check there too, we need to have appropriate guards in the optimizer and the cache anyway, to prevent the IR size blowup.
It turns out that throwing away this precision actually causes significant performance problems for code that is just over the line. Consider for example a lookup of a small struct inside a large, multi-level constant structure. The final result might be quite tiny, but if we refuse to propagate the intermediate levels, we might end up doing an (expensive) full-constprop when propagating the constant information could have given us a much cheaper concrete evaluation.

This commit simply removes that check. If we see any regressions as a result of this, we should see if there are additional guards needed in the optimizer.

aviatesk

Consider for example a lookup of a small struct inside a large, multi-level constant structure. The final result might be quite tiny, but if we refuse to propagate the intermediate levels, we might end up doing an (expensive) full-constprop when propagating the constant information could have given us a much cheaper concrete evaluation.

Maybe we can have a benchmark target for this kind of case in BaseBenchmarks.jl?
This PR itself LGTM.

Keno · 2022-10-22T10:57:31Z

Maybe we can have a benchmark target for this kind of case in BaseBenchmarks.jl?

Yes, should be reasonably to do. Just build a trie or a heap out of tuples or something.

aviatesk · 2022-10-23T03:42:05Z

Yes, should be reasonably to do. Just build a trie or a heap out of tuples or something.

I will try to come up with one. For the meanwhile, let's try the other benchmarks and merge this if there are no apparent regressions.

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-10-23T04:43:24Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

…o large When we originally implemented concrete-eval, we decided not to create `Const` lattice elements for constants that are too large, on the fear that these would end up in the IR and blowing up the size. Now that we have some experience with this, I think that decision was incorrect for several reasons: 1. We've already performed the concrete evaluation (and allocated the big object), so we're just throwing away precision here that we could have otherwise used (Although if the call result is unused, we probably shouldn't do concrete eval at all - see #46774). 2. There's a number of other places in the compiler where we read large values into `Const`. Unless we add these kinds of check there too, we need to have appropriate guards in the optimizer and the cache anyway, to prevent the IR size blowup. 3. It turns out that throwing away this precision actually causes significant performance problems for code that is just over the line. Consider for example a lookup of a small struct inside a large, multi-level constant structure. The final result might be quite tiny, but if we refuse to propagate the intermediate levels, we might end up doing an (expensive) full-constprop when propagating the constant information could have given us a much cheaper concrete evaluation. This commit simply removes that check. If we see any regressions as a result of this, we should see if there are additional guards needed in the optimizer.

aviatesk · 2022-10-23T05:17:58Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-10-23T06:18:59Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

Keno · 2022-10-24T04:32:42Z

Nanosolider is fine here right, since "allinference" went down it actually got faster, optimization time for this one frame going up would make sense if things became more inlineable and optimization has to do more work, but since allinference got faster on the same benchmark, inference overall would be faster.

aviatesk · 2022-10-24T04:44:57Z

Sounds reasonable. Let's get this merged.

It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.

vtjnash · 2022-10-25T01:30:45Z

base/compiler/ssair/inlining.jl

@@ -1483,15 +1487,12 @@ function handle_concrete_result!(cases::Vector{InliningCase}, result::ConcreteRe
    return true
 end

+may_inline_concrete_result(result::ConcreteResult) =
+    isdefined(result, :result) && is_inlineable_constant(result.result)


I am hoping someday we can get rid of this arbitrary size cutoff entirely, and just do something simple like:

is_inlineable_constant(x) = (!ismutable(x) && datatype_pointerfree(typeof(x))) || x isa Type || x isa Symbol

this avoids embedding pointers to random bits of mutable memory, but otherwise drops the random size cut-off

It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.

This undoes some of the choices made in #47283 and #47305. As Shuhei correctly pointed out, even with the restriction to `nothrow`, adding the extra flags on the inlined statements results in incorrect IR. Also, my bigger motivating test case turns out to be insufficiently optimized without the effect_free flags (which I removed in the final revision of #47305). I think for the time being, the best course of action here is to just stop inlining concrete-eval'ed calls entirely. The const result is available for inference, so in most cases the call will get deleted. If there's an important case we care about where this does not happen, we should take a look at that separately.

…rge (#47371) This undoes some of the choices made in #47283 and #47305. As Shuhei correctly pointed out, even with the restriction to `nothrow`, adding the extra flags on the inlined statements results in incorrect IR. Also, my bigger motivating test case turns out to be insufficiently optimized without the effect_free flags (which I removed in the final revision of #47305). I think for the time being, the best course of action here is to just stop inlining concrete-eval'ed calls entirely. The const result is available for inference, so in most cases the call will get deleted. If there's an important case we care about where this does not happen, we should take a look at that separately.

Keno requested a review from aviatesk October 22, 2022 04:37

Keno force-pushed the kf/constpropbig branch from 200d770 to ead2227 Compare October 22, 2022 10:11

aviatesk approved these changes Oct 22, 2022

View reviewed changes

aviatesk force-pushed the kf/constpropbig branch from cbebf22 to 209d943 Compare October 23, 2022 05:17

aviatesk merged commit 13c0060 into master Oct 24, 2022

aviatesk deleted the kf/constpropbig branch October 24, 2022 04:46

Keno mentioned this pull request Oct 24, 2022

inlining: Use concrete-eval effects if available #47305

Merged

vtjnash reviewed Oct 25, 2022

View reviewed changes

Keno mentioned this pull request Oct 28, 2022

inlining: Don't inline concrete-eval'ed calls whose result was too large #47371

Merged

staticfloat assigned Keno Mar 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

concrete-eval: Use concrete eval information even if the result is big #47283

concrete-eval: Use concrete eval information even if the result is big #47283

Keno commented Oct 22, 2022

aviatesk left a comment

Keno commented Oct 22, 2022

aviatesk commented Oct 23, 2022

nanosoldier commented Oct 23, 2022

aviatesk commented Oct 23, 2022

nanosoldier commented Oct 23, 2022

Keno commented Oct 24, 2022

aviatesk commented Oct 24, 2022

vtjnash Oct 25, 2022

concrete-eval: Use concrete eval information even if the result is big #47283

concrete-eval: Use concrete eval information even if the result is big #47283

Conversation

Keno commented Oct 22, 2022

aviatesk left a comment

Choose a reason for hiding this comment

Keno commented Oct 22, 2022

aviatesk commented Oct 23, 2022

nanosoldier commented Oct 23, 2022

aviatesk commented Oct 23, 2022

nanosoldier commented Oct 23, 2022

Keno commented Oct 24, 2022

aviatesk commented Oct 24, 2022

vtjnash Oct 25, 2022

Choose a reason for hiding this comment