Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix type instability of closures capturing types (2) #40985

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

simeonschaub
Copy link
Member

@simeonschaub simeonschaub commented May 28, 2021

Instead of closures lowering to typeof for the types of captured fields, this introduces a new function _typeof_captured_variable that returns Type{T} if T is a type (w/o free typevars). Also adds special inference support for that function to make it a no-op in most cases.

replaces/closes #35970
fixes #23618

@simeonschaub simeonschaub added performance Must go faster compiler:lowering Syntax lowering (compiler front end, 2nd stage) labels May 28, 2021
@oscardssmith oscardssmith added this to the 1.7 milestone May 28, 2021
base/compiler/compiler.jl Outdated Show resolved Hide resolved
base/Base.jl Outdated Show resolved Hide resolved
@tkf
Copy link
Member

tkf commented May 28, 2021

I just run ]test ChainRules. Just eyeballing the test result, I don't see a major regression. Though maybe rulesets/LinearAlgebra/symmetric.jl is slower? (307 in this PR to 251 on master) (Edit: never mind, the compilation time is < 1%)

This PR

Testing ChainRules.jl
Testing rulesets/Base/base.jl:
...
113.387431 seconds (194.65 M allocations: 11.406 GiB, 4.15% gc time, 0.18% compilation time)
Testing rulesets/Base/fastmath_able.jl:
135.469393 seconds (201.52 M allocations: 11.869 GiB, 3.89% gc time, 0.03% compilation time)
Testing rulesets/Base/evalpoly.jl:
 35.726451 seconds (64.08 M allocations: 3.702 GiB, 3.45% gc time, 76.07% compilation time)
Testing rulesets/Base/array.jl:
 10.367831 seconds (17.88 M allocations: 1.074 GiB, 3.64% gc time, 21.27% compilation time)
Testing rulesets/Base/arraymath.jl:
...
188.849576 seconds (341.39 M allocations: 19.978 GiB, 3.86% gc time, 92.38% compilation time)
Testing rulesets/Base/indexing.jl:
 17.777968 seconds (26.13 M allocations: 1.505 GiB, 3.51% gc time, 98.34% compilation time)
Testing rulesets/Base/mapreduce.jl:
113.682090 seconds (221.61 M allocations: 12.675 GiB, 5.68% gc time, 0.00% compilation time)
Testing rulesets/Base/sort.jl:
  9.309037 seconds (14.99 M allocations: 894.253 MiB, 4.40% gc time, 10.71% compilation time)

Testing rulesets/Statistics/statistics.jl:
  3.154018 seconds (5.54 M allocations: 337.662 MiB, 6.72% gc time, 18.29% compilation time)

Testing rulesets/LinearAlgebra/dense.jl:
 90.476702 seconds (144.34 M allocations: 8.792 GiB, 4.83% gc time, 20.87% compilation time)
Testing rulesets/LinearAlgebra/norm.jl:
153.307127 seconds (241.14 M allocations: 14.354 GiB, 4.56% gc time, 0.00% compilation time)
Testing rulesets/LinearAlgebra/matfun.jl:
 16.050503 seconds (27.39 M allocations: 2.706 GiB, 8.82% gc time, 0.37% compilation time)
Testing rulesets/LinearAlgebra/structured.jl:
 81.032083 seconds (112.20 M allocations: 6.698 GiB, 3.87% gc time, 21.03% compilation time)
Testing rulesets/LinearAlgebra/symmetric.jl:
307.083321 seconds (385.80 M allocations: 45.909 GiB, 6.67% gc time, 0.46% compilation time)
...
Testing rulesets/LinearAlgebra/blas.jl:
 57.792782 seconds (93.93 M allocations: 7.993 GiB, 5.22% gc time, 0.00% compilation time)
Testing rulesets/LinearAlgebra/lapack.jl:
 18.828087 seconds (29.57 M allocations: 1.831 GiB, 4.03% gc time, 17.00% compilation time)

Testing rulesets/Random/random.jl:
  0.783869 seconds (821.83 k allocations: 50.719 MiB, 90.65% compilation time)

Testing rulesets/packages/NaNMath.jl:
  0.000138 seconds (56 allocations: 4.266 KiB)

...

julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("75086967ef7e5385b54369cb0f99b8eb9cc7480b", "75086967ef", "sds/tkf/type-capturing", 1189, "2021-05-28 18:48 UTC", false, 3, 1.622195605e9)

master

Testing ChainRules.jl
Testing rulesets/Base/base.jl:
...
110.912486 seconds (191.40 M allocations: 11.192 GiB, 3.72% gc time, 0.18% compilation time)
Testing rulesets/Base/fastmath_able.jl:
133.547750 seconds (199.73 M allocations: 11.757 GiB, 3.86% gc time, 0.03% compilation time)
Testing rulesets/Base/evalpoly.jl:
 34.510045 seconds (63.45 M allocations: 3.660 GiB, 3.54% gc time, 75.96% compilation time)
Testing rulesets/Base/array.jl:
 10.106647 seconds (17.67 M allocations: 1.061 GiB, 3.55% gc time, 20.78% compilation time)
Testing rulesets/Base/arraymath.jl:
...
185.997513 seconds (339.63 M allocations: 19.862 GiB, 4.00% gc time, 92.22% compilation time)
Testing rulesets/Base/indexing.jl:
 17.365063 seconds (25.89 M allocations: 1.489 GiB, 3.21% gc time, 98.30% compilation time)
Testing rulesets/Base/mapreduce.jl:
110.458438 seconds (220.21 M allocations: 12.584 GiB, 4.98% gc time, 0.00% compilation time)
Testing rulesets/Base/sort.jl:
  9.054407 seconds (14.83 M allocations: 882.215 MiB, 4.29% gc time, 11.54% compilation time)

Testing rulesets/Statistics/statistics.jl:
  3.022534 seconds (5.51 M allocations: 335.324 MiB, 4.24% gc time, 16.47% compilation time)

Testing rulesets/LinearAlgebra/dense.jl:
 88.285892 seconds (143.68 M allocations: 8.751 GiB, 4.53% gc time, 20.88% compilation time)
Testing rulesets/LinearAlgebra/norm.jl:
150.491677 seconds (239.33 M allocations: 14.238 GiB, 4.41% gc time, 0.00% compilation time)
Testing rulesets/LinearAlgebra/matfun.jl:
 16.358479 seconds (27.57 M allocations: 2.718 GiB, 9.26% gc time, 0.36% compilation time)
Testing rulesets/LinearAlgebra/structured.jl:
 76.466074 seconds (108.54 M allocations: 6.501 GiB, 4.32% gc time, 22.90% compilation time)
Testing rulesets/LinearAlgebra/symmetric.jl:
251.823712 seconds (348.90 M allocations: 44.452 GiB, 7.87% gc time, 0.61% compilation time)
...
Testing rulesets/LinearAlgebra/blas.jl:
 56.097692 seconds (93.35 M allocations: 7.957 GiB, 5.91% gc time, 0.00% compilation time)
Testing rulesets/LinearAlgebra/lapack.jl:
 18.680355 seconds (29.96 M allocations: 1.859 GiB, 4.90% gc time, 17.17% compilation time)

Testing rulesets/Random/random.jl:
  0.771533 seconds (817.91 k allocations: 50.452 MiB, 90.70% compilation time)

Testing rulesets/packages/NaNMath.jl:
  0.000109 seconds (56 allocations: 4.266 KiB)

...

julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("61701d7a84be62beaf129b2178ede42882a7a46a", "61701d7a84", "master", 1186, "2021-05-28 09:53 UTC", false, 0, 1.622195605e9)

@simeonschaub simeonschaub marked this pull request as ready for review May 29, 2021 22:24
@KristofferC
Copy link
Sponsor Member

@oscardssmith Why is this added to the 1.7 milestone?

@vtjnash vtjnash removed this from the 1.7 milestone May 31, 2021
@vtjnash vtjnash added the feature Indicates new feature / enhancement requests label May 31, 2021
@vtjnash
Copy link
Sponsor Member

vtjnash commented May 31, 2021

This is not going into v1.7, but is something to review and consider for a next release

@simeonschaub simeonschaub added needs nanosoldier run This PR should have benchmarks run on it needs pkgeval Tests for all registered packages should be run with this change labels Jul 1, 2021
@simeonschaub
Copy link
Member Author

Triage echoed worries about significantly increased latency in some cases. We should at least run all Base benchmarks on this and we also talked about potentially doing a PkgEval run and analyzing whether there are any outliers in terms of run times.

@simeonschaub
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG

@simeonschaub
Copy link
Member Author

Sorry it took me so long to get back to this! Overall, I couldn't measure any significant mean increase in run time for the PkgEval jobs. Just to be safe, I'll rerun PkgEval for all packages whose tests took more than 20% longer.

@nanosoldier runtests(["HealthBase", "P4est", "Pitaya", "Geophysics", "Synthesizer", "LatinHypercubeSampling", "SequentialMonteCarlo", "Batsrus", "InterProcessCommunication", "MixedModels", "RandomizedNMF", "EasyConfig", "Crayons", "CooperativeGames", "DashBootstrapComponents", "Word2Vec", "PatModules", "IntervalTrees", "IncrementalInference", "VT100", "TraitWrappers", "URIParser", "KeywordCalls", "ProgrammableAPI", "LikelihoodProfiler", "DataFitting", "FeatureDescriptors", "StatsDiscretizations", "LittleEndianBase128", "ConventionalApp", "RealContinuedFractions", "PredictMD", "Avalon", "FromDigits", "H3", "SLEEFInline", "DashTable", "MultivariateOrthogonalPolynomials", "Vinyl", "CIAOAlgorithms", "Zabbix", "RenoiseOSC", "UnitTestDesign", "Remark", "Scalpels", "MixedModelsExtras", "RegularizationTools", "ChainLadder", "Morton", "CPLEX", "LCMGL", "LibDeflate", "SuperLU", "QuadraticToBinary", "AbstractIndices", "Gloria", "GLNS", "ArgParse", "TriangleMesh", "BlockBootstrap", "ForecastEval", "SalesForceBulkApi", "MLDataUtils", "SuiteSparseGraphBLAS", "ARFFFiles", "FeatherLib", "StippleLatex", "GtkMarkdownTextView", "ProgressMeter", "LinearFractionalTransformations", "SimpleLife", "Plots", "Glob", "BDF", "CurrenciesBase", "Skyler", "TrackingTimers", "BioServices", "BoltzmannMachines", "ClassicalOrthogonalPolynomials", "Taro", "TensorKit", "ReusePatterns", "SymArrays", "RiskAdjustedLinearizations", "YahooFinance", "JuliaPetra", "PooledArrays", "StaticNumbers", "ExponentialUtilities", "LogCompose", "GitCommand", "Pilot", "FastRounding", "TSAnalysis", "ConstantArrays", "Intervals", "NeuroCore", "Parameters", "AMRVW", "HighFrequencyCovariance", "ComputabilityTheory", "GenFlux", "Shoco", "KiteConnect", "EditionBuilders", "FastFloat16s", "ImGuiOpenGLBackend", "Yao", "ConstructiveGeometry", "CircoCore", "Vortice", "AbaqusReader", "RelocatableFolders", "BayesianNonparametrics", "NearestNeighbors", "ElementaryChemistry", "NLPModelsJuMP", "IPNets", "Tectonic", "JacobiDavidson", "Multisets", "PropertyUtils", "BitInformation", "TensArrays", "ValidatedNumerics", "EffectiveWaves", "MitosisStochasticDiffEq", "Chron", "ApproxFunFourier", "Quadrature", "StateSpaceReconstruction", "HOODESolver", "EcoSISTEM", "MPIMapReduce"], vs = ":master")

@simeonschaub simeonschaub added this to the 1.8 milestone Oct 18, 2021
@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - no new issues were detected. A full report can be found here.

@simeonschaub
Copy link
Member Author

@nanosoldier runtests(["JacobiDavidson"], vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@simeonschaub
Copy link
Member Author

Yeah, doesn't seem like there are any significant outliers, so I will mark this for triage.

@simeonschaub simeonschaub added triage This should be discussed on a triage call and removed needs nanosoldier run This PR should have benchmarks run on it needs pkgeval Tests for all registered packages should be run with this change labels Oct 18, 2021
base/Base.jl Outdated
# for closures
function _typeof_captured_variable(Core.@nospecialize x)
if x isa DataType
if x.layout === C_NULL
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we able to optimize this out? And what is the reason for widening only this specific kind of abstract type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently not. This is from my #35970 but I don't remember why I checked it. It could be just a silly artifact of my blind try-and-error back then.

For inference, it looks like

_typeof_captured_variable(Core.@nospecialize x) =
    isconcretetype(x) ? Core.Typeof(x) : DataType

works. But I wonder if we can just do

const _typeof_captured_variable = Core.Typeof

?

@simeonschaub
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@simeonschaub
Copy link
Member Author

simeonschaub commented Sep 8, 2024

Benchmarks look good as far as I can tell. Tagging for triage to give the final green light considering some of the concerns voiced above. I do believe we've sufficiently demonstrated that this would be a net win and otherwise this is ready to merge.

@simeonschaub simeonschaub added the triage This should be discussed on a triage call label Sep 8, 2024
@oscardssmith
Copy link
Member

triage assigned a build time and sysimage size check to Jeff, but merge after that.

@oscardssmith oscardssmith removed triage This should be discussed on a triage call feature Indicates new feature / enhancement requests labels Sep 12, 2024
@JeffBezanson
Copy link
Sponsor Member

I don't see any meaningful changes in the numbers there, so probably good.

@simeonschaub simeonschaub added the merge me PR is reviewed. Merge when all tests are passing label Sep 17, 2024
rt = Const(has_free_typevars(tv) ? typeof(tv) : Core.Typeof(tv))
elseif isType(t)
tv = t.parameters[1]
rt = Const(has_free_typevars(tv) ? typeof(tv) : Core.Typeof(tv))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems mostly likely incorrect changing an uncertain value to a guaranteed one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean just the isType branch? Or is there an issue with this optimization as a whole?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump. @vtjnash What would a correct version of this look like?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just delete it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try that again. I remember there being cases where the call wouldn't be eliminated otherwise, but that was also a while ago so things might have changed

@vtjnash vtjnash removed the merge me PR is reviewed. Merge when all tests are passing label Sep 17, 2024
@simeonschaub
Copy link
Member Author

Ok, this seemed to work, all tests still pass and I've confirmed locally that the call is still eliminated in all cases I care about. Good to merge?

base/compiler/abstractinterpretation.jl Outdated Show resolved Hide resolved
test/core.jl Show resolved Hide resolved
@aviatesk aviatesk force-pushed the sds/tkf/type-capturing branch 2 times, most recently from e6eef3c to 1b749b9 Compare October 10, 2024 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

type inference problem with captured type in closures
10 participants