mkeskells edited this page Aug 1, 2017 · 17 revisions

scalac_perf

This project focuses on performance improvements to the scalac compiler, whether in the compiler itself, in the libraries it calls, or in tidying up the code: really anything that makes running scalac on a large project faster.

This wiki is intended to be dynamic: a first source for what is going on and for ideas that may provide some benefit. All ideas are welcome.

The hope is that the ideas will get investigated and turn into issues, branches and PRs to be pushed back to the main Scala repo. If you want to pick up an idea or work on it, mention this in the wiki so others can assist.

See https://github.com/rorygraves/scalac_perf/wiki/Getting-started-with-compiler-hacking for a walkthrough of getting set up on the compiler.


Work in progress

scalac

Linker

Status - in branch https://github.com/rorygraves/scalac_perf/tree/linker

This is aimed at producing and using a linker format for scalac, so scalac doesn't have to open and parse lots of class files in lots of jars. It should eventually enable overlapping compilation of dependent projects. More information is on the detail page LinkerInfo Linker

GenBcode parallelisation and optimisation

Status - in branch https://github.com/rorygraves/scalac_perf/tree/2.12.genBcodeParallel

GenBcode uses queues internally to move work between stages: AST -> ASM -> bytes -> files.

The idea is to process these queues on different threads. This is safe because, after the initial AST -> ASM step, there is no access to the AST and no interaction between the files. This may be less useful if Linker pays off, but it is still worth pursuing. File writing is the biggest overhead and could be parallelised. There are a few other minor optimisations that we can do at the same time.

GenBcode is a very expensive phase in compilation. Its cost may be reduced after https://github.com/rorygraves/scalac_perf/tree/arraySlice (no figures yet).
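As a rough illustration of running the post-AST stages on a thread pool, here is a minimal sketch. It is not the code in the branch: `AsmClass`, `ClassBytes`, `serialise` and `write` are hypothetical stand-ins for the real compiler types and stages, and a `Future` per class is used in place of the internal queues.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object GenBcodePipelineSketch {
  // Hypothetical stand-ins for the per-class data passed between stages.
  final case class AsmClass(name: String)
  final case class ClassBytes(name: String, bytes: Array[Byte])

  // Stand-in for ASM -> bytes: here we just encode the class name.
  def serialise(c: AsmClass): ClassBytes =
    ClassBytes(c.name, c.name.getBytes("UTF-8"))

  // Stand-in for bytes -> file: returns the file name and size instead of doing IO.
  def write(b: ClassBytes): (String, Int) =
    (b.name + ".class", b.bytes.length)

  // Each class flows through both stages on the pool, independently of the others.
  def runParallel(classes: List[AsmClass], threads: Int): List[(String, Int)] = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val futures = classes.map(c => Future(serialise(c)).map(write))
      Await.result(Future.sequence(futures), 30.seconds)
    } finally pool.shutdown()
  }
}
```

The sketch relies on the same property the text relies on: once a class has left the AST stage, nothing it does touches shared compiler state.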

LocalOpt and Settings

Status - in branch https://github.com/rorygraves/scalac_perf/tree/2.12.x_perRunImmutableSettings

LocalOpt does optimisation within a local scope, e.g. a method. It makes lots of calls like compilerSettings.optNone, and the opt* calls delegate to optEnabled in ScalaSettings. If this is run multiple times per method, it seems expensive to evaluate it with set logic each time, so maybe the values should be snapped to a constant for the compilation run. To extend that, maybe all of the compiler settings should be immutable. For example, look at ScalaSettings.optAddToBytecodeRepository, which does 18 set operations to evaluate. Maybe just change them to be lazy vals.

No data to back this up as yet - just code inspection

Changes will likely overlap/conflict with GenBcode parallelisation
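The "snap to a constant for the run" idea could look roughly like the sketch below. `MutableSettings`, `PerRunSettings` and the option strings are hypothetical and only loosely modelled on the set-based lookup described above; they are not the real ScalaSettings API.

```scala
object SettingsSnapshotSketch {
  // Hypothetical mutable settings: every query is a set lookup.
  final class MutableSettings {
    var enabledOpts: Set[String] = Set.empty
    var lookups: Int = 0 // counts set lookups, for illustration only
    def optEnabled(name: String): Boolean = {
      lookups += 1
      enabledOpts.contains(name)
    }
  }

  // Snapshot taken once per compilation run: each flag is evaluated
  // exactly once and is a plain field read afterwards.
  final class PerRunSettings(s: MutableSettings) {
    val optUnreachableCode: Boolean = s.optEnabled("unreachable-code")
    val optSimplifyJumps: Boolean = s.optEnabled("simplify-jumps")
    val optNone: Boolean = !optUnreachableCode && !optSimplifyJumps
  }
}
```

With `lazy val` instead of `val`, flags that are never queried would not be evaluated at all, at the cost of a small synchronisation check on access.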

Cleanup

This phase does some rework of Array.apply. Can we do something similar with other apply methods (e.g. List, Map, Set) for the simple cases (where we know the length and type of the parameters, etc.)?
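To make the question concrete, here is a sketch of the kind of equivalence such a rewrite would rely on for List. This is an illustration of the idea only; no such rewrite exists in the cleanup phase as far as this page states, and whether it would pay off is unmeasured.

```scala
object ListApplySketch {
  // Goes through the varargs List.apply, so the arguments are first
  // packed into a sequence and then copied into a List.
  val viaApply: List[Int] = List(1, 2, 3)

  // What a hypothetical rewrite could emit when the number of
  // arguments is known statically: direct cons cells, no varargs.
  val viaCons: List[Int] = 1 :: 2 :: 3 :: Nil
}
```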

Read files to parse in parallel

Library

There will be lots of these, so they are on a separate page: see LibraryWIP.

Other ideas awaiting proof/thought/love

library

These are on a separate page: see Library Ideas.

scalac

parallel IO and parse for source

Currently the source files are read one by one, then split into lines, and then parsed. We should be able to do some or all of that in parallel, and use NIO to help.
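A minimal sketch of the read-and-split part, assuming plain `java.nio.file` reads wrapped in Futures. `LoadedSource`, `load` and `loadAll` are hypothetical names, and the real compiler's source-file handling is more involved than this.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelSourceReadSketch {
  // Hypothetical result type: the file plus its content split into lines.
  final case class LoadedSource(path: Path, lines: Vector[String])

  // Read one file and split it into lines.
  def load(path: Path): LoadedSource = {
    val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
    LoadedSource(path, text.split("\n", -1).toVector)
  }

  // Read all files concurrently; results come back in the input order.
  def loadAll(paths: List[Path])(implicit ec: ExecutionContext): List[LoadedSource] =
    Await.result(Future.traverse(paths)(p => Future(load(p))), 1.minute)
}
```

Parsing could be added inside `load` only if the parser is safe to run on several threads, which this sketch does not assume.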

remove unused flags in scalac

stats collection in scalac

Stats collection does some work even when it is disabled.

scalac/sbt

overlapping compilation

Using Linker, we can get the symbols produced by a compilation before the compile finishes. This is effectively all of the symbol information that a dependent compilation needs, so if we could get sbt to start the next compilation before the previous one finished, we could reduce the critical path to somewhere between 25% and 40% of what it is now (based on some out-of-date timings). That could make the compilation process 2.5 to 4 times faster (1/0.40 = 2.5, 1/0.25 = 4).

intellij/sbt

straight to jar

Get IntelliJ and sbt to write straight to jars, so we don't have to generate lots of files and then write a jar when a jar is all we actually want. This makes the build faster. For a project that generates 10K files, a build typically deletes 10K files and the target jar, creates 10K files, then jars them. So just write the jar!

Some small interaction with GenBcode as this will need to cope better with incremental compilation
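Writing class bytes directly into a jar needs nothing beyond the JDK's `java.util.jar` API, as the sketch below shows. `writeJar` is a hypothetical helper, and the sketch ignores manifests and the incremental-compilation concerns mentioned above.

```scala
import java.nio.file.{Files, Path}
import java.util.jar.{JarEntry, JarOutputStream}

object StraightToJarSketch {
  // Write (entry name, class bytes) pairs straight into a jar,
  // without creating any intermediate .class files on disk.
  def writeJar(jar: Path, classes: Seq[(String, Array[Byte])]): Unit = {
    val out = new JarOutputStream(Files.newOutputStream(jar))
    try {
      classes.foreach { case (entryName, bytes) =>
        out.putNextEntry(new JarEntry(entryName))
        out.write(bytes)
        out.closeEntry()
      }
    } finally out.close()
  }
}
```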

source code

implicit def should have return type

This will happen soon in Scala, and it makes the implicit search simpler (as Martin Odersky explained). If we tidy our code up, compilation should be faster because we give the typer less work to do.
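For illustration, the change in source code is a one-token annotation per implicit def. `Millis`, `longToMillis` and `pause` are hypothetical names used only for this sketch.

```scala
import scala.language.implicitConversions

object ImplicitReturnTypeSketch {
  final case class Millis(value: Long)

  // Before: the result type is inferred, so the body has to be
  // type-checked before the implicit's type is known.
  // implicit def longToMillis(n: Long) = Millis(n)

  // After: the result type is declared up front.
  implicit def longToMillis(n: Long): Millis = Millis(n)

  def pause(m: Millis): Long = m.value

  // The conversion still applies exactly as before.
  val viaConversion: Long = pause(250L)
}
```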

avoid temporary collections

  • (list ++ list).map
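A small sketch of the pattern above and one possible alternative. Whether the iterator form is actually faster in a given spot would need measuring.

```scala
object TemporaryCollectionSketch {
  val xs: List[Int] = List(1, 2, 3)
  val ys: List[Int] = List(4, 5)

  // Builds a complete intermediate List for xs ++ ys, then maps it
  // into a second List.
  val eager: List[Int] = (xs ++ ys).map(_ * 2)

  // Iterators concatenate lazily, so only the final List is built.
  val viaIterator: List[Int] = (xs.iterator ++ ys.iterator).map(_ * 2).toList
}
```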

poor API usage (particularly usages that generate objects or burn CPU)

Either in the compiler or in a separate (lint-style) tool, create warnings for poor code usage. Maybe some of this can be done in the cleanup phase.

  • temporary collection generation when we only need iterators
  • .map{...}.flatten
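The two patterns listed above, next to the kind of replacement a lint warning might suggest. This is a sketch of the source-level rewrite only, not of any existing tool.

```scala
object PoorApiUsageSketch {
  val words: List[String] = List("ab", "cd")

  // map followed by flatten builds an intermediate List[List[Char]].
  val viaMapFlatten: List[Char] = words.map(_.toList).flatten

  // flatMap produces the same result in a single pass.
  val viaFlatMap: List[Char] = words.flatMap(_.toList)

  // A temporary List[Int] is built just to be summed...
  val totalEager: Int = words.map(_.length).sum

  // ...whereas an iterator yields the lengths without storing them.
  val totalViaIterator: Int = words.iterator.map(_.length).sum
}
```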

imports

Unused imports have a cost (we found a 2% cost on a random set of files). Probably a lot can be done with imports to make them faster. Single line is quicker. Can we share some processing of imports between files in the same package?

dead code

Better ways to detect and remove dead code, if you have whole-program knowledge.

make methods, vals and classes more private where possible

This makes sbt and incremental builds faster.

make methods final where possible

This should make the runtime faster.