Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile the dart_dev run script for better performance #363

Closed
wants to merge 1 commit into from

Conversation

evanweible-wf
Copy link
Contributor

@evanweible-wf evanweible-wf commented Nov 16, 2021

Running a dart_dev command involves several subprocesses. The main reason for this is because we wanted to support rich package configuration via a .dart file that can do whatever is necessary and possible, rather than just a static config file in something like yaml. To make this work with dart run dart_dev, however, requires generating another Dart program that imports this config file from the current package and uses it when running dart_dev, which we then run via a subprocess.

The execution flow looks like this:

Screen Shot 2021-11-20 at 11 04 23 AM

With the exception that currently we don't perform the second precompilation step, meaning that every time you run dart_dev, you have to pay the cost of Dart compiling that generated run script. In some informal benchmarking, that step takes about 5 seconds.

This PR introduces logic that will compile this run script and store the result in .dart_tool/dart_dev/ so that subsequent runs can use it directly. Running precompiled Dart executables is very fast (~0.5s for kernel snapshots and ~0.02s for native executables), so once the compilation step is done, subsequent runs save almost all of the aforementioned 5 seconds.

The compiled result is cached along with a digest of its inputs. If any of these inputs change, the script will be recompiled:

  • .dart_tool/package_config.json (pub get resolution + the current Dart SDK)
  • tool/dart_dev/config.dart
  • tool/**.dart (if the above config file has any relative imports)

Additionally, this compilation step will be skipped if:

  • tool/dart_dev/config.dart imports from any package that is considered mutable

Benchmarking

Here's some more informal benchmarking to demonstrate how this impacts first runs and subsequent (cached) runs:

dart run dart_dev (first run)

SDK Before After Delta
2.13.4 ~15.6s ~21s +5.4s
2.14.4 ~12.6s ~18s +5.4s
2.15.0-268.18.beta ~11.7s ~15.5s +3.8s

dart run dart_dev (cached run)

SDK Before After Delta
2.13.4 ~6.7s ~1.5s -5.2s
2.14.4 ~7s ~2s -5s
2.15.0-268.18.beta ~6.5s ~1.3s -5.2s

To summarize, the first run takes a perf hit of about 5 seconds, but as a result, subsequent runs have very little overhead and save that same 5 seconds every time.


This PR also adds a clean subcommand to dart run dart_dev that will remove the files in .dart_tool/dart_dev/, in case a bug is found in our compilation caching logic.

After compilation, the startup time for subsequent `dart run dart_dev`
commands should be significantly improved.

This compiled exectuable is cached in `.dart_tool/dart_dev/` along with
a digest that is used to determine when recompilation is needed.

Packages with `tool/dart_dev/config.dart` files that import from their
own package's source files will be opted out of this compilation
optimization because there is no efficient way to track those imports
for the purpose of detecting the need to recompile.

This commit also adds a `clean` subcommand to `dart run dart_dev`
that will remove the files in `.dart_tool/dart_dev/`, in case a bug is
found in our compilation caching logic.
@aviary3-wk
Copy link

Security Insights

No security relevant content was detected by automated scans.

Action Items

  • Review PR for security impact; comment "security review required" if needed or unsure
  • Verify aviary.yaml coverage of security relevant code

Questions or Comments? Reach out on Slack: #support-infosec.

Copy link
Contributor

@greglittlefield-wf greglittlefield-wf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a couple things, this looks awesome though!

import 'utils/version.dart';

// import 'package:completion/completion.dart' as completion;

class DartDevRunner extends CommandRunner<int> {
DartDevRunner(Map<String, DevTool> commands)
: super('dart_dev', 'Dart tool runner.') {
// For backwards-compatibility, only add the `clean` command if it doesn't
// conflict with any configured command.
commands.putIfAbsent('clean', () => CleanTool());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a read-only map (const or otherwise), this will fail; can we make a copy of the map before adding this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you could write it as a literal with a spread (which I guess is another way of making a copy)
commands = {...commands, 'clean': () => CleanTool()};

Platform.executable, [_runScriptPath, ...args],
processArgs.first,
[
if (processArgs.length > 1) ...processArgs.sublist(1),
Copy link
Contributor

@greglittlefield-wf greglittlefield-wf Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit This check is unnecessary, since for a length of n, .sublist(n) will return an empty list:

Suggested change
if (processArgs.length > 1) ...processArgs.sublist(1),
...processArgs.sublist(1),

if (_config.existsSync()) {
final contents = _config.readAsStringSync();
configHasRelativeImports =
RegExp(r'''^import ['"][^:]+''').hasMatch(contents);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this regex is providing the test we want here.

  1. package: imports will match this regex, with the match (group(0)) being import 'package
  2. This only checks for imports on the first line of the file due to the use of ^ without the multiLine: true

return;
}

final digest = md5.convert([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit it might be helpful to log which files are included in the digest, for debugging purposes.

Or at the very least, log the value of configHasRelativeImports.

for (final file in Glob('tool/**.dart', recursive: true)
.listSync()
.whereType<File>()
.where((f) => f.path != _configPath))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be canonicalizing these paths before comparing them? I'm not sure whether Glob makes any guarantees about the path formats in the returned files.

_runExecutableDigest.deleteSync();
}

final args = ['compile', 'exe', _runScriptPath, '-o', _runExecutablePath];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Pub likely uses the kernel compile command; https://github.com/dart-lang/pub/blob/400f21e9883ce6555b66d3ef82f0b732ba9b9fc8/lib/src/dart.dart#L149-L150

I wonder why that's the one they chose...

Is there a reason you went with exe over any of other options? If so, could you add a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wasn't a particular reason I chose exe, no, and now that I think about it, kernel is probably the better choice because we don't need this to be portable to systems that don't have the Dart SDK installed (we can run dart .dart_tool/dart_dev/run.dill). Maybe that will save some time on the compilation since (if I understand it correctly) it wouldn't need to include the SDK in the output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I did some benchmarking:

dart compile

How long it takes to compile the generated run.dart script. This is the "hit" users would take on first usage for the sake of faster subsequent, cached runs.

SDK Command Time
2.13.4 dart compile exe ~10.5s
2.13.4 dart compile kernel ~9.8s
2.14.4 dart compile exe ~11.5s
2.14.4 dart compile kernel ~10.5s
2.15.0-268.18.beta dart compile exe ~11.5s
2.15.0-268.18.beta dart compile kernel ~10.7s
Note: dart_dev is not yet null-safe. Once migrated, we assume compilation time will improve, but probably not significantly.

Running compiled Dart program

How long it takes to run the output of dart compile. In particular, we want to know if it's faster to run an exe or a dill. These times plus the compilation times will help determine which option is best.

SDK Compiled Type Time
2.13.4 exe: ./dart_dev ~0.02s
2.13.4 kernel: dart dart_dev.dill ~0.5s
2.13.4 kernel: dart run dart_dev.dill ~0.8s
2.14.4 exe: ./dart_dev ~0.02s
2.14.4 kernel: dart dart_dev.dill ~0.5s
2.14.4 kernel: dart run dart_dev.dill ~0.8s
2.15.0-268.18.beta exe: ./dart_dev ~0.02s
2.15.0-268.18.beta kernel: dart dart_dev.dill ~0.5s
2.15.0-268.18.beta kernel: dart run dart_dev.dill ~0.8s
Note: there is a perf hit (~0.9s) when running a Dart exe for the first time.

The tl;dr: here is that compiling a kernel dill is about 0.8-1s faster than a native executable, but running that dill takes about 0.5s every time compared to 0.02s for native executables (excluding first run). Based on that, I'm inclined to stick with dart compile exe because the first run is already going to be slow; might as well optimize for the best case to get the most speedup possible on every subsequent run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that was a lot of data collection—so thorough! 💯💯💯

exe sounds good to me!

}

final digest = md5.convert([
..._packageConfig.readAsBytesSync(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought about something... path-dependency packages can change without updating package_config.dart...

Depending on those within the tool directory should be fairly uncommon, but it could potentially cause issues if you're, say, developing a package that's being package-imported in the tool/ directory, and are testing it out via a path dependency override...

I'm not sure how much we should worry about that case though, or what we'd be able to do about it without making this invalidation logic significantly more complex. 😕 What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.. I'm thinking it may be worth parsing the package_config.json and using it to lookup the type of dependency for each package: import in the dart_dev/config.dart file. If any of them are mutable, we skip the compilation & caching. This would also cover the same-package import scenario, which is nice because we're currently handling that separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh good idea!

I think you might need the pubspec.lock to look up the dependency type, though, since package_config.json only contains paths to the packages being depended on and not the dependency type.

// existing script is outdated.
final runScriptContents = buildDartDevRunScriptContents();
if (!_runScript.existsSync() ||
_runScript.readAsStringSync() != runScriptContents) {
logTimedSync(log, 'Generating run script', () {
createCacheDir();
_runScript.writeAsStringSync(buildDartDevRunScriptContents());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you could use runScriptContents here rather than calling buildDartDevRunScriptContents() again.

Suggested change
_runScript.writeAsStringSync(buildDartDevRunScriptContents());
_runScript.writeAsStringSync(runScriptContents);

final currentPackageName =
Pubspec.parse(File('pubspec.yaml').readAsStringSync()).name;
configHasSamePackageImports =
RegExp('''import ['"]package:$currentPackageName''')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe include a trailing slash in the regex? As it is if currentPackageName is foo I think this would pick up both of these imports:

import 'package:foo/bar.dart';
import 'package:foo_bar/fizz_buzz.dart';

I assume we wouldn't want to match the second of these imports though.

Suggested change
RegExp('''import ['"]package:$currentPackageName''')
RegExp('''import ['"]package:$currentPackageName\/''')

@alexandercampbell-wk
Copy link
Contributor

I've revived this PR here with merge conflicts resolved and most comments addressed: #426
Because of adjustments to our CI and the number of intervening changes to the tool, it seemed cleaner to open a fresh PR.
I'll close this PR for now— if it's preferable to move that commit into this PR and reopen, I'd be happy to make that change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants