Improve completion time for large code bases #48788

srawlins · 2022-04-11T22:31:01Z

Large code bases appear to strongly negatively affect code completion time. To be more precise, code completion in libraries with a large namespace, or a large potential namespace (suggesting things that can then be auto-imported) may be much slower than smaller code, even when accounting for the "new protocol" being added to IntelliJ, and which may be used in LSC clients, like VS Code, already.

This task is to measure it, improve it, and decide on a good exit criteria. There are, of course, always ways to improve performance; for this task we must choose a meaningful exit criteria so as to give the task a meaningful priority.

srawlins · 2022-04-12T17:46:11Z

Some motivating user-filed bugs relating to this:

srawlins · 2022-04-12T17:46:29Z

I profiled flutter/gallery using the completion_metrics.dart script, and got the following median, p90, and p95 metrics for completions:

                   p50  p90  p95
ms per completion   27   35   40

Those numbers look great.

So, we need more investigation to find if there are reproducible situations which lead to much higher completion times. Perhaps more symbols. Perhaps add some 'changedFile' events preceding the complete request. Perhaps Windows-specific, where stdin/stdout might be slower...

This adds a section to the output like this: ``` ### Percentile metrics p50 p90 p95 count > 2s ms per completion 4 6 7 0 ``` Bug: #48788 Change-Id: I868580324b3bc83605aa6466ffd5799625d1a9f3 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/240941 Commit-Queue: Samuel Rawlins <[email protected]> Reviewed-by: Brian Wilkerson <[email protected]>

srawlins · 2022-04-20T18:32:32Z

When I look at flutter_gallery, I see these as the slowest completions, when erasing a whole identifier/keyword and complete where it was:

874 ms
/Users/srawlins/code/flutter-gallery/lib/pages/home.dart:738:3, completing PageController

class _CarouselState extends State<_Carousel>
    with RestorationMixin, SingleTickerProviderStateMixin {
  ^? _controller;

859 ms
/Users/srawlins/code/flutter-gallery/lib/pages/home.dart:330:3, completing Timer

class _AnimatedHomePageState extends State<_AnimatedHomePage>
    with RestorationMixin, SingleTickerProviderStateMixin {
  late AnimationController _animationController;
  ^? _launchTimer;

783 ms
/Users/srawlins/code/flutter-gallery/lib/pages/demo.dart:95:3, completing bool

class _GalleryDemoPageState extends State<GalleryDemoPage>
    with RestorationMixin, TickerProviderStateMixin {
  final RestorableInt _demoStateIndex = RestorableInt(_DemoState.normal.index);
  final RestorableInt _configIndex = RestorableInt(0);

  ^? _isDesktop;

608 ms
/Users/srawlins/code/flutter-gallery/lib/pages/settings.dart:40:3, completing _ExpandableSetting
```
class _SettingsPageState extends State<SettingsPage> {
  ^? _expandedSettingId;
```
504 ms
/Users/srawlins/code/flutter-gallery/lib/studies/rally/login.dart:288:3, completing BoxDecoration
```
class _ThumbButtonState extends State<_ThumbButton> {
  ^? borderDecoration;
```

371 ms
/Users/srawlins/code/flutter-gallery/lib/demos/material/data_table_demo.dart:68:3, completing _DessertDataSource

class _DataTableDemoState extends State<DataTableDemo> with RestorationMixin {
  final _RestorableDessertSelections _dessertSelections =
      _RestorableDessertSelections();
  final RestorableInt _rowIndex = RestorableInt(0);
  final RestorableInt _rowsPerPage =
      RestorableInt(PaginatedDataTable.defaultRowsPerPage);
  final RestorableBool _sortAscending = RestorableBool(true);
  final RestorableIntN _sortColumnIndex = RestorableIntN(null);
  ^? _dessertsDataSource;

293 ms
/Users/srawlins/code/flutter-gallery/lib/feature_discovery/feature_discovery.dart:138:3, completing OverlayEntry

class _FeatureDiscoveryState extends State<FeatureDiscovery>
    with TickerProviderStateMixin {
  bool showOverlay = false;
  FeatureDiscoveryStatus status = FeatureDiscoveryStatus.closed;

  late AnimationController openController;
  late AnimationController rippleController;
  late AnimationController tapController;
  late AnimationController dismissController;

  late Animations animations;
  ^? overlay;

227 ms
/Users/srawlins/code/flutter-gallery/lib/feature_discovery/feature_discovery.dart:138:3, completing Matrix4

class _TransformationsDemoState extends State<TransformationsDemo>
    with TickerProviderStateMixin {
  final GlobalKey _targetKey = GlobalKey();
  // The radius of a hexagon tile in pixels.
  static const _kHexagonRadius = 16.0;
  // The margin between hexagons.
  static const _kHexagonMargin = 1.0;
  // The radius of the entire board in hexagons, not including the center.
  static const _kBoardRadius = 8;

  Board _board = Board(
    boardRadius: _kBoardRadius,
    hexagonRadius: _kHexagonRadius,
    hexagonMargin: _kHexagonMargin,
  );

  final TransformationController _transformationController =
      TransformationController();
  Animation<Matrix4>? _animationReset;
  late AnimationController _controllerReset;
  ^? _homeMatrix;

207 ms
/Users/srawlins/code/flutter-gallery/lib/demos/material/tabs_demo.dart:37:3, completing TabController

class __TabsScrollableDemoState extends State<_TabsScrollableDemo>
    with SingleTickerProviderStateMixin, RestorationMixin {
  ^? _tabController;

207 ms
/Users/srawlins/code/flutter-gallery/lib/data/gallery_options.dart:231:3, completing Timer

class _ModelBindingState extends State<ModelBinding> {
  late GalleryOptions currentModel;
  ^? _timeDilationTimer;

Notes:

These 10 slowest completions were the only 10 which took more than 200 ms. In this data set, with 33k completions, the median time was 27 ms; the p90 time was 40 ms; the p95 time was 45 ms.
Some of these may be artifacts of the completion_metrics.dart script; e.g. is there slow performance early in a file, when there is perhaps less stuff hot in the cache?
Aside from any artifacts of the script, there is a clear signal: every slowest completion is at the front of a line inside a class. They are all at the front of a field, without an initializer, without any modifiers (like final), and with a nullable type. Curious that none are at the front of a method, setter, getter? Curious that none are at the front of a field with an initializer?

bwilkerson · 2022-04-21T18:35:55Z

... is there slow performance early in a file, when there is perhaps less stuff hot in the cache?

I wouldn't expect that to be the case because if we're removing part of the file before completing then we're having to analyze the whole file before every completion. But I'm not an expert in performance questions.

... every slowest completion is at the front of a line inside a class.

That's definitely interesting. Can we tell what the performance is if completion is happening between complete members? I'm wondering whether it's because we're at the level of a class member or whether the AST we're seeing after recovering from the error is causing the problem. If it's the former, then that's an opportunity; if it's the latter, then I don't expect that users are in this situation very often.

Curious that none are at the front of a method, setter, getter?

The return type is optional for methods, setters, and getters, so no recovery is required. (Although I would expect recovery if the type were nullable and the question mark isn't being deleted.)

Curious that none are at the front of a field with an initializer?

That case might well recover better because the presence of the initializer seems like a clear signal that we have a field declaration.

srawlins · 2022-04-21T21:25:46Z

These were done with the default overlay, none, so no overlays are created, and the code should always be error-free.

srawlins · 2022-04-22T06:41:25Z

When I use the --overlay=remove-rest-of-file option, the completions get noticeably slower for the slowest: p95 is still just 39ms which is great. But there are now over 100 completions which take longer than 483 ms. The top 10 are:

2586 ms to complete at lib/demos/reference/motion_demo_fade_scale_transition.dart:12:58
1623 ms to complete at lib/demos/reference/transformations_demo_board.dart:119:8
1457 ms to complete at lib/demos/reference/colors_demo.dart:110:13
1324 ms to complete at lib/demos/reference/motion_demo_fade_scale_transition.dart:49:12
1253 ms to complete at lib/demos/reference/motion_demo_container_transition.dart:409:14
1253 ms to complete at lib/demos/reference/motion_demo_shared_x_axis_transition.dart:51:13
1086 ms to complete at lib/demos/reference/typography_demo.dart:122:30
1074 ms to compute at lib/demos/reference/motion_demo_fade_scale_transition.dart:12:9
963 ms to compute at lib/demos/reference/motion_demo_shared_x_axis_transition.dart:52:22
907 ms to compute at lib/demos/reference/motion_demo_fade_scale_transition.dart:47:13

gabrielgarciagava · 2023-05-23T07:53:57Z

I am experiencing very high latency in our code base. Things like >60000 ms. There is no task (being reported in the Analysis Server Diagnostics) that is taking this 60000ms, so it is something happening under the hood that I cannot see through the diagnostics page.

It is happening very often, but it is specially bad when we add or remove a dependency in a deeper level of the contexts being analyzed, for example.

The code completion time is also affected by slowness (not latency this time).

Let me know what kind of information I can provide to help investigating that.

srawlins · 2023-05-23T14:11:32Z

Great data, @gabrielgarciagava ! Could you also report:

the number of Contexts you have (from the Contexts page, undocumented right now 😞 )
whether your are running any Dart analyzer plugins (the primary one is Dart Code Metrics) (from the Plugins page, also undocumented
how much memory the DAS process (some dart process) is using (e.g. from Activity Monitor or Process Explorer or ps...)
your OS
your Dart version
Are you using VS Code or IntelliJ or...
The number of .dart files in your workspace (wherever the IDE window is rooted), something like find . -name "*.dart" | wc -l should do the trick.

mraleph · 2023-05-23T14:15:26Z

@gabrielgarciagava additionally to what @srawlins have said. If you are running on new enough analyzer (e.g. Flutter master will certainly have it) then there should be "Collect Report" option at the top of the "Analysis Server Diagnostics" page.

incendial · 2023-05-23T15:54:33Z

how much memory the DAS process

Shouldn't dart info provide this info instead of relying on Activity Monitor?

srawlins · 2023-05-23T16:08:35Z

TIL 🎉 I love it. Thanks @incendial

gabrielgarciagava · 2023-05-24T15:23:35Z

the number of Contexts you have (from the Contexts page, undocumented right now 😞 )

134

whether your are running any Dart analyzer plugins (the primary one is Dart Code Metrics) (from the Plugins page, also undocumented

No plugins.

how much memory the DAS process (some dart process) is using (e.g. from Activity Monitor or Process Explorer or ps...)

your OS

your Dart version

Output of dart info with all those details

If providing this information as part of reporting a bug, please review the information
below to ensure it only contains things you're comfortable posting publicly.

#### General info

- Dart 3.0.1 (stable) (Tue May 16 11:57:19 2023 +0000) on "macos_arm64"
- on macos / Version 13.3.1 (Build 22E261)
- locale is en-NL

#### Project info

- sdk constraint: '>=2.12.0 <3.0.0'
- dependencies: flutter
- dev_dependencies: 

#### Process info

|  Memory |    CPU | Elapsed time | Command line                                                                               |
| ------: | -----: | -----------: | ------------------------------------------------------------------------------------------ |
|   72 MB |   0.0% |        08:20 | dart devtools --machine --try-ports 10 --allow-embedding                                   |
|   74 MB |   0.0% |        07:07 | dart devtools --machine --try-ports 10 --allow-embedding                                   |
| 8527 MB | 142.2% |        08:20 | dart language-server --protocol=lsp --client-id=VS-Code --client-version=3.64.0 --instrumentation-log-file=<path>/analyzerInstrumentationLogFile.txt |
|  389 MB |   0.0% |        07:07 | dart language-server --protocol=lsp --client-id=VS-Code --client-version=3.64.0 --instrumentation-log-file=<path>/analyzerInstrumentationLogFile.txt |
|  180 MB |  55.8% |        00:04 | dart pub outdated --show-all --json                                                        |
|   96 MB |   0.0% |        08:20 | flutter_tools.snapshot daemon                                                              |
|   91 MB |   0.1% |        07:07 | flutter_tools.snapshot daemon                                                              |

Are you using VS Code or IntelliJ or...

VSCode

The number of .dart files in your workspace (wherever the IDE window is rooted), something like find . -name "*.dart" | wc -l should do the trick.

7601

gabrielgarciagava · 2023-05-24T15:28:01Z

@gabrielgarciagava additionally to what @srawlins have said. If you are running on new enough analyzer (e.g. Flutter master will certainly have it) then there should be "Collect Report" option at the top of the "Analysis Server Diagnostics" page.

I am afraid I cannot handle this file. At least not before checking with the legal parts of the company. I see there are package names there, so I need to take care what can be shared and what cannot.
Let me know if there is a specific section of the file that is specially useful for you, then I can edit it accordingly.

srawlins · 2023-05-24T17:00:57Z

CC @bwilkerson @scheglov I find this data very interesting:

18 seconds to perform a completion
134 analysis contexts (😮 )
only 7600 files
only 8.5GB memory used (sorry for using the word "only", but for 134 contexts, this seems pretty low; even with only 7600 local source files, I would imagine that 134 copies of the Flutter SDK and the Dart SDK would really slam that memory higher)

If, broadly speaking, we find that multiple contexts can linearly slow down completion (and other responses, as seen above), that would also boost the priority of improving the multi-context situation.

gabrielgarciagava · 2023-05-24T17:22:09Z

Sobre extra context, in case it helps.

For the 8.5GB, I had started the IDE and added a new dependency in one of the contexts that impacts a lot of other contexts. No more than that. Just saying that the IDE was not running for so long, not sure if it has any impact in the memory. Well, you can see that in the elapsed time.

Second, for the high number of contexts, we do divide our code in a lot of small packages within the same repository.

mraleph · 2023-05-25T08:50:32Z

I am afraid I cannot handle this file. At least not before checking with the legal parts of the company. I see there are package names there, so I need to take care what can be shared and what cannot. Let me know if there is a specific section of the file that is specially useful for you, then I can edit it accordingly.

Ah, it was an oversight to include context names which might be project specific. You can scrape this information by running the following script (it will also scrape lint and plugin names for good measure):

import 'dart:convert';
import 'dart:io';

// Usage: dart scrape.dart report.json
void main(List<String> args) {
  final data = jsonDecode(File(args.first).readAsStringSync());
  for (var entry in data['contexts']) {
    entry['name'] = '@@@';
    entry['lints'] = entry['lints'].length;
    entry['plugins'] = entry['plugins'].length;
  }
  File(args.first).writeAsStringSync(JsonEncoder.withIndent('  ').convert(data));
}

The rest of the report just contains information about the analyser state itself and should not contain any user data.

gabrielgarciagava · 2023-05-30T08:19:24Z

@mraleph Thanks for the script. Looks good to me now.

Here it is:

analyzer_report.json.zip

jensjoha · 2023-05-30T09:39:01Z

Thanks!

Initial glance over the report:

Current RSS (memory usage): ~2.7GB
Max RSS: ~8.8GB
134 contexts
12,966 uniquely known files
347,043 combined "known files" in the 134 contexts
347,043 FileState objects. (Good --- it fits with the number above :))

It seems to me that in this case at least, the memory usage has successfully gone down again.

These numbers seem rather similar to what I see when loading https://github.com/kodecocodes/rwf-materials.git from cache.

gabrielgarciagava · 2023-06-01T21:38:47Z

Thanks for checking the data. Please let me know if I can help providing more data to you.

Those slow times are being a hassle for the day to day work, but I'm also not sure if this issue has high priority for you guys or not. It looks like not so many people are currently affected by this problem, right?

jensjoha · 2023-06-15T07:24:03Z

An update on the report: it has 3,277,209 _Entry objects which seems similar to what we saw in #52447 (comment).
Here the uptime is only ~15 minutes though, so how it has accumulated that many is a good question.
Have you (or some script etc) continually been rewriting pubspec.yaml, analysis_options.yaml, BUILD or .dart_tool/package_config.json files?

gabrielgarciagava · 2023-06-15T08:24:46Z

We don't have anything running continually like you said.

However, we do have one script that runs pub get on a multiple packages. For this 15min experiment I did run it once.
To give you extra context, what the script does is propagating pubspeck.lock changes to packages that are dependent of it. Then it runs pub get for each of the affected packages.

In this experiment, I did modify the pubspec of one deep down dependency. I think around 50 packages were updated and thus had run pub get.

jensjoha · 2023-06-15T08:29:18Z

That's probably enough --- if 50 packages were updated with enough time between each update (say a few seconds which doesn't seem unlikely) that would probably do it.

gabrielgarciagava · 2023-06-15T09:53:27Z

Cool. Should I try to create a "small" project sample with multiple packages that uses this kind of script to share with you?

jensjoha · 2023-06-15T10:00:42Z

I don't think that's currently necessary, thanks though.
That particular leak was fixed in 5aee0f6 so hopefully it'll go away once that makes it into a release.

gabrielgarciagava · 2023-06-15T10:08:44Z

Does this leak affect performance as well?

jensjoha · 2023-06-15T10:28:58Z

Unfortunately I don't think that's the cause of the 60+ seconds latency you reported above, no.

jensjoha · 2023-06-15T10:29:51Z

(though, with a big enough leak we might get in trouble with the amount of memory available on the system, leading to swapping etc which certainly would impact performance).

gabrielgarciagava · 2023-06-15T11:05:59Z

Ok. I was wondering if all those _Entry could be part of some loops, or maybe notifying things all the time. That could affect performance.

In any case, after your fix is released, I will do another run with the analyzer diagnostics and post it here.

…fterContextsCreated` is called If again and again editing, say, a `pubspec.yaml` with proper timing there is a "temporary leak" that repairs itself once the analyzer finishes (some time after the editing stops). What happens is that old contexts are saved in the `declarationsTracker` but eventually cleared in the `afterContextsCreated` call. In a test on Windows (on Linux I'd currently run into https://dartbug.com/52703) I opened the entire "pkg" folder and edited a `pubspec.yaml` file every 5 seconds. The analyzer went from using something along the lines of 700MB of heap to using around 2.5 GB of heap after 25 edits and 17GB (!) of heap shortly before stopping at 250 `pubspec.yaml` edits. After the editing stopped (and clicking the GC button in observatory) the heap usage went down to ~650MB heap used. This fixes the problem by clearing the `declarationsTracker` on the `afterContextsDestroyed` call too. In the same test it stays at around 300-700MB of heap. Possibly related to: #48788 #52447 Change-Id: Ia38cc946a1f36fa8c5b6804f79cbc8dd96c21030 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/309722 Reviewed-by: Brian Wilkerson <[email protected]> Reviewed-by: Konstantin Shcheglov <[email protected]> Commit-Queue: Jens Johansen <[email protected]>

mraleph · 2024-10-03T08:36:28Z

@gabrielgarciagava what is your current experience with analyzer latency on your code base?

gabrielgarciagava · 2024-10-03T09:00:00Z

Still quite bad. Codegens, changing branch, updating dependencies, all of these degrades de performance for the next couple of minutes, for example.

I am following a bit the progress of "workspaces". I even gave it a try on the master branch 1 or 2 weeks ago.
It is hard to say for sure if it improved because I did had to do some quick tweaks to be able to run, and I also did not spend much time experimenting, but it sounds promising.

I did some codegens (which is something that currently degrades a lot the performance for a couple of minutes), and the analyzer seems to respond very quickly after that.

mraleph · 2024-10-15T11:05:46Z

@gabrielgarciagava how big is your code base? do you use code generation? how much generated code is there? (and what type of code generators do you use?)

gabrielgarciagava · 2024-10-16T22:05:58Z

We have something around 120 packages. I have no idea about the LOC.

We do use code generation quite a lot, like json_serializable and functional_data (which generates ==, hashCode, copyWith, etc). I also don't know the number of generated lines of code.

What type of data exactly are you looking for? I can try to calculate it.

srawlins self-assigned this Apr 11, 2022

AndryHTC mentioned this issue Oct 3, 2024

Dart Analysis Poor Performance - Group Issue #56838

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve completion time for large code bases #48788

Improve completion time for large code bases #48788

srawlins commented Apr 11, 2022

srawlins commented Apr 12, 2022

srawlins commented Apr 12, 2022

srawlins commented Apr 20, 2022 •

edited

Loading

bwilkerson commented Apr 21, 2022

srawlins commented Apr 21, 2022

srawlins commented Apr 22, 2022

gabrielgarciagava commented May 23, 2023 •

edited

Loading

srawlins commented May 23, 2023

mraleph commented May 23, 2023

incendial commented May 23, 2023

srawlins commented May 23, 2023 •

edited

Loading

gabrielgarciagava commented May 24, 2023

gabrielgarciagava commented May 24, 2023

srawlins commented May 24, 2023

gabrielgarciagava commented May 24, 2023 •

edited

Loading

mraleph commented May 25, 2023

gabrielgarciagava commented May 30, 2023

jensjoha commented May 30, 2023

gabrielgarciagava commented Jun 1, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023 •

edited

Loading

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

jensjoha commented Jun 15, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

mraleph commented Oct 3, 2024

gabrielgarciagava commented Oct 3, 2024

mraleph commented Oct 15, 2024

gabrielgarciagava commented Oct 16, 2024

Improve completion time for large code bases #48788

Improve completion time for large code bases #48788

Comments

srawlins commented Apr 11, 2022

srawlins commented Apr 12, 2022

srawlins commented Apr 12, 2022

srawlins commented Apr 20, 2022 • edited Loading

bwilkerson commented Apr 21, 2022

srawlins commented Apr 21, 2022

srawlins commented Apr 22, 2022

gabrielgarciagava commented May 23, 2023 • edited Loading

srawlins commented May 23, 2023

mraleph commented May 23, 2023

incendial commented May 23, 2023

srawlins commented May 23, 2023 • edited Loading

gabrielgarciagava commented May 24, 2023

gabrielgarciagava commented May 24, 2023

srawlins commented May 24, 2023

gabrielgarciagava commented May 24, 2023 • edited Loading

mraleph commented May 25, 2023

gabrielgarciagava commented May 30, 2023

jensjoha commented May 30, 2023

gabrielgarciagava commented Jun 1, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023 • edited Loading

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

jensjoha commented Jun 15, 2023

jensjoha commented Jun 15, 2023

gabrielgarciagava commented Jun 15, 2023

mraleph commented Oct 3, 2024

gabrielgarciagava commented Oct 3, 2024

mraleph commented Oct 15, 2024

gabrielgarciagava commented Oct 16, 2024

srawlins commented Apr 20, 2022 •

edited

Loading

gabrielgarciagava commented May 23, 2023 •

edited

Loading

srawlins commented May 23, 2023 •

edited

Loading

gabrielgarciagava commented May 24, 2023 •

edited

Loading

gabrielgarciagava commented Jun 15, 2023 •

edited

Loading