Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite merlin-completion-at-point integration to be faster and better #1759

Merged
merged 2 commits into from
Sep 11, 2024

Conversation

catern
Copy link
Contributor

@catern catern commented May 6, 2024

Let me know how you'd like me to approach this PR. This is what we use internally at Jane Street, and it's been written to be relatively upstreamable. At the same time, the old version of merlin-cap was in need of a lot of improvement and modernization to how the Emacs completion API is supposed to be used, and essentially none of the original code survived. So it's kind of just a giant code drop, which I know is pretty hard to review.

The merlin-completion-at-point integration is completely rewritten. The version before was simple and slow; the new version is much faster and much more featureful:

  • Completions are requested for an entire module at once and filtered locally in Emacs rather than doing the filtering on the Merlin side.

  • Completions are cached based on the OCaml atom being completed, so they are re-used instead of re-requested when a new character is typed.

  • Those completions are cached for a given position inside an OCaml atom, so that if a user types Li TAB ma TAB to complete "List.map" and then decides they actually want the module "Labels", when they delete the "ist.map" part and hit TAB, they'll use the previously-requested completions.

  • We avoid updating Merlin with the new buffer contents as completion proceeds, so that Merlin doesn't need to re-parse and re-type-check, substantially improving performance in expensive
    files. (merlin-cap--omit-bounds)

  • Completion requests are handled asynchronously and reused, so that if completion is interrupted and then resumed, we're able to use the results of the previous completion request. This makes completion UIs which use while-no-input (like corfu-mode) much more performant.

  • Completion is wrapped in while-no-input when non-essential is set; this makes completion UIs which don't use while-no-input (like company-mode) much more responsive.

  • Completions are sorted more intelligently: if they're a constructor or variant or label, they're likely to be more relevant to the user, so they're sorted first.

  • Module names in completions are suffixed with a ., matching Emacs behavior for file name completion (where directories are suffixed with a /); this makes completion of module paths much more fluent, since there's no need to hit . after every module name.

  • We use completion boundaries, so the built-in Emacs partial-completion feature now works: if the user types "Li.ma TAB", it will complete to "List.map".

  • Likewise, partial-completion will expand * as a glob, so if the user types "Deferred.*.map TAB" they will be presented with every module in "Deferred." which contains the method "map"

There are also several tests now, testing the new functionality.

@sidkshatriya
Copy link
Contributor

Completions are requested for an entire module at once and filtered locally in Emacs rather than doing the filtering on the Merlin side.

Does this mean that ocaml-lsp would need to be modified to do this the same filtering ? (If the end user is using ocaml-lsp rather than Merlin + Merlin Emacs plugin )

@voodoos
Copy link
Collaborator

voodoos commented May 7, 2024

Thanks for doing the upstreaming @catern, it looks like it would be a major improvement for emacs users.

@sidkshatriya : it's very probable that the ocaml-lsp-server and the associated lsp clients already rely on such client-side filtering

@voodoos voodoos closed this May 7, 2024
@voodoos voodoos reopened this May 7, 2024
@voodoos
Copy link
Collaborator

voodoos commented May 7, 2024

I am going to ping recent contributors to the emacs mode, someone might be interested in taking part in the review process:
@bcc32, @erikmd, @monnier, @mattiase, @bbatsov, @Chris00

@catern I see that you have some tests in your changes. Would there be a way to have those test ran by some Dune's cram test ?

@xvw xvw requested review from xvw June 11, 2024 14:48
@xvw
Copy link
Collaborator

xvw commented Jun 11, 2024

I'll try it locally soon as possible!

@bbatsov
Copy link

bbatsov commented Jun 11, 2024

I'll take a closer look as well soon. There are quite a few changes, so reviewing them carefully will take some time.

@nojb
Copy link
Contributor

nojb commented Aug 29, 2024

Friendly ping. @xvw @bbatsov have you been able to take a look? This PR looked quite promising. It would be great if it could be pushed forward.

@bbatsov
Copy link

bbatsov commented Aug 29, 2024

Overall the PR looks reasonable to me, but I haven't gone in much depth given the massive amount of changes. I'd suggest that the tests be moved to a dedicated file and ran automatically by the CI, as inline tests are always a mess.

@xvw
Copy link
Collaborator

xvw commented Aug 29, 2024

I was not able to run it (via load-file) and I suspect that my Doom-Emacs configuration should generate some trouble.
For the extraction of the test @bbatsov, it seems that the code base on the emacs side does not have any test, do you have maybe some document about test infrastructure in Emacs? Thanks in advance!

@bbatsov
Copy link

bbatsov commented Aug 29, 2024

Not exactly documentation, but I do use a setup similar to https://github.com/bbatsov/projectile/blob/master/.github/workflows/test.yml in most of my Emacs projects. Works reasonably well for me.

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this with LexiFi's codebase; it seems to work well, thanks! The following two features are particularly useful:

We use completion boundaries, so the built-in Emacs partial-completion feature now works: if the user types "Li.ma TAB", it will complete to "List.map".

Likewise, partial-completion will expand * as a glob, so if the user types "Deferred.*.map TAB" they will be presented with every module in "Deferred." which contains the method "map"

I also tried it with a particular heavy file (~380k of code), under Windows (which magnifies performance issues). I get the feeling that the new code is slightly slower (especially the first time completion is invoked), while still remaining usable, but that may be because the new code is doing more work:

Completions are requested for an entire module at once and filtered locally in Emacs rather than doing the filtering on the Merlin side.

LGTM

@voodoos I would be inclined to have this merged; it has a number of useful improvements to the current Emacs mode (which already suffers from few contributors). What do you think?

Copy link
Collaborator

@voodoos voodoos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review @nojb !

Approving on behalf of @nojb and @xvw

voodoos added a commit to catern/merlin that referenced this pull request Sep 11, 2024
catern and others added 2 commits September 11, 2024 16:47
The merlin-completion-at-point integration before was simple and slow.

The new version is much faster and much more featureful:

- Completions are requested for an entire module at once and filtered
locally in Emacs rather than doing the filtering on the Merlin side.

- Completions are cached based on the OCaml atom being completed, so
they are re-used instead of re-requested when a new character is
typed.

- Those completions are cached for a given position inside an OCaml
atom, so that if a user types Li<TAB>ma<TAB> to complete "List.map"
and then decides they actually want the module "Labels", when they
delete the "ist.map" part and hit <TAB>, they'll use the
previously-requested completions.

- We avoid updating Merlin with the new buffer contents as completion
proceeds, so that Merlin doesn't need to re-parse and re-type-check,
substantially improving performance in expensive
files.  (merlin-cap--omit-bounds)

- Completion requests are handled asynchronously and reused, so that
if completion is interrupted and then resumed, we're able to use the
results of the previous completion request.  This makes completion UIs
which use while-no-input (like corfu-mode) much more performant.

- Completion is wrapped in while-no-input when non-essential is set;
this makes completion UIs which don't use while-no-input (like
company-mode) much more responsive.

- Completions are sorted more intelligently: if they're a constructor
or variant or label, they're likely to be more relevant to the user,
so they're sorted first.

- Module names in completions are suffixed with a ., matching Emacs
behavior for file name completion (where directories are suffixed with
a /); this makes completion of module paths much more fluent, since
there's no need to hit . after every module name.

- We use completion boundaries, so the built-in Emacs
partial-completion feature now works: if the user types "Li.ma<TAB>",
it will complete to "List.map".

- Likewise, partial-completion will expand * as a glob, so if the user
types "Deferred.*.map<TAB>" they will be presented with every module
in "Deferred." which contains the method "map"

There are also several tests now, testing the new functionality.
@voodoos
Copy link
Collaborator

voodoos commented Sep 11, 2024

Thanks a lot for doing that upstreaming @catern !

@voodoos voodoos merged commit 671ef62 into ocaml:main Sep 11, 2024
2 of 5 checks passed
(buffer-substring-no-properties (point) (nth 1 cap))
(nth 2 cap))))))

(ert-deftest test-merlin-cap--bounds ()
Copy link

@andersk andersk Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without (require 'ert), this causes errors when using byte-compiled configs.

Debugger entered--Lisp error: (void-function ert-set-test)
  ert-set-test(test-merlin-cap--bounds #s(ert-test test-merlin-cap--bounds nil #[0 "\301C\3021\23\0\303\30\304\305\306\")\307D0\202\36\0\1\310\240\210\211@\1AD\262\1\311C\312C\313\314\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204<\0\316\1\242!\210\266\4\301C\3171Q\0\303\30\304\320\321\")\322D0\202\\\0\1\310\240\210\211@\1AD\262\1\323C\312C\313\324\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204z\0\316\1\242!\210\266\4\301C\3251\217\0\303\30\304\326\327\")\330D0\202\232\0\1\310\240\210\211@\1AD\262\1\331C\312C\313\332\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\270\0\316\1\242!\210\266\4\301C\3331\315\0\303\30\304\320\334\")\335D0\202\330\0\1\310\240\210\211@\1AD\262\1\336C\312C\313\337\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\366\0\316\1\242!\210\266\4\301C\3401\13\1\303\30\304\341\342\")\343D0\202\26\1\1\310\240\210\211@\1AD\262\1\344C\312C\313\345\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\2044\1\316\1\242!\210\266\4\301C\3461I\1\303\30\304\320\347\")\350D0\202T\1\1\310\240\210\211@\1AD\262\1\351C\312C\313\352\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204r\1\316\1\242!\210\266\4\301C\3531\207\1\303\30\304\327\326\")\354D0\202\222\1\1\310\240\210\211@\1AD\262\1\355C\312C\313\356\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\260\1\316\1\242!\210\266\4\301C\3571\305\1\303\30\304\326\360\")\361D0\202\320\1\1\310\240\210\211@\1AD\262\1\362C\312C\313\363\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\356\1\316\1\242!\210\266\4\301C\3641\3\2\303\30\304\326\365\")\366D0\202\16\2\1\310\240\210\211@\1AD\262\1\367C\312C\313\370\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204,\2\316\1\242!\210\266\4\301C\3711A\2\303\30\304\372\373\")\374D0\202L\2\1\310\240\210\211@\1AD\262\1\375C\312C\313\376\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204j\2\316\1\242!\210\266\4\301C\3771\203\2\303\30\304\201@\0\326\")\201A\0D0\202\216\2\1\310\240\210\211@\1AD\262\1\201B\0C\312C\313\201C\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\260\2\316\1\242!\210\266\4\301C\201D\0001\315\2\303\30\304\201E\0\201F\0\")\201G\0D0\202\330\2\1\310\240\210\211@\1AD\262\1\201H\0C\312C\313\201I\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\372\2\316\1\242!\210\266\4\301C\201J\0001\27\3\303\30\304\201K\0\201L\0\")\201M\0D0\202\"\3\1\310\240\210\211@\1AD\262\1\201N\0C\312C\313\201O\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204D\3\316\1\242!\210\266\4\301C\201P\0001a\3\303\30\304\201Q\0\201L\0\")\201R\0D0\202l\3\1\310\240\210\211@\1AD\262\1\201S\0C\312C\313\201T\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\216\3\316\1\242!\210\266\4\301C\201U\0001\253\3\303\30\304\201V\0\201L\0\")\201W\0D0\202\266\3\1\310\240\210\211@\1AD\262\1\201X\0C\312C\313\201Y\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\330\3\316\1\242!\210\266\4\301C\201Z\0001\365\3\303\30\304\201[\0\201\\\0\")\201]\0D0\202\0\4\1\310\240\210\211@\1AD\262\1\201^\0C\312C\313\201_\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\"\4\316\1\242!\210\266\4\301C\201`\0001?\4\303\30\304\201a\0\201\\\0\")\201b\0D0\202J\4\1\310\240\210\211@\1AD\262\1\201c\0C\312C\313\201d\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204l\4\316\1\242!\210\266\4\301C\201e\0001\211\4\303\30\304\201f\0\201\\\0\")\201g\0D0\202\224\4\1\310\240\210\211@\1AD\262\1\201h\0C\312C\313\201i\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\266\4\316\1\242!\210\266\4\301C\201j\0001\323\4\303\30\304\201k\0\201\\\0\")\201l\0D0\202\336\4\1\310\240\210\211@\1AD\262\1\201m\0C\312C\313\201n\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\0\5\316\1\242!\210\266\4\301C\201o\0001\33\5\303\30\304\201p\0\326\")\201q\0D0\202&\5\1\310\240\210\211@\1AD\262\1\201r\0C\312C\313\201s\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204H\5\316\1\242!\210\266\4\301C\201t\0001c\5\303\30\304\201u\0\326\")\201v\0D0\202n\5\1\310\240\210\211@\1AD\262\1\201w\0C\312C\313\201x\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\220\5\316\1\242!\210\266\4\301C\201y\0001\253\5\303\30\304\201z\0\326\")\201{\0D0\202\266\5\1\310\240\210\211@\1AD\262\1\201|\0C\312C\313\201}\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204\330\5\316\1\242!\210\266\4\301C\201~\0001\363\5\303\30\304\201\177\0\326\")\201\200\0D0\202\376\5\1\310\240\210\211@\1AD\262\1\201\201\0C\312C\313\201\202\0\2\4\6\6\6\10%\216\1\315\5\242\5\"\240)\204 \6\316\1\242!\210\210\211\242\207" [signal-hook-function equal ... ert--should-signal-hook merlin-cap--regions "Aaa.bbb.c" "cc.ddd" ... signal ert-form-evaluation-aborted-212 nil make-closure #[0 "\300\304C\305\303\242\302BD\244\301\242\306=?\205\26\0\307\301\242D\244\301\242\306=?\205.\0\310\311!\211\205,\0\312\313\2\302\"D\262\1\244\240\210\314\300\242!\207" [V0 V1 V2 V3 ... :form ert-form-evaluation-aborted-212 :value ert--get-explainer equal :explanation apply ert--signal-should-execution] 7] apply ert-fail ... "~fo" "o.bar" ... ert-form-evaluation-aborted-217 ...] 10] nil :passed nil "/nix/store/b4n6h95vay4xnn79rk4ld5rm122ia6xm-emacs-merlin-20240911.1448/share/emacs/site-lisp/elpa/merlin-20240911.1448/merlin-cap.el"))
  byte-code("\300\301\302\303\301\304\305\304\306\304\307&\10\"\207" [ert-set-test test-merlin-cap--bounds record ert-test nil #f(compiled-function () #<bytecode 0x1a772f8fc6c123e0>) :passed "/nix/store/b4n6h95vay4xnn79rk4ld5rm122ia6xm-emacs-..."] 11)
  require(merlin-cap)
  byte-code("\301\302N\204\f\0\303\301\302\304#\210\303\301\305\306#\210\303\301\307\310C#\210\311\312\313\10\310\211%\210\314\315!\210\316\317!\210\316\320!\207" [merlin-mode-map merlin-mode-hook variable-documentation put "Hook run after entering or leaving `merlin-mode'.\n..." custom-type hook standard-value nil add-minor-mode merlin-mode (:eval (merlin-lighter)) provide merlin require merlin-cap merlin-xref] 6)
  require(merlin)

voodoos added a commit to voodoos/merlin that referenced this pull request Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Planned for backporting
Status: Planned for backporting
Development

Successfully merging this pull request may close these issues.

7 participants