Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move semantic tokens to LSP implementation #1806

Merged
merged 1 commit into from
Sep 8, 2021

Conversation

0dinD
Copy link
Contributor

@0dinD 0dinD commented Jun 24, 2021

Fixes #1678
Fixes redhat-developer/vscode-java#1999

This PR moves the semantic tokens implementation from a custom command to the proper LSP 3.16 implementation.

Perhaps someone could update the LSP support wiki page once this PR is merged? Currently, we only support textDocument/semanticTokens/full, not textDocument/semanticTokens/full/range or textDocument/semanticTokens/delta.

Copy link
Contributor

@Eskibear Eskibear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, we can remove extra delegate commands now.

I'll try it later and below are some immediate thoughts.

  • A monitor should be passed into AST provider, for cancellation and better progress indication.
  • Some changes look unrelevant to semantic tokens. I suggest to create a separate PR if you think they are buggy.
  • LSP4J is using wrapper objects instead of primitive types. I remember in our previous discussion we switched to use primitives for performance concern. We should double confirm whether there's any performance degradation.

@Eskibear
Copy link
Contributor

ping~
@0dinD will you update this PR accordingly?

@0dinD
Copy link
Contributor Author

0dinD commented Jul 15, 2021

Yes, I plan to look at this again in a few days. I've been quite busy recently so that's why I haven't responded yet.

@0dinD
Copy link
Contributor Author

0dinD commented Jul 18, 2021

A monitor should be passed into AST provider

Thanks, good catch!

Some changes look unrelevant to semantic tokens. I suggest to create a separate PR if you think they are buggy.

Those changes don't affect the code, I was just trying to get rid of this warning:
fix-warning
If you want me to revert the change that's fine, but I'm not going to make another PR for it. I usually take the opportunity to fix small things like this in the files that I'm changing for a PR.

LSP4J is using wrapper objects instead of primitive types. I remember in our previous discussion we switched to use primitives for performance concern. We should double confirm whether there's any performance degradation.

I switched to primitives mainly because it seemed like the right thing to do while I was at it, but the main performance improvement probably came from avoiding to resolve bindings in some situations and using bitmasks for modifiers. If you want to test this then go ahead, but either way we would need upstream changes to LSP4J in order to use primitives, which I don't have time to deal with right now (and I'm not sure if the performance gain is worth it).


After testing this PR again, I noticed that unfortunately redhat-developer/vscode-java#1597 is back, because now we don't have the client-side code in the extension to handle "stale" requests. We could re-implement the same workaround in the extension using LanguageClientOptions.middleware.provideDocumentSemanticTokens, but I really wanted to understand the root cause of this annoying issue, so I went back to debugging.

Finally, I believe I've found the problem! This is what happens when we get a flicker of incorrect highlighting after frequent changes to the document:

  1. The client requests semantic tokens for version 1 of the document.
  2. The server begins computing the request.
  3. The client detects a change to the document and notifies the server.
  4. The server now computes tokens for version 2 instead.
  5. The server completes the request.
  6. The client applies the tokens, but adjusts them according to the state of the document at the version of the initial request (version 1).

The relevant code can be found here. By setting a breakpoint in the VS Code devtools and removing the changes from pendingChanges, I was able to verify that the semantic tokens get applied correctly without those lines of code.

But this isn't necessarily a bug with VS Code. It just expects the semantic tokens returned by the server to be for the exact version of the document at the time of the request, not necessarily the latest version. Knowing this, I just removed the call to waitForLifecycleJobs, which seems to have fixed the issue. It should also provide a pretty large performance improvement for semantic highlighting, since waiting for lifecycle jobs was taking the majority of the compute time for most requests.

However, the issue could still be reproduced if the server updates the document before the AST is computed. It's not something I'm able to reproduce on my machine, and I don't know if it's possible at all to happen in a real-world scenario. But by inserting a simple Thread.sleep(1000); at the top of SemanticTokensHandler.full, the issue once again appears. Again, what happens is that the server receives a request at version 1 of the document, but computes tokens for version 2 which came in during the Thread.sleep. The tokens are correct for the current version of the document, but since VS Code thinks they are for version 1, it tries to "fix" them, which in reality just messes them up.

The only way to work around this seems to be cancelling the request. But returning null will clear all the tokens (causing flickers of no highlighting), so it has to be done another way. Again, we could do it like redhat-developer/vscode-java#1632, via middleware in the languageclient. But I'm wondering whether or not it would be a good idea to implement it on the server side instead. There are some LSP error codes which could be used to cancel the request and avoid clearing the tokens, see microsoft/vscode-languageserver-node#576 and microsoft/vscode-languageserver-node@dae62de. But I think it would require an update to the vscode-languageclient package in the extension.

What do you think? The easiest fix would probably be to apply the old workaround in the extension, but it might be more "correct" to fix the problem on the server side depending on whether or not all clients expect the tokens to be for the version at the time of the request. Can't find anything about that in the spec though.

@Eskibear
Copy link
Contributor

Knowing this, I just removed the call to waitForLifecycleJobs, which seems to have fixed the issue. It should also provide a pretty large performance improvement

It makes sense. Good job.

But by inserting a simple Thread.sleep(1000); at the top of SemanticTokensHandler.full, the issue once again appears.

Sorry but I cannot reproduce it. In my test, after copy&paste for a couple of times, in the end vscode seems to always send an extra request to fetch tokens, and LS can return correct results. The only consequence by inserting above line is, a 1000ms flicker before I get correct highlighting.
I'm using latest vscode-insiders(b805d2e94937976bb17d0439f57fcd3a9d423c31), and test with [email protected] which is used current vscode-java extension.

but it might be more "correct" to fix the problem on the server side depending on whether or not all clients expect the tokens to be for the version at the time of the request.

Agree.

@0dinD 0dinD force-pushed the semantic-tokens-lsp branch 4 times, most recently from 3f62dd5 to a6b1f1d Compare July 25, 2021 15:18
@0dinD
Copy link
Contributor Author

0dinD commented Jul 25, 2021

Sorry but I cannot reproduce it. In my test, after copy&paste for a couple of times, in the end vscode seems to always send an extra request to fetch tokens, and LS can return correct results. The only consequence by inserting above line is, a 1000ms flicker before I get correct highlighting.

Yes, that flicker is the issue I was referring to. So only "part two" of the issue, not "part one" (see explanation here).

I just pushed some changes to this PR which should fix this issue. It works by cancelling the request by throwing a ResponseErrorException with the ContentModified error code, if the document changes while semantic tokens are computed. I think sending an error with the ContentModified code is correct based on reading microsoft/vscode-languageserver-node#576 and microsoft/language-server-protocol#584.

You'll also need to use redhat-developer/vscode-java#2000, because the languageclient needed an update (microsoft/vscode-languageserver-node@dae62de). On the old version it just clears all tokens when receiving an error, which causes flickering (microsoft/vscode-languageserver-node#576).

BTW, I realized that we do need to wait for document lifecycle jobs, because otherwise it's possible that the server has received the latest version of the document but hasn't processed it yet, leading to incorrect tokens that don't automatically repair themselves.

@0dinD
Copy link
Contributor Author

0dinD commented Sep 6, 2021

Test failures seem unrelated, the change in 023a0a1 was just to rebase against master and a6b1f1d passed all tests. When I ran some of the tests locally, they sometimes failed and sometimes passed, at random.

@Eskibear
Copy link
Contributor

Eskibear commented Sep 6, 2021

I'm ok with this PR. Just to ensure we don't have a regression.

As for the flickering issue on vscode, if bumping vscode-languageclient can perfectly fix that, I suggest we get this PR merged, and update languageclient lib in vscode-java.

@0dinD
Copy link
Contributor Author

0dinD commented Sep 6, 2021

Not sure if I completely understand your question, so to clarify and recap:

This PR and redhat-developer/vscode-java#2000 started out as just removing the delegate commands and using LSP for semantic tokens instead. In doing so, I discovered that the offset bug with semantic tokens got introduced again, because the previous workaround was in the extension's client-side code.

So I started investigating a more robust solution for how to solve the problem on the server-side, which relies on cancelling the request on the server side by returning an error with the ContentModified code as mentioned here. This is needed because the client expects the result of the semantic tokens command to return tokens for the version of the document at the time of the request, i.e. if the document updates mid-request, the tokens shouldn't. As far as I know this can't be guaranteed in an easy way when using the shared AST provider, that's why I cancel the request. This also allows us to return early, as opposed to "fixing" it in client-side code, which has to wait for the request to finish.

The last problem was that when cancelling the request using the ContentModified error, the tokens would flicker in VS Code (microsoft/vscode-languageserver-node#576). That's why an update to the languageclient library is needed (if you use the old version with this PR you should see flickering when making lots of edits to a document).

So yes, if you merge this PR, also merge redhat-developer/vscode-java#2000 or else there will be flickering. If you want to see the difference for yourself, make sure to completely rebuild the extension when changing the version of the languageclient.


And because I finally understood in entirety the way the offset bug occurred, I believe all edge cases should be covered now, as opposed to the previous workaround in redhat-developer/vscode-java#1632 which just discards the tokens on the client-side and naively uses a timer to "wait out" the document changes before trying again. I can totally see there being edge cases with this workaround where the bug can still happen if a request is abnormally slow.

@Eskibear Eskibear mentioned this pull request Sep 7, 2021
@Eskibear
Copy link
Contributor

Eskibear commented Sep 7, 2021

But by inserting a simple Thread.sleep(1000); at the top of SemanticTokensHandler.full, the issue once again appears.

There was misunderstanding, that I thought flickering issue was still there even after you apply this PR and update langaugeclient to v7.1.0-next.5. Now it's clear that you are talking about current workaround (delay) doesn't cover all edge cases.

So yes, if you merge this PR, also merge redhat-developer/vscode-java#2000 or else there will be flickering.
I believe all edge cases should be covered now,

This makes sense, as I saw no flickering issue last time when I tried with you both fixes in (#1806 & redhat-developer/vscode-java#2000).

If you want to see the difference for yourself, make sure to completely rebuild the extension when changing the version of the languageclient.

I'll test this PR + [email protected], see if I can see the difference.

@Eskibear
Copy link
Contributor

Eskibear commented Sep 7, 2021

test this please

@Eskibear
Copy link
Contributor

Eskibear commented Sep 7, 2021

By inserting Thread.sleep(1000) at beginning of SemanticTokensHandler.full(...), I can reproduce the flickering issue with both combinations below, and I don't see difference:

flickering.mp4

I can totally see there being edge cases with this workaround where the bug can still happen if a request is abnormally slow.

This PR looks good to me, and I'm happy to get it merged as soon as possible. My only concern is, for those unpowerful machines, requests are slow and users might frequently see the flickering issue. This turns out to be a "regression" to them if they don't see flickering at the moment. Any idea to get this coverred?

Let me know if I misunderstand anything.

@0dinD
Copy link
Contributor Author

0dinD commented Sep 7, 2021

By inserting Thread.sleep(1000) at beginning of SemanticTokensHandler.full(...), I can reproduce the flickering issue with both combinations below, and I don't see difference:

That to me sounds like you haven't applied this PR (or it hasn't been compiled successfully), I can't reproduce what you're showing in the video even with a Thread.sleep at the head of SemanticTokensHandler.full. I can reproduce what you're showing if I comment out the three documentMonitor.checkChanged(); lines, essentially reverting the offset fix from this PR.

The difference between the languageclient versions should be that when using the old version, the tokens flicker completely off for a second, i.e. you only see TextMate-based highlighting. That's because the old languageclient version clears the tokens when receiving the ContentModified error, whereas the new version simply ignores the incoming tokens but keeps the old.

Could you please try again, making sure to include both PRs, and that the extension and server have been successfully recompiled? If you can still reproduce the error, can you share your settings/environment so I can try to reproduce this?

@0dinD
Copy link
Contributor Author

0dinD commented Sep 7, 2021

languageclient 7.1.0-next.5: (works perfectly, unless I revert the offset fix by commenting out some code)

offset-flicker.mp4

languageclient 7.0.0: (flashes of no semantic tokens, notice the package names)

textmate-flicker.mp4

Again, note that when changing the version of the languageclient you need to do a full recompile, i.e. kill the watch task and start the build again after running npm install, otherwise you won't see the bug shown in the second video.

@Eskibear
Copy link
Contributor

Eskibear commented Sep 8, 2021

I'm able to see it now, it's working as expected! (I was probably using an early commit without DocumentMonitor...)

@Eskibear
Copy link
Contributor

Eskibear commented Sep 8, 2021

test this please

Copy link
Contributor

@Eskibear Eskibear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Kudos to your work.

@Eskibear
Copy link
Contributor

Eskibear commented Sep 8, 2021

CI failed, tracking in #1869

@fbricon fbricon merged commit 00b8336 into eclipse-jdtls:master Sep 8, 2021
@fbricon
Copy link
Contributor

fbricon commented Sep 8, 2021

Thanks @0dinD!

datho7561 added a commit to datho7561/eclipse.jdt.ls that referenced this pull request Nov 18, 2022
datho7561 added a commit to datho7561/eclipse.jdt.ls that referenced this pull request Dec 7, 2022
datho7561 added a commit to datho7561/eclipse.jdt.ls that referenced this pull request Dec 8, 2022
datho7561 added a commit to datho7561/eclipse.jdt.ls that referenced this pull request Dec 9, 2022
rgrunber pushed a commit that referenced this pull request Dec 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic highlighting is not available in lightweight mode Support 3.16 semantic tokens
3 participants