Race condition with the shared AST provider #1918
Actually, I think I may have found a more "correct" way to work around it (how I'm guessing it's done in the Eclipse IDE). What we probably should do is make sure that whenever we call
This sounds similar to what we're already doing when calling Do note that the So it seems like we should find places where we call Does this make sense to you?
I guess another workaround would be to always call
Using IWorkspace#run() to wrap the semantic token calculation looks good to me. As long as the scheduling rule is set to that URI, the semantic token calculation should be mutually exclusive with the document lifecycle for the file with that URI. @0dinD Did you observe any performance change when applying such changes?
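To illustrate the mutual exclusion described in this comment, here is a minimal sketch that uses plain JDK locks as a stand-in for a scheduling rule keyed on the document URI. It does not use the Eclipse IWorkspace API; `PerUriExclusion`, `runExclusive` and the example URI are hypothetical names introduced only for this illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper: one lock per document URI, standing in for a scheduling rule. */
final class PerUriExclusion {
    private static final Map<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();

    /** Runs the work while holding the lock that belongs to the given URI. */
    static <T> T runExclusive(String uri, Supplier<T> work) {
        ReentrantLock lock = LOCKS.computeIfAbsent(uri, key -> new ReentrantLock());
        lock.lock();
        try {
            return work.get();
        } finally {
            lock.unlock();
        }
    }

    /** Two threads mutate unsynchronized state for the same URI; returns the final count. */
    static int demo() {
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                // Both "document lifecycle" and "semantic tokens" work use the same URI key.
                runExclusive("file:///Foo.java", () -> ++counter[0]);
            }
        };
        Thread lifecycle = new Thread(task);
        Thread tokens = new Thread(task);
        lifecycle.start();
        tokens.start();
        try {
            lifecycle.join();
            tokens.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

Because every piece of work for the same URI goes through the same lock, the two threads never interleave inside `runExclusive`, while work for different URIs can still run in parallel.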
I didn't thoroughly test performance, but I don't remember there being any noticeable impact. As far as I understand, using Overall I think it's a pretty good workaround and it should fix the desync bug for semantic tokens. But I do wonder if other parts of the codebase could be affected, because it seems like using the I can have a look at using Update:
I have now had some time to investigate this in more detail again. First of all, I don't think my previous suggestion to use
Investigating CoreASTProvider usage in JDT LS vs JDT UI
After concluding that it would probably be better to investigate the root cause rather than working around it, I had a look at the JDT UI code in an attempt to understand why this synchronization issue doesn't seem to affect JDT UI. I'm still not 100% sure how In JDT LS, we call In JDT UI, So how does JDT UI create the new AST after calling So in conclusion, JDT UI uses a very different strategy for CoreASTProvider, compared to JDT LS. This is probably the reason why the synchronization problem (eclipse-jdt/eclipse.jdt.core#1151) doesn't seem to affect JDT UI. Maybe
With the above in mind, what can be done to solve this issue in JDT LS?
I see a few different alternatives:
Option A
Investigate whether or not JDT LS should be using
Option B
Investigate whether or not we can fix eclipse-jdt/eclipse.jdt.core#1151 in JDT Core.
Option C
Call I will submit a PR to implement this approach as it seems like the best option to me (it completely fixes this issue and the implementation is very simple). Of course, I welcome any help in exploring and investigating the other options, I just don't feel like I have the time to do that (and I don't have that much experience with the inner workings of JDT UI). I have done some more testing using the steps to reproduce the race condition 100% of the time (which I shared in my initial comment), and after adding the Update: Using
Indeed, pretty much anything that uses
In my debugging efforts on redhat-developer/vscode-java#2176, I've found what I believe to be the root cause of that issue: a race condition with the shared AST provider.
Here's what happens when you get desynced semantic tokens in the client (time increases with each row):
CoreASTProvider#getAST
CoreASTProvider#createAST, because there was no cached AST
CoreASTProvider#disposeAST
CoreASTProvider#cache, with the newly created (but now outdated) AST
CoreASTProvider#getAST
documentMonitor#checkChanged, which aborts the request (and the client will send a new one)
CoreASTProvider#getAST
CoreASTProvider#disposeAST
CoreASTProvider#getAST, which creates a new (correct) AST
Note: the document version doesn't change during the second semantic tokens request on Thread C, and the buffer contents are correct. What causes the desynced tokens is that the AST being used is outdated.
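The interleaving in the timeline above can be reproduced deterministically with a toy model. This is only a sketch: `ToyAstProvider` is a hypothetical stand-in, not the real CoreASTProvider, and it assumes (based on the timeline) that the AST is created without holding the lock and cached afterwards. The latches exist only to force the unlucky ordering.

```java
import java.util.concurrent.CountDownLatch;

/** Toy stand-in for a shared AST cache; NOT the real CoreASTProvider. */
final class ToyAstProvider {
    private String cachedAst;                // the "AST" is just a label of the text it was built from
    private volatile String buffer = "v1";   // current document contents
    final CountDownLatch astCreated = new CountDownLatch(1);
    final CountDownLatch disposeDone = new CountDownLatch(1);

    void setBuffer(String contents) { buffer = contents; }

    /** Returns the cached AST, or builds one outside the lock and caches it afterwards. */
    String getAst() {
        synchronized (this) {
            if (cachedAst != null) return cachedAst;
        }
        String ast = "AST(" + buffer + ")";  // slow parse, done without holding the lock
        astCreated.countDown();
        StaleAstRaceDemo.await(disposeDone); // the edit and the dispose happen at this point
        synchronized (this) {
            cachedAst = ast;                 // caches an AST that is already outdated
        }
        return ast;
    }

    synchronized void disposeAst() {
        cachedAst = null;                    // nothing is cached yet, so this has no effect
    }
}

final class StaleAstRaceDemo {
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Forces the interleaving and returns what a later request gets from the provider. */
    static String run() {
        ToyAstProvider provider = new ToyAstProvider();
        Thread firstRequest = new Thread(provider::getAst);
        firstRequest.start();
        await(provider.astCreated);          // first request has parsed "v1"
        provider.setBuffer("v2");            // the document is edited
        provider.disposeAst();               // dispose runs before the old AST is cached
        provider.disposeDone.countDown();
        try {
            firstRequest.join();             // first request now caches its old AST
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return provider.getAst();            // second request is served from the cache
    }

    public static void main(String[] args) {
        System.out.println(run());           // prints "AST(v1)" although the buffer is "v2"
    }
}
```

The second request sees correct buffer contents but an AST built from the old contents, which mirrors the desync described in the note above.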
This is a race condition between the second semantic tokens request and the publish diagnostics job. If the publish diagnostics job calls CoreASTProvider#disposeAST before the second semantic tokens request calls CoreASTProvider#getAST, the cached AST will be updated in time and the tokens won't become desynced. That's why the issue is so hard to reproduce consistently: the time window where the second semantic tokens request "wins the race" over the publish diagnostics job is very narrow.
With all this newfound knowledge, I managed to create a reproducible scenario that works every time:
See it in action here:
reproduce-race-condition.mp4
One solution to this problem is to wait for both DOCUMENT_LIFE_CYCLE_JOBS and PUBLISH_DIAGNOSTICS_JOBS in the SemanticTokensHandler, to avoid the race condition. However, in many cases this will just slow the semantic tokens request down, since there isn't always a race.
Another solution is to always call CoreASTProvider#disposeAST in the SemanticTokensHandler, to make sure the AST being used is never from the cache. But this will also slow the request down in the cases where there is a perfectly good cached AST.
Neither of these workarounds really addresses what I think is the real race condition: the one between CoreASTProvider#getAST and CoreASTProvider#disposeAST. If two threads call these methods at roughly the same time, the methods will race and CoreASTProvider#disposeAST might not actually be successful in disposing the AST. This is of course assuming that CoreASTProvider#getAST was called with something other than WAIT_NO.
I'm not familiar enough with upstream Eclipse libraries to know whether or not this should be considered a bug (and in that case, does it affect the Eclipse IDE?), or if it's an implementation error in JDT.LS.
Since CoreASTProvider#disposeAST is a synchronized method, I did manage to work around the issue by wrapping much of the code in SemanticTokensHandler in a synchronized (CoreASTProvider.getInstance()) {...} block.
But that only solves the race condition for the SemanticTokensHandler. What about all the other places where we might call CoreASTProvider#getAST? I feel like this issue with the shared AST provider might affect other parts of the code, see for example redhat-developer/vscode-java#2173 which looks somewhat similar (but also affects formatting).
So, to summarize: I know fixes/workarounds for redhat-developer/vscode-java#2176, but am not sure which is the best one. It's also possible that there are better ways of solving the problem, or that the race condition should be fixed upstream. If this issue doesn't affect the Eclipse IDE, I wonder how they've worked around it?
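The synchronized-block workaround mentioned above can be sketched with a toy model. `SharedProvider` and `SynchronizedHandlerDemo` are hypothetical names, not JDT classes; the only property taken from the issue is that the dispose method is synchronized on the shared provider instance, so a handler that holds the same monitor keeps dispose out until it is done.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Toy shared provider; only disposeAst is synchronized, as described in the issue. */
final class SharedProvider {
    private String cachedAst = "AST(v1)";
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    String getAst() { return cachedAst; }

    synchronized void disposeAst() {
        cachedAst = null;
        log.add("disposed");
    }
}

final class SynchronizedHandlerDemo {
    /** Returns the order in which the handler and the disposer finished their work. */
    static List<String> run() {
        SharedProvider provider = new SharedProvider();
        CountDownLatch insideBlock = new CountDownLatch(1);

        Thread handler = new Thread(() -> {
            // Holding the provider's monitor blocks disposeAst for the whole request.
            synchronized (provider) {
                String ast = provider.getAst();
                insideBlock.countDown();
                pause(50);                        // simulate computing the tokens
                provider.log.add("tokens from " + ast);
            }
        });
        Thread disposer = new Thread(provider::disposeAst);

        handler.start();
        awaitQuietly(insideBlock);                // handler now owns the monitor
        disposer.start();                         // has to wait for the handler
        joinQuietly(handler);
        joinQuietly(disposer);
        return new ArrayList<>(provider.log);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off is the one raised above: every caller that reads the shared AST would need the same wrapping, and requests are serialized against dispose for as long as they hold the monitor.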
I'd appreciate any help from those who have more experience working with the upstream libraries.