Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race causing workspace diagnostics to be stale #73653

Merged
merged 2 commits into from
May 30, 2024

Conversation

dibarbet
Copy link
Member

@dibarbet dibarbet commented May 23, 2024

Fixes an issue I believe causing the workspace integration tests in VSCode to be flaky.

Also noticed this when running workspace diagnostics locally in VS - sometimes workspace diagnostics requests would not be closed after a didChange happened, leading to potentially stale diagnostics.

@dibarbet dibarbet requested a review from a team as a code owner May 23, 2024 01:29
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels May 23, 2024
// Spin waiting until our LSP change flag has been set. When the flag is set (meaning LSP has changed),
// we reset the flag to false and exit out of the loop allowing the request to close.
// The client will automatically trigger a new request as soon as we close it, bringing us up to date on diagnostics.
while (Interlocked.CompareExchange(ref _lspChanged, value: 0, comparand: 1) == 0)
while (!HasChanged())
Copy link
Member Author

@dibarbet dibarbet May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race was here - when there are multiple diagnostic categories (and therefore multiple requests from the client for ws diagnostics), we would only succeed is closing 1 of the requests, leaving the rest open (and stale).

That was because foreach request there is a call to WaitForChangesAsync- however only one of the waiting requests would observe the update to _lspChanged, as the first request immediately resets it back to 0 when it closes, leaving the rest still open.

This fixes the issue by storing the changed state per category, so the request for each category can separately observe the state change.

{
// Get the LSP changed value of this category. If it doesn't exist we add it with a value of 'changed' since this is the first
// request for the category and we don't know if it has changed since the request started.
var changed = _categoryToLspChanged.GetOrAdd(category, true);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight behavior change here - since we don't know all the categories up front, we aren't able to tell on the first request if the lsp solution has changed (we don't have prior state).

So we assume it has and tell the client to re-query. This will only ever happen once on the very first request for a given category

@@ -1926,7 +1926,12 @@ public async Task TestWorkspaceDiagnosticsWaitsForLspTextChanges(bool useVSDiagn
await using var testLspServer = await CreateTestWorkspaceWithDiagnosticsAsync(
[markup1, markup2], mutatingLspWorkspace, BackgroundAnalysisScope.FullSolution, useVSDiagnostics);

// The very first request should return immediately (as we're have no prior state to tell if the sln changed).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fallout of the behavior change above.

Copy link
Member

@sharwell sharwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it will be good to fix this race. I suspect it's one of the reasons I've been having all sorts of trouble with diagnostics. Requesting a different primitive operation though.

/// Stores the LSP changed state on a per category basis. This ensures that requests for different categories
/// are 'walled off' from each other and only reset state for their own category.
/// </summary>
private readonly Dictionary<string, bool> _categoryToLspChanged = new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❗ If we make this a Dictionary<string, CancellationTokenSeries>, we can get rid of polling for changes altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, with CancellationTokenSeries as the value, I believe we can move to ImmutableDictionary<string, CancellationTokenSeries> and also get rid of the gate.

Copy link
Member Author

@dibarbet dibarbet May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharwell I'm not seeing an easy way to use CancellationSeries here - I was initially thinking something like

private void UpdateLspChanged()
{
      // Loop through our map of source -> has changed and mark them as all having changed.
      foreach (var category in _categoryToLspChanged.Keys.ToImmutableArray())
      {
          _categoryToLspChanged[category].Token.Cancel()?
      }
}

protected async Task WaitForChangesAsync(...)
{
    var token = _categoryToLspChanged[category].Token?;
  
    // Wait for token to be cancelled
    ...
  
    _categoryToLspChanged[category].CreateNext();
}

but cancellation series doesn't seem to provide access to the current cancellation token. I can't have UpdateLspChanged create the next item in the cancellation series because the change event could happen before the WaitForChangesAsync method gets called (and therefore we only observe the latest, uncancelled token if though there has been a change).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dibarbet I implemented the described change in a local commit that builds on this change. Please let me know if you would like me to push it to this pull request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented a version of this for review in 82dde3a

@CyrusNajmabadi
Copy link
Member

Will review later tonight (after 10pm) or tomorrow morning.

return;

bool HasChanged()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i'd prefer this as a normal method, with 'category' passed in. but up do you.

@CyrusNajmabadi
Copy link
Member

there are probably some nicer patterns for this. However, this fixes the problem immediately in a way that fits the current pattern. Feel free to work with sam on refactorings here to other patterns if time permits. But we can get this PR in asap to address the race we have right now.

@sharwell sharwell dismissed their stale review May 30, 2024 13:07

Dismissing because the polling was already present before this change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants