feat: Verification diagnostics (core version) #1942

MikaelMayer · 2022-03-29T17:44:32Z

Real-time verification diagnostics for the language server. Illustration on an example put in the documentation:

Also fixes #1918.

Features implemented:

Record the real-time verification diagnostics provided by Boogie into the DafnyDocument
Send verification diagnostics along with per-line status to the client for display
Migrates verification diagnostics on document change
Activate/deactivate verification diagnostics

Try this in VSCode IDE yourself:

Checkout the branch verification-diagnostics of Dafny
in dafny/, clone boogie and checkout the commit 6c703b1070fe3c96965289b7cb43c8fc79c5b2e3 (currently two above master)
Compile Dafny. It automatically depends on your installed nested boogie.
Clone verification-diagnostics-gutter from the VSCode extension, run code . in a terminal to open VSCode on it (or just run vSCode and open the ide-vscode folder), and press F5 to run another instance of VSCode.
Ensure in the settings that the paths to DafnyLanguageServer.dll points to the dafny/Binaries/DafnyLanguageServer.dll compiled in step 3.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

WIP: send intermediate verification results Added information to publish verification diagnostics on the gutter

Removed obsolete

…ough for now.

All unvisited methods are set to verified by default.

Children are updated once everything finishes.

…solutionSucceeded

Started to add tests

Fixed timing computation.

keyboardDrummer · 2022-05-03T13:57:11Z

Personally I find that the API makes more choices about the visuals than I'd like.

It seems that he gutter icons combine information from different dimensions: status of verification pipeline, errorness, reachability of assertions. Personally I have trouble understanding all the combinations, and I think I'd prefer to see these dimensions separately. For example, I wouldn't mind having a single icon show the status of the verification pipeline, possibly on the method header. For the errorness, I don't understand why we need more information than what the diagnostics currently provide. Is that because for postcondition violations, the violated postcondition is in a related location and that doesn't get a red underline by default? For the reachability dimensions I don't understand the use-case well yet, but I guess that's on me.

Would it be possible to keep the different dimensions of information separate in the API and leave the precise choice of how to visualise things to the IDE ?

Something like:

Status of verification pipeline:

  public record FileVerificationStatus(
    DocumentUri Uri,
    int? Version,
    IReadOnlyList<MethodVerificationStatus> Methods) : IRequest;

  public record MethodVerificationStatus(Location Name, Location Body, VerificationStatus Status);

Reachability of assertions:

  public record FileAssertionInformation(
    DocumentUri Uri,
    int? Version,
    IReadOnlyList<AssertionInformation> Assertions) : IRequest;

  public record AssertionInformation(Location Body, bool WasVerified);

For the errorness, I guess that the information contained in ImmediatelyRelatedRanges is already available in the IDE through diagnostics.

For the counterexample trace, I would prefer a pull based API where the IDE has to request the information before it gets it.

keyboardDrummer · 2022-05-03T14:00:26Z

Source/DafnyLanguageServer/Workspace/Notifications/VerificationDiagnosticsParams.cs

+  /// A verification tree is an abstraction over the code to represent the verification
+  /// status of a region of the document, useful for IDE verification inspection.
+  /// A verification tree can contain other child trees.
+  /// It can currently be rendered linearly, e.g. for gutter display, or used as a tree in a test-like display. 


or used as a tree in a test-like display

Did the scope of this PR increase? For a 1800 line PR, I suggest we only try to cut the scope.

I'm unable to understand what this tree is doing. I feel like it's supposed to provide information for many different features. I'd like to have separate APIs for different features and to be able to verify that each API sends the minimal amount of required information, but with this large record I find it intractable to do that.

MikaelMayer · 2022-05-03T21:36:18Z

Personally I find that the API makes more choices about the visuals than I'd like.

Thanks for expressing your personal concern. I'll do my best to address it.

It seems that he gutter icons combine information from different dimensions: status of verification pipeline, errorness, reachability of assertions.

Errorness and reachability of assertions are the same dimension: Error, Verified, or not run.

Personally I have trouble understanding all the combinations, and I think I'd prefer to see these dimensions separately. For example, I wouldn't mind having a single icon show the status of the verification pipeline, possibly on the method header.

The icons on the gutter make it possible to see the verification status of the entire method only with the peripheral vision, which is less constraining than the focus that would be required by a single icon on the method header, icon which is almost always more far away than the gutter (and not viewing horizontally).

For the errorness, I don't understand why we need more information than what the diagnostics currently provide. Is that because for postcondition violations, the violated postcondition is in a related location and that doesn't get a red underline by default?

Yes, that's one point, violated postconditions don't get a red underline by default. And we have no interest in duplicating diagnostics (that would be required to get underlines on two different things).
But moreover, and this was even more important, since diagnostics were published at the very end, all these errors were not available until verification was finished, which was extremely frustrating. This might have changed with your recent incremental diagnostic publication PR, but I haven't checked yet.

For the reachability dimensions I don't understand the use-case well yet, but I guess that's on me.

The way the verifier works is that it will first figure out if an assertion fails, and it will normally be the first one. Then it will assume it, and figure out if other assertions fail, up to 5 times. Hence, if there are less than 5 errors, we can put a green check on all remaining assertions, because they are verified (conditionally to all assertions before them, but I assume users know that). If not, I don't display anything (to represent that the assertion wasn't even checked). Hence, this is the same dimension as the errorness.

Would it be possible to keep the different dimensions of information separate in the API and leave the precise choice of how to visualise things to the IDE ?

Something like:

Status of verification pipeline:

  public record FileVerificationStatus(
    DocumentUri Uri,
    int? Version,
    IReadOnlyList<MethodVerificationStatus> Methods) : IRequest;

  public record MethodVerificationStatus(Location Name, Location Body, VerificationStatus Status);

Reachability of assertions:

  public record FileAssertionInformation(
    DocumentUri Uri,
    int? Version,
    IReadOnlyList<AssertionInformation> Assertions) : IRequest;

  public record AssertionInformation(Location Body, bool WasVerified);

There is also the notion of assertion batch that I want to expose, so the format above seems limited. Did you have in mind two send two records?
Also, since down the road we want the tests to be triggerables, we also need the relationship between methods and assertions, so I don't think we should put them in two different buckets from an API's perspective.

For the errorness, I guess that the information contained in ImmediatelyRelatedRanges is already available in the IDE through diagnostics.

Yes, indirectly. Making it available directly makes it easier for IDE extensions not to reimplement the wheel.

For the counterexample trace, I would prefer a pull based API where the IDE has to request the information before it gets it.

Counterexample trace is another topic to be discussed in this other PR:
#1977

robin-aws · 2022-05-06T23:12:02Z

Working on catching up and just offering high-level thoughts for now.

I agree with @keyboardDrummer's difficulty in keeping track of all the different verification statuses, but I don't think it's beneficial to separate dimensions more because they are causally linked. You can't simultaneously have a "currently verifying" pipeline state and a "failed" verification status on the same line.

Instead to simplify things, I'd suggest not keeping around obsolete data. The important thing to communicate is whether verification obligations exist (i.e. the difference between Nothing and Scheduled or any other state), and where they are currently being determined or there is an answer. I don't think there's value in knowing that a line WAS correct/invalid previously if it may or may not change once verification catches up (either automatically or once you save).

I think having a more generic tree of data works better here to a point. Methods are not THAT significant a type here, given that each gets two assertion batches anyway, and there are lots of assertions that don't live in methods. Splitting up per file/document certainly makes sense though.

robin-aws · 2022-05-06T22:14:30Z

Source/DafnyLanguageServer.Test/Util/AssertWithDiff.cs

+
+namespace Microsoft.Dafny.LanguageServer.IntegrationTest.Util;
+
+public class AssertWithDiff {


Couldn't you add a dependency from DafnyLanguageServer.Test to XUnitExtensions instead of duplicating code?

These two projects do not have the same testing framework (one throws an XUnitException, the other one is Microsoft.VisualStudio.TestTools.UnitTesting).
Alternatively, I could refactor both codes, but I'm worried that it could cause trouble to have two different testing frameworks in the same project.

Source/DafnyLanguageServer/README.md

Source/DafnyLanguageServer.Test/Diagnostics/LinearVerificationDiagnosticTester.cs

robin-aws · 2022-05-06T23:09:53Z

Source/DafnyLanguageServer/Workspace/Notifications/VerificationDiagnosticsParams.cs

+  /// The verification status consists of two orthogonal concepts:
+  /// - StatusVerification: Nothing (initial), Error, Verified, or Inconclusive
+  /// - StatusCurrent: Current (Up-to-date), Obsolete (outdated), and Verifying (as notified by the verifier)


I don't think these are orthogonal, especially if you agree to drop Obsolete. You just end up with Nothing (initial), Verifying, Error, Verified, or Inconclusive

I don't think there's value in knowing that a line WAS correct/invalid previously if it may or may not change once verification catches up (either automatically or once you save).

The problem of removing the previous verification result is that, during the time it verifies (which are a few seconds), what you suggests really means hiding the previous state. Why would you hide the previous state way before you have some more up-to-date information? Users trying to fix an error might be thinking in parallel about the code and the previous verification result in other places which they believe would not have been impacted. Removing this information would result in a frustrating experience for them.

There are precedents about that: We decided to still display previous verification diagnostics (migrate them) although there might be parse/resolution errors. Moreover, in general, diagnostics are "updated" rather than "removed" and then "re-added" a few seconds later.
So, in short, it's something that users would expect and it's desirable.

To illustrate my point, let's consider a CALC statement with four statements, where Q2 is failing but Q3 and Q4 are succeeding. Let's say a user makes an edit on Q2, which makes it verify, but now Q3 does not verify anymore. The trace under this PR would be the following:

[ ][I][S][ ]: calc { [ ][I][S][ ]: Q1; [=][-][~][O]: Q2; [O][o][Q][=]: Q3; [O][o][Q][O]: Q4; [ ][I][S][ ]: }

With your approach of not making statuses orthogonal, I don't think how it would look better than this:

[ ] I S [ ]: calc { [ ] I S [ ]: Q1; [=] I S [O]: Q2; [O] I S [=]: Q3; [O] I S [O]: Q4; [ ] I S [ ]: }

Here are notable differences in the two approaches:

In the first version, the switching state from "error" to "verified" is instant on the fourth update (still giving the hint to the user that the error might be obsolete, and then actively re-checked), whereas in the second version, there is an intermediate state that completely hides all the previous information.

If I did not pay attention about the verification result between Q3 and Q4 (which can be farther away) because I was focused on Q1 to Q2, during the whole time the verifier runs (a few seconds), I would be blind about having reasons to believe that Q3 is still equals to Q4 (maybe I missed it, and the moment I want to know about whether Q3 == Q4, the verifier hid the previous result).

A few seconds of completely removing the previous information would have the effect to remove alignment information for the eye, so instead of seeing if an error has been fixed and the user can move on, the eye will have to start its course again from the line being edited to the left where the gutter is.

During the UX development with a UX designer, she agreed with me that it's better to minimize the frequency of full visual changes made to the gutter, because it should be supportive for the eye, not distract it.

Ideally, the eye should be able to stay on the code and only use peripheral vision to know the verification status. With this PR, the eye will be able to see an error "going down" with the peripheral vision, and since the user previously saw that this error was on the line being edited, they won't even need to look from right to left to see if the line has an error or not. With your alternative, after few seconds during which that information disappears, the eye will not be able to use peripheral vision to determine if a line verifies or not until it does the actual movement from right to left.

I can go on and on, and also explain you why displaying the error context is important as well, rather than just repeating the verification status on every line. Just ask me.

keyboardDrummer · 2022-05-08T12:22:02Z

Source/DafnyLanguageServer/Workspace/TextDocumentLoader.cs

+          foreach (var topLevelDecl in module.TopLevelDecls) {
+            if (topLevelDecl is TopLevelDeclWithMembers topLevelDeclWithMembers) {
+              foreach (var member in topLevelDeclWithMembers.Members) {
+                if (member.tok.filename != documentFilePath) {


In what scenario is this if branch true ?

I think, when we include a module from another file.

Could you document that by extracting the condition into a variable: var memberWasIncluded = member.tok.filename != documentFilePath ?

keyboardDrummer · 2022-05-08T12:22:32Z

Source/DafnyLanguageServer/Workspace/TextDocumentLoader.cs

+                  continue;
+                }
+                if (member is Field) {
+                  if (member.BodyEndTok.line == 0) {


In what scenario is this if branch true?

Looking deeper, it currently means that there is no body, so the constant does not have a value. See here:
https://github.com/dafny-lang/dafny/blob/verification-diagnostics-core/Source/Dafny/Dafny.atg#L1374
However, the goal of this if was to prevent the GetLspRange next line to raise an exception

Would it be possible to replace if (member is Field) { if (member.BodyEndTok.line == 0).. with if (member is ConstantField) ?

Alternatively, you could extract this into a variable to explain the condition, like var constantHasNoBody = member.BodyEndTok.line == 0.

Since that ConstantFields can have bodies, I'll go the second route.

Since that ConstantFields can have bodies, I'll go the second route.

My impression was that only ConstantFields have bodies and they always have them, so if you check for ConstantField you no longer need another check. Is that not the case?

You can declare Constant fields without a body. it's just that the compiler won't accept it, but it works for verification purposes, I think.

Source/DafnyLanguageServer/Workspace/TextDocumentLoader.cs

keyboardDrummer · 2022-05-08T12:32:08Z

Source/DafnyLanguageServer/Workspace/DiagnosticPublisher.cs

+        document.Version,
+        document.VerificationTree.Children.Select(child => child.GetCopyForNotification()).ToArray(),
+        errors,
+        linesCount,


What is this used for? I expect the IDE to know how many lines are in a document.

linescount is not sent to the IDE, it's only used for internal computation (i.e. creating the array of LineVerificationStatus)

I see. I would prefer it if VerificationStatusGutter was a record with a primary constructor, to make it more clear how much data it contains. However, I don't have good arguments for this so feel to leave it if you prefer that ^,^

If it was a record, indeed, every initialization variable would be public, so that's why I did not use a record.

Note that you can also add non-primary constructors to a record with a primary constructor:

record Computed(int TotalLength) { Computed(int bodyHeight, int shoeHeight) : this(bodyHeight + shoeHeight) { } }

But for a DTO I would prefer having the logic that computes the DTO be in another class altogether, like

record GutterIcons(IReadOnlyList<GutterIconEnum> Icons); class GutterIconsRenderer { public GutterIcons Render(..) }

Or, as a viable alternative, add a static method to GutterIcons that will do this computation and use the primary constructor, right?

Or, as a viable alternative, add a static method to GutterIcons that will do this computation and use the primary constructor, right?

That works

Implemented here
#2143

Source/DafnyLanguageServer/Workspace/DocumentDatabase.cs

Source/DafnyLanguageServer/Workspace/TextDocumentLoader.cs

Source/DafnyLanguageServer/Workspace/DiagnosticPublisher.cs

Source/DafnyLanguageServer/Workspace/DocumentDatabase.cs

Source/DafnyLanguageServer/Workspace/TextDocumentLoader.cs

keyboardDrummer · 2022-05-09T08:10:30Z

Working on catching up and just offering high-level thoughts for now.

I agree with @keyboardDrummer's difficulty in keeping track of all the different verification statuses, but I don't think it's beneficial to separate dimensions more because they are causally linked. You can't simultaneously have a "currently verifying" pipeline state and a "failed" verification status on the same line.

Instead to simplify things, I'd suggest not keeping around obsolete data. The important thing to communicate is whether verification obligations exist (i.e. the difference between Nothing and Scheduled or any other state), and where they are currently being determined or there is an answer. I don't think there's value in knowing that a line WAS correct/invalid previously if it may or may not change once verification catches up (either automatically or once you save).

I think having a more generic tree of data works better here to a point. Methods are not THAT significant a type here, given that each gets two assertion batches anyway, and there are lots of assertions that don't live in methods. Splitting up per file/document certainly makes sense though.

I've discussed this PR with @MikaelMayer and @cpitclaudel and we decided to keep the gutter icon enumeration as is, so that we move towards getting customer feedback on this feature.

We're accepting the risk that we might have to change the API in the future in a backwards incompatible way. Because this is a push API, we can update the API by adding a version number dafny/verification/status/gutterv2, and that means existing IDEs will simply not pick it up anymore while newer IDEs that only listen to dafny/verification/status/gutterv2 will simply not pick up the dafny/verification/status/gutter from older language servers.

@cpitclaudel expressed performance concerns with the added LSP API in the context of Emacs, so we might also change the API later to address that. Since this PR only publishes the new notification when this feature is turned on by the client, we're not worried about performance issues affecting customers that don't use the feature.

Co-authored-by: Robin Salkeld <[email protected]>

…fny-lang/dafny into verification-diagnostics-core

Source/DafnyLanguageServer/Workspace/DafnyDocument.cs

Source/DafnyLanguageServer/Workspace/DiagnosticPublisher.cs

robin-aws · 2022-05-11T17:08:57Z

I've discussed this PR with @MikaelMayer and @cpitclaudel and we decided to keep the gutter icon enumeration as is, so that we move towards getting customer feedback on this feature.

We're accepting the risk that we might have to change the API in the future in a backwards incompatible way. Because this is a push API, we can update the API by adding a version number dafny/verification/status/gutterv2, and that means existing IDEs will simply not pick it up anymore while newer IDEs that only listen to dafny/verification/status/gutterv2 will simply not pick up the dafny/verification/status/gutter from older language servers.

This sounds like a great compromise to me. :)

MikaelMayer added 30 commits February 16, 2022 09:15

Support for displaying intermediate verificaiton steps

57f060e

WIP: send intermediate verification results Added information to publish verification diagnostics on the gutter

WIP: Interface to gather verification diagnostics

366acd7

WIP adding real-time node diagnostics.

95a4bba

WIP reporting

c33ba29

Everything ready from server side to push realtime diagnostics !

0d6f029

Merge branch 'master' into verification-diagnostics

4627486

WIP added linesCount

0a61d12

It publishes correctly diagnostics and diagnostics per line

f34969d

Removed obsolete

Fixed the rendering problem. No real-time verification of branches th…

fa433be

…ough for now.

Verification priority implemented

332d385

All unvisited methods are set to verified by default.

Display resolution errors in the gutter

8379047

Big name refactoring

aa69de3

Relocate previous verification node diagnostics

5ff4004

Children are now in sync with their parents

e7dd3a3

Avoid considered done a method when it's verified twice.

f91ab64

record diagnostics to avoid concurrency bugs

dfea765

Children are updated once everything finishes.

Removed unused variable

bf46545

No setting of methods as verified if cancellation was requested

86e8960

Support for unnamed files

2c73f00

Corectly migrate verification diagnostics + error graph

3e98229

Merge branch 'master' into verification-diagnostics

0f7179c

When parsing fail and there is no verification, still set the flag Re…

b8dba6d

…solutionSucceeded

Ensure verification diagnostics only on one file.

ad14dd4

Started to add tests

Added the reference to the test

a027bcb

Expose metrics on the line of a function/method

eb9ecf3

Fixed timing computation.

Prettifying numbers for time and resource count

300eddf

Merge branch 'master' into verification-diagnostics

d01c415

Report implementations instead of tokens

a931a64

WIP to report single assertions

7e0d94b

WIP Moving assertions

be04ac3

keyboardDrummer reviewed May 3, 2022

View reviewed changes

MikaelMayer requested a review from robin-aws May 3, 2022 20:16

MikaelMayer mentioned this pull request May 6, 2022

Incremental verification diagnostics with migration support #2093

Merged

MikaelMayer and others added 2 commits May 6, 2022 16:38

Focus on the gutter only

f224012

Merge branch 'master' into verification-diagnostics-core

ea59783

MikaelMayer requested a review from keyboardDrummer May 6, 2022 21:39

robin-aws reviewed May 6, 2022

View reviewed changes

keyboardDrummer reviewed May 9, 2022

View reviewed changes

MikaelMayer and others added 6 commits May 9, 2022 10:51

Update Source/DafnyLanguageServer/README.md

4d78a8f

Co-authored-by: Robin Salkeld <[email protected]>

Review refactorings

149f885

Merge branch 'verification-diagnostics-core' of https://github.com/da…

78b895f

…fny-lang/dafny into verification-diagnostics-core

Merge branch 'master' into verification-diagnostics-core

bf06ea0

Updated README.md with gutterStatus instead of diagnostics

da3d18a

Merge branch 'verification-diagnostics-core' of https://github.com/da…

b887388

…fny-lang/dafny into verification-diagnostics-core

keyboardDrummer reviewed May 10, 2022

View reviewed changes

Source/DafnyLanguageServer/Workspace/DafnyDocument.cs Outdated Show resolved Hide resolved

Source/DafnyLanguageServer/Workspace/DafnyDocument.cs Outdated Show resolved Hide resolved

Source/DafnyLanguageServer/Workspace/DiagnosticPublisher.cs Outdated Show resolved Hide resolved

MikaelMayer added 4 commits May 10, 2022 09:51

More review comments

d021a3e

More review comments + fixing a strange out of bound exception

9f1632f

Last comments.

92c833c

Fixed a typo.

fe890a1

MikaelMayer mentioned this pull request May 10, 2022

Publish verification status notifications #2094

Merged

MikaelMayer and others added 2 commits May 10, 2022 15:27

Fixed a CI test

811874a

Merge branch 'master' into verification-diagnostics-core

8b4e611

keyboardDrummer approved these changes May 11, 2022

View reviewed changes

MikaelMayer merged commit 52f7552 into master May 11, 2022

MikaelMayer deleted the verification-diagnostics-core branch May 11, 2022 17:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Verification diagnostics (core version) #1942

feat: Verification diagnostics (core version) #1942

MikaelMayer commented Mar 29, 2022

keyboardDrummer commented May 3, 2022

keyboardDrummer May 3, 2022 •

edited

Loading

MikaelMayer commented May 3, 2022

robin-aws commented May 6, 2022

robin-aws May 6, 2022

MikaelMayer May 9, 2022

robin-aws May 6, 2022

MikaelMayer May 9, 2022 •

edited

Loading

keyboardDrummer May 8, 2022

MikaelMayer May 9, 2022

keyboardDrummer May 10, 2022

keyboardDrummer May 8, 2022

MikaelMayer May 9, 2022

keyboardDrummer May 10, 2022

MikaelMayer May 10, 2022

keyboardDrummer May 11, 2022 •

edited

Loading

MikaelMayer May 11, 2022

keyboardDrummer May 8, 2022

MikaelMayer May 9, 2022

keyboardDrummer May 10, 2022

MikaelMayer May 10, 2022

keyboardDrummer May 11, 2022 •

edited

Loading

MikaelMayer May 11, 2022

keyboardDrummer May 11, 2022

MikaelMayer May 16, 2022

keyboardDrummer commented May 9, 2022 •

edited

Loading

robin-aws commented May 11, 2022


		namespace Microsoft.Dafny.LanguageServer.IntegrationTest.Util;

		public class AssertWithDiff {

feat: Verification diagnostics (core version) #1942

feat: Verification diagnostics (core version) #1942

Conversation

MikaelMayer commented Mar 29, 2022

Try this in VSCode IDE yourself:

keyboardDrummer commented May 3, 2022

keyboardDrummer May 3, 2022 • edited Loading

Choose a reason for hiding this comment

MikaelMayer commented May 3, 2022

robin-aws commented May 6, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MikaelMayer May 9, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keyboardDrummer May 11, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keyboardDrummer May 11, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keyboardDrummer commented May 9, 2022 • edited Loading

robin-aws commented May 11, 2022

keyboardDrummer May 3, 2022 •

edited

Loading

MikaelMayer May 9, 2022 •

edited

Loading

keyboardDrummer May 11, 2022 •

edited

Loading

keyboardDrummer May 11, 2022 •

edited

Loading

keyboardDrummer commented May 9, 2022 •

edited

Loading