-
Notifications
You must be signed in to change notification settings - Fork 260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow writing tests with inline range and position definitions #2799
Allow writing tests with inline range and position definitions #2799
Conversation
…nTest over to new system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this new format for the definition tests, and the code for computing newlines is way cleaner.
I have a few suggestions to make the tests even more maintainable.
function Foo([|value|]: Identity<Colors>): bool { | ||
match va$$lue { | ||
case Ide$$ntity(Red()) => true | ||
case Identity(Gr$$een) => false // Warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this new test format ! I can finally understand what each position and range is about :-)
I also see that you can extract named spans in your code, but I don't see them yet in tests....
So I would like to challenge you on this second but final improvement: Rather on referring to these ranges and positions by ranges[i]
and positions[j]
, could you use named ranges instead?
function Foo([|value|]: Identity<Colors>): bool { | |
match va$$lue { | |
case Ide$$ntity(Red()) => true | |
case Identity(Gr$$een) => false // Warning | |
function Foo({|0:value|}: Identity<Colors>): bool { | |
match va$0$lue { | |
case Ide$$ntity(Red()) => true | |
case Identity(Gr$$een) => false // Warning |
The point of using named ranges (and thus named positions), is that your test becomes self-contained in the source: You can just iterate over named positions, query for the definition range, and verify it corresponds to the same named range.
It makes adding new tests extremely easy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good :)
var spanStartStack = new Stack<(int matchIndex, string name)>(); | ||
var namedSpanStartStack = new Stack<(int matchIndex, string name)>(); | ||
|
||
while (true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the current algorithm is the following:
- Start with index 0
- Collect the first index at which there is a match for each of $$, [|, |], |}, or {|name:
- Find the first match. If there is an ambiguity such as [|], then try to close existing opened spans or named spans first..
- If it's a position, we add it to the position
- If we can close a span or named span, then we add it. If it's an open span or named span, we put it on a stack.
- In any case, we keep an offset to always keep the right position.
I see that you need to do something like this because you can have positions inside spans, so that makes sense.
However, could you envision the following cleaner code:
- Use
{{name:
,[[
,]]
,}}
or another markup chosen by you so that you just cannot have the[|]
or{|]
ambiguities. I also though of[<
and>]
and have the positions marked like a cross like><
, but feel free to use your own ideas. - Make use of a regular expression that avoids running match four times and one extra time for the regular expression. I'm not concerned about performance, since these are small tests, but more about maintenance::
var r = new Regex(@"(?<Position>\$\$)|(?<SpanStart>\[\|)|(?<SpanEnd>\|\])|(?<NameSpanStart>\{\| ([-_.A-Za-z0-9\+]+) \:)|(?<NameSpanEnd>\|\})");
var lastIndex = 0;
foreach(var match in r.matches(input)) {
if(match.Groups["Position"].Success) {
....
} else if(match.Groups["SpanStart"].Success) {
} ....
// Store the string between lastIndex and match.Index
// Reassign lastIndex
var lastIndex = match.Index + match.Groups[0].Value.Length;
// And other offset computations to keep track of positions
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also though of [< and >] and have the positions marked like a cross like ><, but feel free to use your own ideas.
I like that, nice idea.
I'm hesitant about changing the code though since I took this from Roslyn so I'm confident it works and won't need much maintenance. Is the ambiguity a problem that needs to be solved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that 15 lines of code are dedicated to solve an ambiguity which has no reason to exist other than an arbitrary design is kind-of bothering me :-) But do you think the proposed fix for the ambiguity requires too much work?
I'm also surprised that Roslyn provided such code that seems underperforming at best (O(N*M) where N is the length of the code and M is the number of spans/positions), but moreover it was very confusing to me when I reviewed it - I kept asking myself why do we do all these matches as well as one regex? Why sorting a list? Why implementing in a worse complexity what a simpler regular expression can do?
Do you think the version I suggested with regex.matches might be less clear, too long or less maintainable ?
I know you already did a lot of effort there so I don't want to ask you too much.
For me, implementing it with regex.matches will make me much more comfortable to approve it than the current code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the change. I picked [>...<]
and {>...<}
over [<...>]
and {<...>}
because I found it more visually appealing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just comments to change for review, but thanks for implementing this regex-based algorithm ! It was very easy to understand.
I'm ok dropping my ask to make tests totally self contained (by introducing a syntax like >0<
) because as now it's already quite good.
Co-authored-by: Mikaël Mayer <[email protected]>
Co-authored-by: Mikaël Mayer <[email protected]>
Co-authored-by: Mikaël Mayer <[email protected]>
Co-authored-by: Mikaël Mayer <[email protected]>
Co-authored-by: Mikaël Mayer <[email protected]>
Are they not self-contained right now? By ordering the ranges it also works out right? |
Co-authored-by: Mikaël Mayer <[email protected]>
They are almost all self-contained, except one or two tests where the ranges are not in the same order as the positions: But that's ok. |
The problem with those tests is that some positions don't map to any range, since they're supposed to not resolve, so we would need something like |
Clever idea, both the way to solve this problem and for the associated symbol ! That seems great to me, as long as it's documented at the beginning of the test file. Alternatively, |
Changes
MarkupTestFile.cs
, a file for writing tests with inline range and position definitions.DefinitionTest.cs
O(DocumentSize)
computation to convert between positions and indicesTesting
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.