Update 'convert to raw string' to help cleanup strings with leading whitespace #60936

Merged: CyrusNajmabadi merged 7 commits into dotnet:main on Apr 26, 2022

CyrusNajmabadi (Member) commented Apr 24, 2022

Fixes: #59591

This is for the case where someone writes code like so:

image

With the existing feature you would get:

image

Which is not ideal. Now, we offer two options here. The normal semantic-preserving change, but also a new semantic-changing option:

image

Choosing this will now produce:

image

Which now matches the user's intent here for their particular domain.
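
The screenshots are not preserved here, so the following is only a guess at the kind of code the description has in mind, based on the linked issue title. It is a minimal sketch (C# 11 top-level statements) with made-up string content: `verbatim` is a hypothetical input, `preserved` is what a semantic-preserving conversion would have to produce, and `trimmed` is what dropping the leading blank line and the shared indentation would give.

```csharp
using System;

// Hypothetical input: a verbatim string whose content is indented to line up with the code.
var verbatim = @"
    first
      second";

// Semantic-preserving conversion: every character of the original value is kept,
// so the leading newline and the four-space indentation are still part of the string.
var preserved = """

    first
      second
""";

// Without leading whitespace (may change semantics): the leading blank line and the
// indentation shared by all lines are dropped; relative indentation survives.
var trimmed = """
    first
      second
    """;

Console.WriteLine(preserved == verbatim); // True: same value as the original
Console.WriteLine(trimmed == verbatim);   // False: the value itself has changed
```

The second form reads better in source, but it is a different string, which is why the option is labelled as possibly changing semantics.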

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner April 24, 2022 23:20

jnm2 (Contributor) commented Apr 25, 2022

Will this be able to be applied solution-wide? I remember some fixes don't automatically get this ability.

mavasani (Contributor) commented

> Will this be able to be applied solution-wide? I remember some fixes don't automatically get this ability.

We should be able to after we merge #60906 and enable FixAll support for this specific refactoring. This is the exact thing that @CyrusNajmabadi suggested to me yesterday while reviewing that PR.

<comment>This clause is a follow up to the "Convert to raw string" loc string.
The intent is that the user sees "Convert to raw string" as an option to select,
and that is then followed with this clause. This is so we don't have a huge string
saying "Convert to raw string without leading whitespace (may change semantics)"</comment>

Member

do we need "(may change semantics)"? I would think that a) removing leading whitespace would imply that and b) it's a refactoring so could change semantics

Member Author

i do think it's important (personally). Otherwise, i worry that someone may be confused about what "indentation" means, e.g. whether it's the content indentation or the syntax indentation that is adjusted.

var length = Math.Min(leadingWhitespace1.Length, leadingWhitespace2.Length);

var current = 0;
while (current < length && IsCSharpWhitespace(leadingWhitespace1[current]) && leadingWhitespace1[current].Rune == leadingWhitespace2[current].Rune)

Member

do we care about cases of mixed tabs/spaces?

Member Author

I'm ok with this only trimming off the part it is certain is shared across the files. If you mix tabs/spaces... well... that's either rare or bizarre, and i'm ok ignoring :)
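
To make the mixed tabs/spaces behavior concrete, here is a minimal sketch of the same loop over plain strings. It is not the PR's code: `CommonWhitespacePrefix` is a hypothetical name, and `char.IsWhiteSpace` stands in for the PR's `IsCSharpWhitespace` helper, which may classify characters differently.

```csharp
using System;

// Returns the whitespace prefix shared character-for-character by two lines.
// Assumption: char.IsWhiteSpace approximates the PR's IsCSharpWhitespace.
static string CommonWhitespacePrefix(string line1, string line2)
{
    var length = Math.Min(line1.Length, line2.Length);

    var current = 0;
    while (current < length && char.IsWhiteSpace(line1[current]) && line1[current] == line2[current])
        current++;

    return line1.Substring(0, current);
}

Console.WriteLine(CommonWhitespacePrefix("    a", "      b").Length); // 4
Console.WriteLine(CommonWhitespacePrefix("\t  a", "\t\tb").Length);   // 1: only the first tab matches
Console.WriteLine(CommonWhitespacePrefix("  \ta", "    b").Length);   // 2: stops where a tab meets a space
```

With mixed indentation the loop simply stops at the first position where the two lines disagree, so only the part that is certain to be shared is trimmed.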

return commonLeadingWhitespace.Length;
}

private static VirtualCharSequence ComputeCommonWhitespacePrefix(

Member

could this just be an int?

Where you consume the current line text up until either the current 'common' index or a non-whitespace char?

Or maybe just a simple extension to virtual char that is GetFirstNonWhitespaceIndex and then compare the indices? Think that could replace some other helpers (AllWhitespace) as well

Member Author

i'll see if i can do that! :)

Member Author

ah, this can't be an int. if it's an int, we can't distinguish spaces/tabs.
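
A small illustration of why a count alone is not enough, written against plain strings rather than `VirtualCharSequence`. The helper names `LeadingWhitespaceCount` and `SharedPrefix` are hypothetical and exist only for this sketch.

```csharp
using System;
using System.Linq;

// The "just an int" idea: how many whitespace characters start the line.
static int LeadingWhitespaceCount(string line)
    => line.TakeWhile(c => char.IsWhiteSpace(c)).Count();

// The actual shared prefix, compared character by character.
static string SharedPrefix(string a, string b)
{
    var i = 0;
    while (i < a.Length && i < b.Length && char.IsWhiteSpace(a[i]) && a[i] == b[i])
        i++;
    return a.Substring(0, i);
}

var tabbed = "\tfirst";
var spaced = " second";

// Both lines report one leading whitespace character, so a count-based
// approach would conclude that one character can be stripped from each line...
var byCount = Math.Min(LeadingWhitespaceCount(tabbed), LeadingWhitespaceCount(spaced));

// ...but the lines share no whitespace at all: one starts with a tab, the other with a space.
var byPrefix = SharedPrefix(tabbed, spaced);

Console.WriteLine(byCount);         // 1
Console.WriteLine(byPrefix.Length); // 0
```

Keeping the prefix as a sequence of characters preserves the information about which whitespace characters were seen, which the index alone throws away.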

}

// Remove all trailing whitespace and newlines from the final string.
while (result.Count > 0 && (IsCSharpNewLine(result[^1]) || IsCSharpWhitespace(result[^1])))

Member

wasn't this done here?
https://github.com/dotnet/roslyn/pull/60936/files#diff-0428e947f9429e42713669adec6cd527db6d211b4197c9fdf2b6af0cf55cd77aR190

We can't add more trailing whitespace as part of skipping the common ones can we?

Member Author

it's slightly different. the first loop is removing entirely blank lines. The last loop is removing any trailing newline from the last line (which is not an all-whitespace line, since that would have already been removed).
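
The difference between the two loops can be sketched as follows. This is an approximation over `List<string>` and `char.IsWhiteSpace`, not the PR's `VirtualCharSequence` code, and the sample lines are made up; it assumes each line keeps its own terminator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input: each entry keeps its own line terminator.
var lines = new List<string> { "first\n", "second  \n", "   \n", "\n" };

// Loop 1: drop lines at the end that are entirely whitespace.
while (lines.Count > 0 && string.IsNullOrWhiteSpace(lines[^1]))
    lines.RemoveAt(lines.Count - 1);

// Loop 2: the last remaining line has content, but may still end in
// whitespace and a newline; trim those characters off the joined result.
var result = new List<char>(string.Concat(lines));
while (result.Count > 0 && char.IsWhiteSpace(result[^1]))
    result.RemoveAt(result.Count - 1);

Console.WriteLine(new string(result.ToArray()) == "first\nsecond"); // True
```

After the first loop the list still ends with `"second  \n"`, so the second loop is what removes the two trailing spaces and the final newline.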

return line.GetSubSequence(TextSpan.FromBounds(0, current));
}

private static void BreakIntoLines(VirtualCharSequence characters, ArrayBuilder<VirtualCharSequence> lines)

Member

should virtualcharsequence have a split extension that takes in the characters to split on?

Member Author

possibly. though \r\n would be a PITA to have to deal with :(

Member

ah good point
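
For illustration, this is roughly why `\r\n` makes a generic "split on these characters" helper awkward. The sketch below works on plain strings, handles only `\r` and `\n` (C# recognizes further newline characters), and assumes each line keeps its terminator; it borrows the name `BreakIntoLines` but is not the PR's implementation.

```csharp
using System;
using System.Collections.Generic;

// Splits text into lines, keeping each line's terminator attached.
// "\r\n" is consumed as a single break rather than two.
static List<string> BreakIntoLines(string text)
{
    var lines = new List<string>();
    var start = 0;

    for (var i = 0; i < text.Length; i++)
    {
        if (text[i] == '\r' && i + 1 < text.Length && text[i + 1] == '\n')
            i++; // treat "\r\n" as one terminator
        else if (text[i] != '\r' && text[i] != '\n')
            continue; // ordinary character, keep scanning

        lines.Add(text.Substring(start, i - start + 1));
        start = i + 1;
    }

    if (start < text.Length)
        lines.Add(text.Substring(start)); // final line without a terminator

    return lines;
}

var broken = BreakIntoLines("a\r\nb\nc\rd");
Console.WriteLine(broken.Count); // 4

// Splitting on the individual characters sees "\r\n" as two separators
// and produces an extra empty entry between "a" and "b".
Console.WriteLine("a\r\nb\nc\rd".Split('\r', '\n').Length); // 5
```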

lines.RemoveAt(lines.Count - 1);

if (lines.Count == 0)
    return VirtualCharSequence.Empty;

Contributor

Should this check be moved up and prevent the fix from being offered? I can't think of why someone would have a bunch of whitespace in a verbatim string, but offering to convert it to an empty string is probably not what they want to happen.

Member Author

fair point. Updated to not offer in that case.

@CyrusNajmabadi CyrusNajmabadi merged commit ceadf94 into dotnet:main Apr 26, 2022
@ghost ghost added this to the Next milestone Apr 26, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the rawStringCleanupWhitespace branch April 27, 2022 04:58
@RikkiGibson RikkiGibson modified the milestones: Next, 17.3 P3 Jun 28, 2022
Successfully merging this pull request may close these issues.

"Convert to raw string" fixer which removes indentation and initial blank line from the string value
7 participants