Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache the unescaped form of globalconfig section names #59301

Merged

Conversation

jasonmalinowski
Copy link
Member

@jasonmalinowski jasonmalinowski commented Feb 4, 2022

We were unescaping section names each time we were trying to find the applicable config rules for a file; if we were computing options for each file in a compilation this resulted in a huge amount of duplicate work.

Fixes AB#1472298

@jasonmalinowski
Copy link
Member Author

@tmat @sharwell for review.

public GlobalAnalyzerConfig(AnalyzerConfig.Section globalSection, ImmutableArray<AnalyzerConfig.Section> namedSections)
{
GlobalSection = globalSection;
NamedSections = namedSections;

var unescapedSectionNames = ArrayBuilder<string?>.GetInstance();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually use the unescaped section name anywhere? Consider just updating the namedSections in place with the unescaped name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm thinking about it, maybe we can just do the escaping as we build the global config. I don't think the unescaped name is ever used anywhere, so we can probably just do it once upfront instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chsienki I was somewhat worried whether that might introduce new bugs since it means some section names are escaped and some aren't and it's more or less up to you to know what's what. Now that said, there's other code here:

if (!_values.TryGetValue(section.Name, out var sectionDict))
{
sectionDict = ImmutableDictionary.CreateBuilder<string, (string, string, int)>(Section.PropertiesKeyComparer);
_values.Add(section.Name, sectionDict);
}
_duplicates.TryGetValue(section.Name, out var duplicateDict);

That's using the escaped string when trying to merge sections. Looking at our lexer \ is used to escape and character, even if it otherwise doesn't require escaping, so it would mean Foo and F\oo will convert to the same string, but we won't merge the properties because we're dealing with escaped and not unescaped strings.

I can't tell if that's a bug in our lexer that we even allow such a construct (the .editorconfig specification says "Special characters can be escaped with a backslash so they won't be interpreted as wildcard patterns.", which isn't "all characters can be escaped..."), but in general this is arguing for doing the escaping earlier, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonmalinowski Yeah, the specification is annoyingly vague on that front. I'm not sure where I land on which way we should interpret it, but think the current behavior is probably fine. If you want a backslash you can escape it, so its at least consistent and keeps the lexer simpler.

That being said global configs are already not editor configs, so I think as long as we decide on the explicit set of rules there (and document them) we're probably ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chsienki Yeah, I'm OK keeping the lexer behavior. Does that then mean we have a bug for the Foo and Fo\o example? Because if we're calling that a bug then that absolutely motivates me to redo this PR to do the unescape process earlier and just update the section name directly since the more look the more I'm finding problems here otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think doing it earlier is the right approach here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I just went to check about path separators, and in calling out that \ is not a valid path seperator it actually lists its as \\
image

So I think the intention behind the spec is also that a backslash char is always an escape character. And we can just process up front and be correct / compliant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, time to rewrite this PR then!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my bad we don't have the bug. Once we're actually looping through global configs we don't break on the first one but keep searching regardless. Still going to simplify this though that way.

We were unescaping section names each time we were trying to find the applicable config rules for a file; if we were computing options for each file in a compilation this resulted in a huge amount of duplicate work.

Fixes AB#1472298
@jasonmalinowski jasonmalinowski dismissed sharwell’s stale review February 9, 2022 20:45

Fundamentally different approach taken, so resetting approval.

@jasonmalinowski
Copy link
Member Author

@sharwell @chsienki Rewritten to the other approach of just changing the section name right away.

@dotnet dotnet deleted a comment from azure-pipelines bot Feb 11, 2022
@jasonmalinowski
Copy link
Member Author

Thanks @jaredpar and @chsienki for the reviews!

@jasonmalinowski jasonmalinowski merged commit 73eeb59 into dotnet:main Feb 15, 2022
@jasonmalinowski jasonmalinowski deleted the fix-global-config-perf-issue branch February 15, 2022 00:54
@ghost ghost added this to the Next milestone Feb 15, 2022
@RikkiGibson RikkiGibson modified the milestones: Next, 17.2.P2 Mar 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants