Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix analyzer conflicts on Linux #69406

Merged
merged 6 commits into from
Aug 9, 2023
Merged

Fix analyzer conflicts on Linux #69406

merged 6 commits into from
Aug 9, 2023

Conversation

jaredpar
Copy link
Member

@jaredpar jaredpar commented Aug 4, 2023

This changes analyzer loading on Linux to be per user where before clashes could occur in the temp directory.

This fix initially started out by making our temp directory logic more robust. Essentially add per user sub directories, do extended permission checks, etc ... That exacerbated a number of existing corner cases in the code and forced us to put a lot more error handling into the workspaces layer.

After discussion with the IDE team I decided to take the approach of loading from a Stream on non-Windows platforms. The performance difference is neglible (1ms to 2ms per assembly) and it removes the overhead of copying them around on disk as well as the complexities that come with managing the shadow copy directory.

This will be observable to any analyzer that used Assembly.Location as it will now be "". That is only useful though for inspecting disks which is already illegal for analyzers hence it's considered an acceptable break.

To test the performance of this change I wrote a simple benchmark that loads assemblies directly from path, copy then load (simulate shadow copy) and loading them from stream. The results on Windows is pretty striking

Method Mean Error StdDev Median
LoadViaPath 1.552 ms 0.0387 ms 0.1142 ms 1.529 ms
LoadViaPathShadow 33.147 ms 0.9000 ms 2.6537 ms 33.883 ms
LoadViaStream 495.945 ms 9.5040 ms 9.7599 ms 493.297 ms

Notice the penalty of LoadFromStream is severe on Windows. That is due to the AV having to run on every load since it can't cache like it can when loading from path. On Linux though the results are much nicer

Method Mean Error StdDev Median
LoadViaPath 1.301 ms 0.0258 ms 0.0326 ms 1.299 ms
LoadViaPathShadow 4.126 ms 0.3567 ms 1.0060 ms 3.742 ms
LoadViaStream 2.514 ms 0.0501 ms 0.0536 ms 2.497 m

This means the change should be a small win on Linux but probably not in the noticable range.

closes #65415

This changes analyzer loading on Linux to be per user where before
clashes could occur in the temp directory.

This fix initially started out by making our temp directory logic more
robust. Essentially add per user sub directories, do extended permission
checks, etc ... That exacerbated a number of existing corner cases in
the code and forced us to put a lot more error handling into the
workspaces layer.

After discussion with the IDE team I decided to take the approach of
loading from a `Stream` on non-Windows platforms. The performance
difference is neglible (1ms to 2ms per assembly) and it removes the
overhead of copying them around on disk as well as the complexities that
come with managing the shadow copy directory.

This will be observable to any analyzer that used `Assembly.Location` as
it will now be `""`. That is only useful though for inspecting disks
which is already illegal for analyzers hence it's considered an
acceptable break.

closes dotnet#65415
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Analyzers untriaged Issues and PRs which have not yet been triaged by a lead labels Aug 4, 2023
@jaredpar jaredpar marked this pull request as ready for review August 7, 2023 16:47
@jaredpar jaredpar requested review from 333fred, dibarbet and a team as code owners August 7, 2023 16:47
@jaredpar
Copy link
Member Author

jaredpar commented Aug 7, 2023

@dotnet/roslyn-compiler PTAL
@jasonmalinowski PTAL at IDE changes

@@ -18,6 +18,7 @@ internal sealed class RemoteAnalyzerAssemblyLoader : AnalyzerAssemblyLoader

public RemoteAnalyzerAssemblyLoader(string baseDirectory)
{
// jason should we load from stream here on Linux or is this VS only?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonmalinowski I wanted you to look at this. The current state is always loading directly from disk (no shadow copy) and confirm it is the correct decision. If so I'll just leave this file alone (maybe add a comment to the effect). If this runs on Linux we could consider doing a stream based load here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file isn't used on Linux, so your question may be somewhat moot. But the comment here makes me think that this kicks in if we're loading analyzers that shipped in Roslyn itself, so reading directly from disk is fine since there's not locking concerns in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@genlu may also want to chime in here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the sole purpose of this is to load our own assemblies in ServiceHub process

@RikkiGibson RikkiGibson self-assigned this Aug 8, 2023
@jaredpar jaredpar removed the untriaged Issues and PRs which have not yet been triaged by a lead label Aug 9, 2023
@jaredpar jaredpar merged commit f7cb1f2 into dotnet:main Aug 9, 2023
27 checks passed
@jaredpar jaredpar deleted the temp-dir2 branch August 9, 2023 16:47
@ghost ghost added this to the Next milestone Aug 9, 2023
jaredpar added a commit to jaredpar/roslyn that referenced this pull request Aug 10, 2023
For dotnet#69406 I updated several places where the IDE created analyzers.
Even though I specifically called out the PR to the IDE team, I forgot
I'd changed their code and merged once I had two compiler sign offs.
This PR is commenting the parts of the IDE I changed in the PR so I can
get actual sign off on these changes.
Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments left but nothing concerning -- I would have signed off on the original PR (except for that line tagging me. 😄)

# Version 4.8.0

### Changed `Assembly.Location` behavior in non-Windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to adjust the title to say something about analyzers/generators so the context is clearer from the outset.

# Version 4.8.0

### Changed `Assembly.Location` behavior in non-Windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Massive kudos for remembering to update the docs!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kudos go to @jjonescz for reminding me to do this

#69406 (comment)

Comment on lines +158 to +159
using var stream = File.Open(assemblyPath, FileMode.Open, FileAccess.Read, FileShare.Read);
return LoadFromStream(stream);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random question: for the performance penalty on Windows due the virus scanner, is that because .NET is explicitly invoking the virus scanner on the stream data, or just because the virus scanner doesn't see this as a memory mapped file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime team said it has to do with caching. When the file is on disk, even when it's copied, the scans can be cached, while when it's in memory they cannot be. Feels like you could still do a SHA-256 cache but right now this is the reality 😦

@@ -17,14 +18,18 @@ internal sealed class DefaultAnalyzerAssemblyLoader : AnalyzerAssemblyLoader
{
#if NETCOREAPP

// Called from a netstandard2.0 project, so need to ensure a parameterless constructor is available.
internal DefaultAnalyzerAssemblyLoader()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @RikkiGibson's other question -- do we need this still? Because I don't quite understand the comment here that we had a net2.0 project calling something #ifdef'ed in a NETCOREAPP TFM?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a need for a parameterless ctor but I think it can be pulled out of the #if now and just have a single one.

/// will be the base directory where shadow copy assemblies are stored. </param>
internal static IAnalyzerAssemblyLoader CreateNonLockingLoader(string windowsShadowPath)
{
#if NETCOREAPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you restructure this as:

#if NETCOREAPP
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
    return new DefaultAnalyzerAssemblyLoader(loadOption: AnalyzerLoadOption.LoadFromStream);
}
#endif
// (the contents of createShadowLoaderWindows)

You can shorten this up a bit and also remove the local function. That might make it a bit clearer that we are only doing something special on NETCOREAPP and non-Windows.

src/Compilers/Shared/BuildServerConnection.cs Show resolved Hide resolved
@@ -18,6 +18,7 @@ internal sealed class RemoteAnalyzerAssemblyLoader : AnalyzerAssemblyLoader

public RemoteAnalyzerAssemblyLoader(string baseDirectory)
{
// jason should we load from stream here on Linux or is this VS only?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file isn't used on Linux, so your question may be somewhat moot. But the comment here makes me think that this kicks in if we're loading analyzers that shipped in Roslyn itself, so reading directly from disk is fine since there's not locking concerns in the first place.

@@ -18,6 +18,7 @@ internal sealed class RemoteAnalyzerAssemblyLoader : AnalyzerAssemblyLoader

public RemoteAnalyzerAssemblyLoader(string baseDirectory)
{
// jason should we load from stream here on Linux or is this VS only?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@genlu may also want to chime in here.

dibarbet added a commit to dibarbet/roslyn that referenced this pull request Aug 14, 2023
jaredpar added a commit that referenced this pull request Aug 17, 2023
* IDE changes for analyzer loading

For #69406 I updated several places where the IDE created analyzers.
Even though I specifically called out the PR to the IDE team, I forgot
I'd changed their code and merged once I had two compiler sign offs.
This PR is commenting the parts of the IDE I changed in the PR so I can
get actual sign off on these changes.

* PR feedback
@dibarbet dibarbet modified the milestones: Next, 17.8 P2 Aug 28, 2023
jaredpar added a commit to jaredpar/roslyn that referenced this pull request Nov 10, 2023
This fixes a bug where-by the compiler did not load satellite assemblies
properly on Unix and Linux. The compiler recently changed (dotnet#69406) our
loading strategy there to use `AssemblyLoadContext.LoadFromStream`. The
runtime fallback logic for loading satellite assemblies is effectively
the following:

1. Call `AssemblyLoadContext.Load(satelliteAssemblyName)`
2. If that fails call `AssemblyLoadContext.Load(realAssemblyName)` for
   the real assembly
3. Look at `Assembly.Location` on the return and change it to have the
   appropriate satellite path.
4. Call `AssemblyLoadContext.LoadFromPath(...)` on the result of (3)

That logic breaks when assemblies are loaded via `LoadFromStream` as
there is no location information.

This changes our `AssemblyLoadContext` implementation to just handle
satellite loads in step (1). This is done irrespective of whether the
loader is loading via path or via stream. The loader has the information
already so simpler to just handle it the same way every time.

closes dotnet#56708

Note: these are [the
docs](https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/loading-resources)
describing satellite assembly loading.
jaredpar added a commit to jaredpar/roslyn that referenced this pull request Nov 10, 2023
This fixes a bug where-by the compiler did not load satellite assemblies
properly on Unix and Linux. The compiler recently changed (dotnet#69406) our
loading strategy there to use `AssemblyLoadContext.LoadFromStream`. The
runtime fallback logic for loading satellite assemblies is effectively
the following:

1. Call `AssemblyLoadContext.Load(satelliteAssemblyName)`
2. If that fails call `AssemblyLoadContext.Load(realAssemblyName)` for
   the real assembly
3. Look at `Assembly.Location` on the return and change it to have the
   appropriate satellite path.
4. Call `AssemblyLoadContext.LoadFromPath(...)` on the result of (3)

That logic breaks when assemblies are loaded via `LoadFromStream` as
there is no location information.

This changes our `AssemblyLoadContext` implementation to just handle
satellite loads in step (1). This is done irrespective of whether the
loader is loading via path or via stream. The loader has the information
already so simpler to just handle it the same way every time.

closes dotnet#56708

Note: these are [the
docs](https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/loading-resources)
describing satellite assembly loading.
jaredpar added a commit that referenced this pull request Nov 15, 2023
* Handle satellite assembly loading in .NET Core

This fixes a bug where-by the compiler did not load satellite assemblies
properly on Unix and Linux. The compiler recently changed (#69406) our
loading strategy there to use `AssemblyLoadContext.LoadFromStream`. The
runtime fallback logic for loading satellite assemblies is effectively
the following:

1. Call `AssemblyLoadContext.Load(satelliteAssemblyName)`
2. If that fails call `AssemblyLoadContext.Load(realAssemblyName)` for
   the real assembly
3. Look at `Assembly.Location` on the return and change it to have the
   appropriate satellite path.
4. Call `AssemblyLoadContext.LoadFromPath(...)` on the result of (3)

That logic breaks when assemblies are loaded via `LoadFromStream` as
there is no location information.

This changes our `AssemblyLoadContext` implementation to just handle
satellite loads in step (1). This is done irrespective of whether the
loader is loading via path or via stream. The loader has the information
already so simpler to just handle it the same way every time.

closes #56708

Note: these are [the
docs](https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/loading-resources)
describing satellite assembly loading.

* fixup

* Apply suggestions from code review

Co-authored-by: Jan Jones <[email protected]>

* PR feedback

* build failure

* style issue

* Apply suggestions from code review

Co-authored-by: Rikki Gibson <[email protected]>

* PR feedback

---------

Co-authored-by: Jan Jones <[email protected]>
Co-authored-by: Rikki Gibson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Analyzer / generator shadow load directory needs to be per user on Unix
6 participants