Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visual Studio Deadlocks with latest beta #1697

Closed
tpaxatb opened this issue May 31, 2019 · 14 comments
Closed

Visual Studio Deadlocks with latest beta #1697

tpaxatb opened this issue May 31, 2019 · 14 comments
Labels

Comments

@tpaxatb
Copy link
Contributor

tpaxatb commented May 31, 2019

Somewhere between 5.0.0.beta.2+101 and the current 5.0.0.beta.3+29, Visual Studio will deadlock when loading a solution that contains project that references GitVersion nuget package. I just got back from being out of town and haven't been able to debug it, and won't be able to until the weekend at the earliest. I am going to hazard a guess that it is the rework of the nuget utils portion.

@arturcic
Copy link
Member

arturcic commented May 31, 2019

I'm just curious, if you could give a try for the package from the #1659 (comment). That is a PR that was implemented before those changes and it's not using nuget utils. Could you provide some feedback? I need some motivation to re-implement that PR ;)

@tpaxatb
Copy link
Contributor Author

tpaxatb commented May 31, 2019

OK I'll provide feedback as I am able to look at it (debugging the dependencies are very difficult, to say the least).

As an aside, and related to visual studio, the new dependency on the NuGetUtils vs the old UtilsNuget package breaks Visual Studio 2015, as the new NuGetUtils package is seemingly unavailable in that version. Honestly, that's a rather HUGE breaking change to require an upgrade to Visual Studio that not everyone is able to do immediately (for example, certain ISO specifications require the same tool-chain to be used until a re-certification against an upgraded tool-chain has been completed).

@arturcic
Copy link
Member

I agree, that's why in that PR I have tried to embed the Libgit2Sharp into the package, this way we don't need to resolve dynamically the nuget packages anymore, meaning we don't need NugetUtils

@tpaxatb
Copy link
Contributor Author

tpaxatb commented Jun 3, 2019

After being able to examine things, it appears that there is a massive issue when using all the tasks in the latest beta with the NuGetUtils package. I don't quite understand the underlying root cause, but taking the async/await pattern out of the toolset gets it to work. If I am being honest, removing that dependency (i.e making libgit2sharp a part of the package and not utilize nugetutils) is to me the better solution. I never understood or cared for the dependency on the NuGetUtils libraries, and it seems to cause more problems (e.g. this task deadlock) than it is intended to solve.

@dazinator
Copy link
Member

dazinator commented Jun 3, 2019

@stazz it looks like nugetutils is unfortunately causing a fair few problems which is a shame given all the effort you've put in which we are grateful for. @arturcic if we can get your pr tested a bit more I'd be open to removing nugetutils as I think it's unfair for us to keep burdening @stazz with issues. Just my current opinion, I'm not in charge by any means! @stazz @arturcic what do you think?

@stazz
Copy link
Contributor

stazz commented Jun 4, 2019

@tpaxatb Let me explain, since you don't understand the NuGetUtils purpose. When you create a custom MSBuild task as NuGet package, you need to provide the path to the assembly containing your task to the MSBuild. This works well as long as your task depends only on framework assemblies and MSBuild assemblies. But as soon as you add custom dependency, you get a problem: MSBuild loads the task assemblies in its own process, so you can't really use .deps.json or runtimeconfig.json files (AFAIK, this might have changed since beginning of creating UtilPack/NuGetUtils, but I doubt that) to load any dependencies.

The most simple solution here is to include all your dependencies into the same folder as your task assembly, allowing MSBuild to automatically load the dependencies. However, this creates another problems:

  1. The package size can grow to pretty big (if you have to include your task + all dependencies separately for .NET Core and .NET Desktop), and
  2. You still need to manage native dependency loading yourself, and that is drastically different in .NET Core vs .NET Desktop.

The UtilPack/NuGetUtils was created to solve those two problems: it will take care to load the dependencies of your task assembly from their respective NuGet packages so you won't need to include them in your own package manually. It also handles the native assembly load for both .NET Core and .NET Desktop, in configurable way. Furthermore, it even allows you to write your tasks for runtime like netstandard1.3, and will take care of loading correct framework and dependency assemblies during runtime, so you can write as much .NET Core/Desktop -agnostic code as possible.

Now, those previous problems that were going on, stem from two root causes:

  1. Some MSBuild runtimes (dotnet build for .NET Core, and Visual Studio environment) have locked down the NuGet assemblies version (UtilPack/NuGetUtils uses NuGet assemblies to perform package restore/dependency walk/etc), and
  2. NuGet assemblies get binary-incompatible change even between minor releases.

The old UtilPack task factory used NuGet assemblies directly, resulting in bad crashes (e.g. MethodNotFoundException) whenever it was run in such environment, where NuGet assemblies had binary-incompatible changes from the ones which were used for UtilPack task factory compilation. For example, upgrading from .NET Core 2.1 to 2.2. (or was it 2.0 to 2.1? I can't remember for sure) resulted in such errors.

The new NuGetUtils invokes all NuGet-specific functionality in separate sub-processes, which have bundled NuGet assemblies. This means that you won't get those MethodNotFoundExceptions even if the MSBuild environment has locked down NuGet assemblies version.

I hope now you ( @tpaxatb ) both understand and care for the NuGetUtils dependency after this explanation, and maybe what you say about it after this may be a bit more convincing.

@dazinator I do not take issues as a burden, but as a way to improve the code I've written. :) But I'm not quite sure if I have capacity to do anything big related to this during summer, unfortunately. So if bundling all your dependencies in GitVersionTask is not a problem for you guys, then you should go ahead with it.

About this actual issue (deadlock), does the original reporter @tpaxatb have any additional information, stack traces, error messages, reproduceable cases, etc? How did you "took out" the async/await pattern in your investigations, and from where? I'm looking at hints as to what truly is the cause of this problem, and whether the solution could be something simple. Even if you guys decide to drop NuGetUtils dependency, this still would be interesting to solve, as it may have effect on other projects you and me work with.

@dazinator
Copy link
Member

dazinator commented Jun 4, 2019

@stazz - thanks for the explanation. Just an FYI - I am completely aligned with the promise / goal of NuGetUtils - I think conceptually this is a nice problem for GitVersion to outsource as a concern. I personally would be pleased to see NuGetUtils grow more stable. However I can also understand the desire to remove it if it results in a class of errors being eliminated at the cost of larger package sizes and some additional complexity in GitVersionTask. I'll let others decide on that tradeoff :-). I am also a great believer in the open source community, and I think a stable NuGetUtils would be good for any projects that want cross platform msbuild tasks so would like to see it do well from that perspective too. My only suggestion would be, right now GitVersionTask seems to be the primary vehicle of issue discovery for NuGetUtils. It must be a nightmare trying to automate tests for the scenarios that NuGetUtils deals with (for example: Omnisharp & Mono (various versions), .NET Core + .NET Core MsBuIild (various versions), .NET Framework + Desktop MSBuild (Various versions) - not to mention minor platform variations across linux etc - however do you see a way in future to start putting in some automated tests for NuGetUtils?

@stazz
Copy link
Contributor

stazz commented Jun 4, 2019

@dazinator Thanks for the kind words! I agree with you - the promise is great, but as usually, the real world situations sometimes just have a very ugly heads, which tend to surface around too good promises. :)

I would love to see NuGetUtils to get fixed and improved by OSS community, and I like it when other people use my libraries. But if this deadlock (or some other issue) is really because of NuGetUtils, and is causing a lot of trouble for GitVersionTask, you should definitely try to go with the bundling all dependencies with the NuGet package itself. Since right now I am the sole developer of NuGetUtils, and, as you mentioned, GitVersionTask is the only user of the NuGetUtils. :) One can always re-introduce NuGetUtils to GitVersionTask later if such need arises.

About testing NuGetUtils - you are right that it is quite a nightmare. But I think it is still doable, now that Docker is such a big thing - at least for all the Linux versions. It seems that now there is also some .NET Framework Docker images as well now ( https://github.com/Microsoft/dotnet-framework-docker ) at least for some .NET Framework versions. But yeah, fully automating all the possible environment + version combos will be a huge task, not to mention that every such full test run would probably take quite a long time (especially considering that most likely some free-tier CI environment will be used). Currently there are some tests, but nothing end-to-end kind of big system test, which is a big minus of course. I hope I'll have time to some day to fix that.

@tpaxatb
Copy link
Contributor Author

tpaxatb commented Jun 4, 2019

When I said I removed the async/await pattern, what I mean is that I made everything sequential and not asynchronous task based. When doing this, it seems to work. It is a minor pain in the butt to debug, but I was finally able to do so with some finessing. Attaching a debugger to visual studio allowed me to discover that the await on the task _cache.DetectEnvironmentAsync call never continues to go on to even though the task has completed on the worker thread. I honestly don't know the root cause, if this is some interaction with console redirection from the sub process or a .NET framework bug, but the sub process await continues and I am able to see await be.LogProcessEndMessage( PROCESS, startTime ); continue and return. However this return value is never propagated to the caller...the line following the _cache.DetectEnvironmentAsync never gets hit, but the envoronment is in a Wait state as the call stack shows the thread is in a WaitForSingleObject call. I was able to get out of the deadlock by removing the asynchronous pattern and making all the calls synchronous. If the only reason these calls are being made asynchronous is to have the cancellation token for potential to kill the subprocess early, I would vote to remove the async/await pattern and lose the cancelability, as the factory entry point has to wait for completion, and the asynchronous pattern is adding a complexity level that could be introducing issues (i'm almost tempted to say it is due to the stdin redirect of the console application, but without further investigation don't know for sure).

All this stems from the fact that Visual Studio uses an inproc msbuild for non C#/VB projects, so I am probably the first person to run across this issue as I am using the GitVersionTask from a .wixproj. I understand the complexity and need for it...but the old UtilPack (that is included in older betas i.e. beta.2+101) is broken because the remoting object expires and VS has to be restarted (Visual Studio is not helpful here as it eats the exception so the only way you know what happened is by catching first chance exceptions). The new package NuGetUtils is broken because it causes Visual Studio to deadlock on project loading if there are inproc msbuild object factories in use. I honestly don't know the best solution moving forward; if it is to temporarily embed the git libs internally for now until the NuGetUtils package is updated and reintegrated or what. The biggest thing for ME is that GitVersionTask/Exe is the only solution out there that integrates very well into CI...

@arturcic
Copy link
Member

arturcic commented Jun 5, 2019

I'm just curious, if you could give a try for the package from the #1659 (comment). That is a PR that was implemented before those changes and it's not using nuget utils. Could you provide some feedback? I need some motivation to re-implement that PR ;)

@tpaxatb have you had a chance to try the package from that PR? as it has the Libgit2sharp embedded, with the native dlls.

@stazz
Copy link
Contributor

stazz commented Jun 5, 2019

@tpaxatb Thank you for detailed explanation! Yes, one reason for async/await pattern is to implement nice cancelability (in case in the future, MSBuild will have some kind of "InitializeAsync" and "ExecuteAsync" methods). But what you report is deeply disturbing: under certain preconditions (since it works outside Visual Studio), something causes the async/await code to hang indefinetely, while the same non-async code completes successfully.

One possible explanation might be that something awaits on the result concurrently. Since the Visual Studio loads everything in-process and concurrent calls may occur, there might be situation occurring which does not happen in other circumstances (e.g. using dotnet build/MSBuild.exe). As you can see, the UtilPack.AsyncLazy is used to encapsulate the asynchronous process invocation result, which has its own logic on how it handles the asynchronous lazy value creation.

@tpaxatb , since you already tried with correcting code and running it in Visual Studio, I have one request for you, if you may, please: Try surrounding the whole code of DetectEnvironmentAsync method in NuGetExecutionCache with lock(this._lock) { ... } (and add private Object _lock = new Object(); to fields), so that the the asynchronous awaiting would not happen concurrently. If the code works then, we know the real culprit (UtilPack.AsyncLazy) on this issue.

@tpaxatb
Copy link
Contributor Author

tpaxatb commented Jun 11, 2019

Sorry I have been a little busy... but since you can't lock around an await, that's kind of not possible. :)

@arturcic
Copy link
Member

@tpaxatb could you please check the latest 5.0.0-beta4-1 version?

@stale
Copy link

stale bot commented Sep 25, 2019

This issue has been automatically marked as stale because it has not had recent activity. After 30 days from now, it will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 25, 2019
@stale stale bot closed this as completed Oct 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants