Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dotnet msbuild commands sometimes hang after writing MSBuild_*.failure.txt files #1808

Closed
dougbu opened this issue Mar 7, 2017 · 10 comments
Closed

Comments

@dougbu
Copy link
Member

dougbu commented Mar 7, 2017

From @dougbu on February 25, 2017 23:21

Steps to reproduce

  1. clone the MVC repo (https://github.com/aspnet/Mvc.git)
  2. checkout the dougbu/does.it.hang.yes.it.does branch
  3. initialize the build and restore everything: .\build.cmd initialize
  4. either use build.cmd to compile (this involves wrapper / waiting sake.exe and dotnet processes) or run inner MsBuild command with stderr redirected. I.e. execute one of
  • .\build.cmd build-compile
  • dotnet msbuild /nologo "C:\dd\dnx\Universe\Mvc\.build/makefile.proj" "/p:LifecycleType=Standard;Configuration=Debug" /t:Compile 2>build.err

Have similar issues with 2>&1 |tee build.log, |tee build.log or 2>&1 in the second case. The unadorned second command and that with 2>build.err >build.log usually work fine.

Expected behavior

Build completes and exits cleanly.

Errors and warnings during the build are expected as well, especially MSB3277 due to .microsoft/vstest#393 or perhaps because Mvc.Core project depends on Microsoft.Extensions.DependencyModel version 1.2.0-* while the xUnit runner uses version 1.1.0.

Actual behavior

Commands hang consistently. (Hang may take some work in Task Manager due to dotnet/cli#4856.)

It's not predictable what project is last mentioned in build.log. But, hangs seem to occur when the build is near its end.

There's never just one dotnet process when hangs occur. Usually see the original command, the dotnet exec command (to invoke the latest copy of dotnet.exe), and at least one of the worker processes.

Environment data

dotnet --info output:

.NET Command Line Tools (1.0.0-rc4-004883)

Product Information:
 Version:            1.0.0-rc4-004883
 Commit SHA-1 hash:  fa2c9e025c

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.14393
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Users\dougbu\AppData\Local\Microsoft\dotnet\sdk\1.0.0-rc4-004883

Additional information

I have process dumps captured in Visual Studio and Task Manager. If they will help, contact me at my Microsoft email (same alias as here) and I'll share.

Problems appear specific to this branch. Command-line builds complete reliably in the dougbu/migration branch for example. I cleaned up some dotnet migrate artifacts in the dougbu/migration branch and the *.sln files are different. But, largest change is probably a fix that's only in the dougbu/migration branch to build just Mvc.sln and not all three solutions.

Copied from original issue: dotnet/cli#5849

@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

FYI @Eilon this is the hang we discussed a couple of days back.

@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

Behaviour more consistent after I rebased dougbu/does.it.hang.yes.it.does to include latest from the Mvc dev branch. Problems now sometimes reproduce without redirects. Likely the redirects just change timings slightly and previously made races occur more often.

But, largest change is probably a fix that's only in the dougbu/migration branch to build just Mvc.sln and not all three solutions.

I tested in the dougbu/migration branch after backing out that fix. No hangs.

@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

Just saw a message about MSBuild_*.failure.txt files written into $env:temp. These messages aren't normally output. But, I've got files for all or most hangs I've experienced this morning. Contents include

UNHANDLED EXCEPTIONS FROM PROCESS 16096:
=====================
2/27/2017 09:27:51
System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Build.BackEnd.NodePacketTranslator.SerializedException.ToException(SerializedException serializedException)
   at Microsoft.Build.BackEnd.NodePacketTranslator.NodePacketReadTranslator.TranslateException(Exception& value)
   at Microsoft.Build.Execution.BuildResult.Microsoft.Build.BackEnd.INodePacketTranslatable.Translate(INodePacketTranslator translator)
   at Microsoft.Build.BackEnd.NodePacketTranslator.NodePacketReadTranslator.Translate[T](T& value)
   at Microsoft.Build.BackEnd.BuildRequestUnblocker.Translate(INodePacketTranslator translator)
   at Microsoft.Build.BackEnd.BuildRequestUnblocker.FactoryForDeserialization(INodePacketTranslator translator)
   at Microsoft.Build.BackEnd.NodePacketFactory.PacketFactoryRecord.DeserializeAndRoutePacket(Int32 nodeId, INodePacketTranslator translator)
   at Microsoft.Build.BackEnd.NodePacketFactory.DeserializeAndRoutePacket(Int32 nodeId, NodePacketType packetType, INodePacketTranslator translator)
   at Microsoft.Build.Execution.OutOfProcNode.Microsoft.Build.BackEnd.INodePacketFactory.DeserializeAndRoutePacket(Int32 nodeId, NodePacketType packetType, INodePacketTranslator translator)
   at Microsoft.Build.BackEnd.NodeEndpointOutOfProcBase.RunReadLoop(Stream localReadPipe, Stream localWritePipe, Queue`1 localPacketQueue, AutoResetEvent localPacketAvailable, AutoResetEvent localTerminatePacketPump)
===================

and / or (or is much more common)

UNHANDLED EXCEPTIONS FROM PROCESS 16360:
=====================
2/27/2017 09:37:13
Microsoft.Build.Shared.InternalErrorException: MSB0001: Internal MSBuild Error: Items already exist for target CoreBuild.
   at Microsoft.Build.Shared.ErrorUtilities.ThrowInternalError(String message, Object[] args)
   at Microsoft.Build.Shared.ErrorUtilities.VerifyThrow(Boolean condition, String unformattedMessage, Object arg0)
   at Microsoft.Build.Execution.BuildResult.AddResultsForTarget(String target, TargetResult result)
   at Microsoft.Build.BackEnd.TargetBuilder.<ProcessTargetStack>d__22.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Build.BackEnd.TargetBuilder.<BuildTargets>d__11.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<BuildProject>d__57.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<BuildAndReport>d__49.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<RequestThreadProc>d__48.MoveNext()
===================

Of course, these failures should not result in hangs.

@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

@pranavkm please add your smaller (IdentityServer) repro steps to this bug.

@dougbu dougbu changed the title dotnet msbuild commands hang with some command-line redirections dotnet msbuild commands sometimes hang after writing MSBuild_*.failure.txt files Mar 7, 2017
@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

Renamed issue because redirection is not required to see the hangs.

@dougbu
Copy link
Member Author

dougbu commented Mar 7, 2017

/cc @piotrpMSFT since you put the old bug into a milestone.

@pranavkm
Copy link
Contributor

pranavkm commented Mar 7, 2017

Saw this in our CI and locally recently when building this specific commit https://github.com/aspnet/IdentityService/commit/7f3707520cc5bba05947ef8a660b3c41eb5968a7:

  1. src\Microsoft.AspNetCore.Identity.Service\Microsoft.AspNetCore.Identity.Service.csproj failed to compile due to TreatWarningsAsErrors:
ApplicationManager.cs(13,22): error CS0649: Field 'ApplicationManager<TApplication>._disposed' is never assigned to, and will always have its default value false [IdentityService\src\Microsoft.AspNetCore.Identity.Service\Microsoft.AspNetCore.Identity.Service.csproj]
  1. Projects referencing it failed to build. The error message doesn't seem particularly correct:
dotnet\sdk\1.0.0\Sdks\Microsoft.NET.Sdk\build\Microsoft.NET.Sdk.targets(92,5): error : Cannot find project info for 'IdentityService\src\Microsoft.AspNetCore.Identity.Service\Microsoft.AspNetCore.Identity.Service.csproj'. This can indicate a missing project reference. [IdentityService\src\Microsoft.AspNetCore.Identity.Service.EntityFrameworkCore\Microsoft.AspNetCore.Identity.Service.EntityFrameworkCore.csproj]
dotnet\sdk\1.0.0\Sdks\Microsoft.NET.Sdk\build\Microsoft.NET.Sdk.targets(92,5): error : Cannot find project info for 'IdentityService\src\Microsoft.AspNetCore.Identity.Service\Microsoft.AspNetCore.Identity.Service.csproj'. This can indicate a missing project reference. [IdentityService\test\Microsoft.AspNetCore.Identity.Service.Test\Microsoft.AspNetCore.Identity.Service.Test.csproj]

MSBuild prints this error and ends up hanging. Our CI eventually killed it after 240 minutes of inactivity:
MSBUILD : error MSB4166: Child node "5" exited prematurely. Shutting down. Diagnostic information may be found in files in the temporary files directory named MSBuild_*.failure.txt.

Failure log file says:

UNHANDLED EXCEPTIONS FROM PROCESS 150400:
=====================
3/7/2017 8:41:02 AM
Microsoft.Build.Shared.InternalErrorException: MSB0001: Internal MSBuild Error: Items already exist for target CoreBuild.
   at Microsoft.Build.Shared.ErrorUtilities.ThrowInternalError(String message, Object[] args)
   at Microsoft.Build.Shared.ErrorUtilities.VerifyThrow(Boolean condition, String unformattedMessage, Object arg0)
   at Microsoft.Build.Execution.BuildResult.AddResultsForTarget(String target, TargetResult result)
   at Microsoft.Build.BackEnd.TargetBuilder.<ProcessTargetStack>d__22.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Build.BackEnd.TargetBuilder.<BuildTargets>d__11.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<BuildProject>d__57.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<BuildAndReport>d__49.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Build.BackEnd.RequestBuilder.<RequestThreadProc>d__48.MoveNext()
===================

@natemcmaster
Copy link
Contributor

This happened again today with MSBuild 15.2.47.30403 on .NET Core. I've gathered process dumps of MSBuild in its hung state. Let me know if you'd like me to email them to you.

@mikeharder
Copy link

Simplified Repro Steps

  1. VM with clean Win10
  2. Install .NET 4.5.1 and 4.5.2 targeting packs
    A. https://www.microsoft.com/en-us/download/details.aspx?id=40772
    B. https://www.microsoft.com/en-us/download/details.aspx?id=42637
  3. Unzip https://dotnetcli.azureedge.net/dotnet/Sdk/2.0.0-preview1-005724/dotnet-dev-win-x64.2.0.0-preview1-005724.zip
  4. git clone https://github.com/aspnet/IdentityService
  5. cd IdentityService
  6. git checkout 7f3707520cc5bba05947ef8a660b3c41eb5968a7 (known good commit)
  7. ..\dotnet-dev-win-x64.2.0.0-preview1-005724\dotnet.exe restore /p:TargetFrameworks=netcoreapp1.1
  8. ..\dotnet-dev-win-x64.2.0.0-preview1-005724\dotnet.exe build /p:TargetFrameworks=netcoreapp1.1

AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue May 3, 2017
This patch addresses a fatal error and potential node crash in MSBuild
in a very specific condition.

Consider project A builds project B, C, B (using MSBuild task), project
B builds project C, and project C produces an error. In a multi-proc
build, the master node may build A, start building B and yield execution
for C to be built (on another node). On another node, the dependencies
of B are built and failed, returning the results to the master node.
When the execution of B resumes, the failure occurs trying to set the
result of the error target (target result already exists).

This change calls the CheckSkipTarget method to determine if the target
has already been built in the case of an error. Previously this check
was made when checking dependencies of the target to inject or executing
the target. I believe this was an oversight in the original change that
was made to address this issue in a previous release of Visual Studio.

Note: The scenario described is implemented in unit test
VerifyMultipleRequestForSameProjectWithErrors_ErrorAndContinue. However,
since this issue causes a node to crash the unit test written did not
detect the more serious exception since the build was intended to fail
anyway. With this change, the test now passes.

Closes dotnet#1808
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue May 3, 2017
Addresses an issue in
VerifyMultipleRequestForSameProjectWithErrors_ErrorAndContinue where the
test will fail (writing an MSBuild_*.failure.txt file in the temp
folder) but is not detected because a failure was expected. The primary
change is to detect the number of MSBuild_*.failure.txt files and assert
the value is unchanged before and after the test runs.

I used this opportunity to write TestEnvironment helper class to track
environmental variants before and after an individual test executes.
This class defines a common set of things that should not change (temp
path, environment variables, etc.) as a TestVariant and things that are
expected to change (environment variables, temp files/folders, etc.) as
a TestInvariant which is automatically reverted when the TestEnvironment
is disposed. This class is generic and can be extended to additional
cases.

Related to dotnet#1808
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue May 3, 2017
Addresses an issue in
VerifyMultipleRequestForSameProjectWithErrors_ErrorAndContinue where the
test will fail (writing an MSBuild_*.failure.txt file in the temp
folder) but is not detected because a failure was expected. The primary
change is to detect the number of MSBuild_*.failure.txt files and assert
the value is unchanged before and after the test runs.

I used this opportunity to write TestEnvironment helper class to track
environmental variants before and after an individual test executes.
This class defines a common set of things that should not change (temp
path, environment variables, etc.) as a TestVariant and things that are
expected to change (environment variables, temp files/folders, etc.) as
a TestInvariant which is automatically reverted when the TestEnvironment
is disposed. This class is generic and can be extended to additional
cases.

Related to dotnet#1808
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue May 4, 2017
Addresses an issue in
VerifyMultipleRequestForSameProjectWithErrors_ErrorAndContinue where the
test will fail (writing an MSBuild_*.failure.txt file in the temp
folder) but is not detected because a failure was expected. The primary
change is to detect the number of MSBuild_*.failure.txt files and assert
the value is unchanged before and after the test runs.

I used this opportunity to write TestEnvironment helper class to track
environmental variants before and after an individual test executes.
This class defines a common set of things that should not change (temp
path, environment variables, etc.) as a TestVariant and things that are
expected to change (environment variables, temp files/folders, etc.) as
a TestInvariant which is automatically reverted when the TestEnvironment
is disposed. This class is generic and can be extended to additional
cases.

Related to dotnet#1808
@dougbu
Copy link
Member Author

dougbu commented May 4, 2017

@AndyGerlicher does 1022b48 correct MSBuild hangs in general? Or, is it a targeted fix for hangs during compilation? I ask because we've seen hangs while MSBuild is testing projects and / or without *.failure.txt files.

Sticking to the *.failure.txt cases, does 1022b48 avoid both stack traces shown in my comment above?

And, which MSBuild release will include 1022b48?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants