[v7.0.0-rc.1.22426.10] Extremely slow build of non-portable runtime on armv7 #75811

Closed
ayakael opened this issue Sep 18, 2022 · 27 comments
Labels
area-Infrastructure source-build Issues relating to dotnet/source-build

Comments

@ayakael
Contributor

ayakael commented Sep 18, 2022

The non-portable build of runtime v7.0.0-rc.1.22426.10 is extremely slow in an alpine.3.17-arm environment. The portable build with RID linux-musl-arm has no issue, and v6.0.9 also builds without issue.

Environment:

  • armv8l environment (i.e. 32-bit userspace in 64-bit aarch64 environment)
  • Alpine Edge (3.17)
  • clang / llvm version 14

Any pointers on what might be the problem with this?

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 18, 2022
@jkotas
Member

jkotas commented Sep 18, 2022

How did you build the runtime that is very slow? Is it a Release build?

build.sh produces Debug builds by default. Debug builds have asserts on and compiler optimizations off, which makes them multiple times slower than Release builds. Pass -c Release to build.sh to produce a Release build.

@ayakael
Contributor Author

ayakael commented Sep 18, 2022

My apologies for not specifying: this is the runtime as built within the source-build tarball. Here are the build settings:

/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/src/runtime/build.sh  --ci --configuration Release --restore --build --pack --publish -bl /p:ArcadeBuildFromSource=true /p:CopyWipIntoInnerSourceBuildRepo=true /p:DotNetBuildOffline=true /p:CopySrcInsteadOfClone=true /p:DotNetPackageVersionPropsPath="/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/obj/arm/Release/PackageVersions.props" /p:AdditionalSourceBuiltNupkgCacheDir="/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/obj/arm/Release/blob-feed/packages/" /p:ReferencePackageNupkgCacheDir="/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/packages/reference/" /p:PreviouslySourceBuiltNupkgCacheDir="/usr/lib/dotnet/artifacts/7.0.100_rc1/" /p:TargetRid=alpine.3.17-arm /p:SourceBuildNonPortable=true  /p:DotNetPackageVersionPropsPath=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/obj/arm/Release/PackageVersions.props /p:DotNetRestoreSourcePropsPath=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/obj/arm/Release/RestoreSources.props /p:DotNetBuildOffline=true
    Log: /builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/logs/runtime.log
    With Environment Variables:
      DotNetBuildFromSource=true
      DotNetBuildFromSourceFlavor=Product
      DotNetPackageVersionPropsPath=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/artifacts/obj/arm/Release/PackageVersions.props
      DotNetRestorePackagesPath=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/packages/restored/
      DotNetBuildOffline=true
      AddDotnetfeedProjectSource=false
      DOTNET_INSTALL_DIR=/builds/ayakael/aports/testing/dotnet7-build/src/bootstrap/
      DOTNET_PATH=/builds/ayakael/aports/testing/dotnet7-build/src/bootstrap/
      DOTNET_HOST_PATH=/builds/ayakael/aports/testing/dotnet7-build/src/bootstrap/dotnet
      _InitializeDotNetCli=/builds/ayakael/aports/testing/dotnet7-build/src/bootstrap/
      _DotNetInstallDir=/builds/ayakael/aports/testing/dotnet7-build/src/bootstrap/
      _InitializeToolset=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/Tools/source-built/Microsoft.DotNet.Arcade.Sdk/tools/Build.proj
      _OverrideArcadeInitializeBuildToolFramework=net7.0
      DotNetUseShippingVersions=true
      PreReleaseVersionLabel=rc.1
      PackageVersionStamp=rc.1
      PB_VersionStamp=rc.1
      ContinuousIntegrationBuild=true
      MSBUILDDISABLENODEREUSE=1
      OfficialBuildId=20220826.10
      BUILD_BUILDNUMBER=20220826.10
      GitCommitCount=
      GitCommitHash=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      GitInfoCommitHash=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      SourceRevisionId=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      RepositoryCommit=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      COMMIT_SHA=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      GIT_COMMIT=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      RepositoryType=Git
      DeterministicSourcePaths=true
      SourceRoot=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/src/runtime/
      BuildInParallel=false
      LatestCommit=06aceb7015f3bd2ff019ef5920d2354eb2ea2c92
      SOURCE_BUILT_SDK_ID_ARCADE=Microsoft.DotNet.Arcade.Sdk
      SOURCE_BUILT_SDK_ID_ARCADE_PACKAGING=Microsoft.DotNet.Build.Tasks.Packaging
      SOURCE_BUILT_SDK_ID_ARCADE_TGT_FX=Microsoft.DotNet.Build.Tasks.TargetFramework
      SOURCE_BUILT_SDK_ID_ARCADE_SHARED_FX_SDK=Microsoft.DotNet.SharedFramework.Sdk
      SOURCE_BUILT_SDK_VERSION_ARCADE=7.0.0-beta.22418.4
      SOURCE_BUILT_SDK_VERSION_ARCADE_PACKAGING=7.0.0-beta.22418.4
      SOURCE_BUILT_SDK_VERSION_ARCADE_TGT_FX=7.0.0-beta.22418.4
      SOURCE_BUILT_SDK_VERSION_ARCADE_SHARED_FX_SDK=7.0.0-beta.22418.4
      SOURCE_BUILT_SDK_DIR_ARCADE=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/Tools/source-built/Microsoft.DotNet.Arcade.Sdk/
      SOURCE_BUILT_SDK_DIR_ARCADE_PACKAGING=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/Tools/source-built/Microsoft.DotNet.Build.Tasks.Packaging/
      SOURCE_BUILT_SDK_DIR_ARCADE_TGT_FX=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/Tools/source-built/Microsoft.DotNet.Build.Tasks.TargetFramework/
      SOURCE_BUILT_SDK_DIR_ARCADE_SHARED_FX_SDK=/builds/ayakael/aports/testing/dotnet7-build/src/dotnet-v7.0.100-rc.1.22431.12/Tools/source-built/Microsoft.DotNet.SharedFramework.Sdk/

Indeed, the configuration is set to Release.

@jkotas jkotas added the source-build Issues relating to dotnet/source-build label Sep 18, 2022
@ayakael
Contributor Author

ayakael commented Sep 19, 2022

For reference, a source-build build on aarch64 takes 1h48min. A build on the same machine in armv7 user space takes 7h50min, most of which (about 6 hours) is the non-portable runtime build.
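
To put those numbers in perspective, the arithmetic can be sketched in shell. The durations are the ones quoted above; the function and variable names are illustrative only, and the 6-hour runtime figure is approximate.

```shell
#!/bin/sh
# Convert "hours minutes" into total minutes.
to_minutes() {
    echo $(( $1 * 60 + $2 ))
}

aarch64_total=$(to_minutes 1 48)   # full source-build on aarch64
armv7_total=$(to_minutes 7 50)     # full source-build in armv7 user space
runtime_armv7=$(to_minutes 6 0)    # approximate share spent in the runtime build

# POSIX sh only has integer arithmetic, so scale by 100 to keep two decimals.
slowdown_x100=$(( armv7_total * 100 / aarch64_total ))
runtime_share_pct=$(( runtime_armv7 * 100 / armv7_total ))

printf 'armv7 build takes %d.%02dx as long as aarch64\n' \
    $(( slowdown_x100 / 100 )) $(( slowdown_x100 % 100 ))
printf 'runtime accounts for about %d%% of the armv7 build\n' "$runtime_share_pct"
```

In other words, the armv7 build takes a bit over four times as long, and roughly three quarters of that time is the runtime step.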

@jkotas
Member

jkotas commented Sep 19, 2022

It means that some build commands take a lot longer in non-portable build. Are you able to identify these from the logs?

@ayakael
Contributor Author

ayakael commented Sep 19, 2022

Interesting - the extreme slowness only occurs with gcc. In clang mode (with -clang), both aarch64 and armv7 build at about the same speed. I don't know if this is reproducible on other (non-musl) platforms. As a mitigation, I'll use clang to build dotnet on armv7.

@ayakael
Contributor Author

ayakael commented Sep 19, 2022

The above comment is a lie: the slowness on gcc was just a fluke, so I'm basically back to zero. Building just the runtime in non-portable mode outside of source-build finishes in 15 minutes, just like its portable counterpart. Moving to source-build inflates that to hours. Essentially everything gets slower; it's not just one step that slows down. Of course, having timestamps on everything would be great for measuring how long each step takes, but unfortunately I do not know of any switch that would do that.
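
Independent of any build switch, one generic way to get timestamps on everything is to pipe the build output through a small filter. The sketch below is only an illustration: `stamp` is a hypothetical helper name, and the usage line uses placeholder build arguments rather than the actual source-build invocation.

```shell
#!/bin/sh
# Prefix every line read from stdin with the current wall-clock time.
stamp() {
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Hypothetical usage (arguments are placeholders):
#   ./build.sh -c Release 2>&1 | stamp | tee build-stamped.log
```

Spawning `date` once per line adds some overhead, which should be negligible next to a build that runs for hours.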

@hoyosjs
Member

hoyosjs commented Sep 19, 2022

I wonder if using -bl would help us check here for some high level broad timestamps.

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

I'm now trying a build with /p:LogVerbosity:diag; if memory serves, that enables more detailed log output. Right now I'm testing the possibility that my bootstrap is to blame. The latter is the product of a linux-musl-x64 to linux-musl-armv7 cross-build using:

ROOTFS_DIR="$CBUILDROOT" ./build.sh \
		-c Release \
		-arch $_dotnet_target \
		-clang \
		/p:NoPgoOptimize=true \
		/p:EnableNgenOptimization=false \
		/p:VersionSuffix=${_runtimever##*-} \
		/p:GitCommitHash=$(cat ./.git/HEAD)

The full script here also cross-builds llvm, sdk, aspnetcore, installer and roslyn.

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

Could setting EnableNgenOptimization=false and NoPgoOptimize=true be the cause of this? I had issues building with those optimizations due to the lack of PGO packages, so I disabled them.

@hoyosjs
Member

hoyosjs commented Sep 20, 2022

Not something I expect to be so bad - provided that everything is crossgened all should work. And these switches don't provide that functionality.

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

I tried full build with gcc and it fails with error: '_alloca' was not declared in this scope. Between a broken build and a slow one, the slow one is better.

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

More data: I did some more in-depth babysitting today, taking note of how long things take. The clang part of the build doesn't seem to be slower; what's really slow seems to be generating the DLLs and nupkgs of the non-portable runtime. The bootstrap is confirmed not to be the issue, and this would explain the out-of-source-build difference, as in those tests I did not specify --pack.

Now my question is, why would --pack on runtime be slower on armv7 when the build is non-portable?

@jkotas
Member

jkotas commented Sep 20, 2022

The bulk of the work done by --pack is compression.

Non-portable builds use compression from the distro instead of the ones that come with the runtime (https://github.com/dotnet/runtime/tree/main/src/native/external). Maybe the compression that comes from the distro is much slower, possibly due to confusion about compression levels?

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

The bulk of the work done by --pack is compression.

Non-portable builds use compression from the distro instead of the ones that come with the runtime (https://github.com/dotnet/runtime/tree/main/src/native/external). Maybe the compression that comes from the distro is much slower, possibly due to confusion about compression levels?

Is there a setting that can disable use of system compression to rule that out?

I also need to test --publish for unusual slowness.

@jkotas
Member

jkotas commented Sep 20, 2022

Is there a setting that can disable use of system compression to rule that out?

Actually, we seem to be using the system libz everywhere, so I am not sure what can be the problem.

@ayakael
Contributor Author

ayakael commented Sep 20, 2022

Is there a setting that can disable use of system compression to rule that out?

Actually, we seem to be using the system libz everywhere, so I am not sure what can be the problem.

Indeed, it is neither pack nor publish: I can't reproduce the issue outside of source-build when packing and publishing. I'm out of ideas on what's causing this, so I'm going to take the hit in builder time that this represents. It's just nuts how dotnet can take 8 hours to build on armv7 vs 2h on aarch64 on the same hardware. It really makes no sense.

@ayakael
Contributor Author

ayakael commented Sep 27, 2022

Setting --consoleLoggerParameters:ShowTimestamp has added timestamps to my logs. A significant amount of time is spent on oob -> Trimming alpine.3.17-arm out-of-band assemblies with ILLinker... operations. This is the strongest lead yet: might it be due to trimming? What would slow trimming down to a crawl? Once the build is complete, I'll share the aarch64 vs arm logs.
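
Once a log carries timestamps, the slowest steps can be ranked mechanically. The following is a rough sketch, assuming each relevant log line starts with an `HH:MM:SS` timestamp (the exact format emitted by the logger is an assumption here) and using a hypothetical helper name, `slowest_steps`. It attributes to each line the time that passed until the next timestamped line appeared.

```shell
#!/bin/sh
# Read a timestamped log on stdin and print "<seconds><TAB><line>" for the
# lines that were followed by the longest pauses. $1 = how many to show.
slowest_steps() {
    awk '
        match($0, /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
            split(substr($0, 1, 8), t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) {
                gap = now - prev_time
                if (gap < 0) gap += 86400   # the build ran past midnight
                printf "%d\t%s\n", gap, prev_line
            }
            prev_time = now; prev_line = $0; seen = 1
        }
    ' | sort -rn | head -n "${1:-10}"
}

# Hypothetical usage: slowest_steps 20 < runtime.log
```

Lines without a leading timestamp are ignored, and the last timestamped line has no successor, so it never receives a duration.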

@ViktorHofer
Member

ViktorHofer commented Sep 27, 2022

Yeah, that step trims all the net7.0 out-of-band assemblies (for the current platform):

<ILLink AssemblyPaths=""
        RootAssemblyNames="@(OOBAssemblyToTrim)"
        ReferenceAssemblyPaths="@(OOBAssemblyReference)"
        OutputDirectory="$(OOBAssembliesTrimmedArtifactsPath)"
        ExtraArgs="$(OOBILLinkArgs)"
        ToolExe="$(_DotNetHostFileName)"
        ToolPath="$(_DotNetHostDirectory)" />

Presumably that means that the linker itself is the slow component on this platform?

@ayakael
Contributor Author

ayakael commented Sep 27, 2022

Might be related to dotnet/linker#2975. I think I'm building with a bootstrapped version of illink that predates rc1.

@ayakael
Contributor Author

ayakael commented Sep 28, 2022

Building with Microsoft.NET.ILLink.7.0.100-1.22423.4.nupkg rather than Microsoft.NET.ILLink.7.0.100-1.22377.1.nupkg shows no difference. That step still takes >1 hour on armv7 rather than the usual 20 seconds on aarch64.

@jkotas
Member

jkotas commented Sep 28, 2022

You should be able to attach a profiler or debugger to the process to figure out where it is spending so much time. I would start with https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-counters to check on the overall health of the process.

The trimming step is an optimization that can be skipped. The build is still going to produce working binaries, they are just going to be a bit bigger.

@ayakael
Contributor Author

ayakael commented Sep 28, 2022

You should be able to attach a profiler or debugger to the process to figure out where it is spending so much time. I would start with https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-counters to check on the overall health of the process.

The trimming step is an optimization that can be skipped. The build is still going to produce working binaries, they are just going to be a bit bigger.

What flag can be set to skip trimming? Once linker is confirmed to be the issue, I'll spin up an arm32 environment and explore this further.

@jkotas
Member

jkotas commented Sep 28, 2022

I do not see any flag that can be used to skip trimming. You would need to edit the build files, e.g. replace

<ILLink AssemblyPaths=""
        RootAssemblyNames="@(OOBAssemblyToTrim)"
        ReferenceAssemblyPaths="@(OOBAssemblyReference)"
        OutputDirectory="$(OOBAssembliesTrimmedArtifactsPath)"
        ExtraArgs="$(OOBILLinkArgs)"
        ToolExe="$(_DotNetHostFileName)"
        ToolPath="$(_DotNetHostDirectory)" />
with copying from OOBAssemblyToTrim to OOBAssembliesTrimmedArtifactsPath.

@ViktorHofer
Member

Note that this step doesn't impact the output binaries at all. This is just a validation step that writes to OOBAssembliesTrimmedArtifactsPath which never gets used.

If you would like to skip this step locally, you could just remove the target entirely:

<Target Name="TrimOOBAssemblies"
        AfterTargets="Build"
        Condition="'$(RefOnly)' != 'true' and '@(OOBAssembly)' != ''"
        DependsOnTargets="GetTrimOOBAssembliesInputs;PrepareForAssembliesTrim"
        Inputs="$(ILLinkTasksAssembly);@(OOBAssemblyToTrim);@(OOBAssemblyReference);@(OOBLibrarySuppressionsXml)"
        Outputs="$(OOBAssembliesMarkerFile)">
  <Message Text="$(MSBuildProjectName) -> Trimming $(PackageRID) out-of-band assemblies with ILLinker..." Importance="high" />
  <PropertyGroup>
    <OOBILLinkArgs>$(ILLinkArgs)</OOBILLinkArgs>
    <!-- Unnecessary suppressions - disable for now since we didn't clean the runtime yet -->
    <OOBILLinkArgs>$(ILLinkArgs) --nowarn IL2121</OOBILLinkArgs>
    <OOBILLinkArgs Condition="'@(OOBLibrarySuppressionsXml)' != ''" >$(OOBILLinkArgs) --link-attributes &quot;@(OOBLibrarySuppressionsXml->'%(FullPath)', '&quot; --link-attributes &quot;')&quot;</OOBILLinkArgs>
  </PropertyGroup>
  <ILLink AssemblyPaths=""
          RootAssemblyNames="@(OOBAssemblyToTrim)"
          ReferenceAssemblyPaths="@(OOBAssemblyReference)"
          OutputDirectory="$(OOBAssembliesTrimmedArtifactsPath)"
          ExtraArgs="$(OOBILLinkArgs)"
          ToolExe="$(_DotNetHostFileName)"
          ToolPath="$(_DotNetHostDirectory)" />
  <!-- Create a marker file which serves as the target's output to enable incremental builds. -->
  <MakeDir Directories="$([System.IO.Path]::GetDirectoryName('$(OOBAssembliesMarkerFile)'))" />
  <Touch Files="$(OOBAssembliesMarkerFile)"
         AlwaysCreate="true" />
</Target>

@steveisok steveisok removed the untriaged New issue has not been triaged by the area owner label Sep 28, 2022
@ayakael
Contributor Author

ayakael commented Oct 17, 2022

The current release/7.0.1xx branch of installer does not seem to have this issue. Closing, as it seems fixed.

@ayakael ayakael closed this as completed Oct 17, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 16, 2022