Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: repo consolidation scouting kick-off - make clr build locally on Windows #1

Closed
wants to merge 0 commits into from
Closed

Conversation

trylek
Copy link
Member

@trylek trylek commented Oct 15, 2019

This is my first contribution to the scouting to let other people
see the first problems I'm hitting as I believe them to be mostly
general. With this small change I now have at least the native part
of CoreCLR build running - I'm not yet past that so then I'll see
what other problems I hit when building the managed components,
building and running tests.

As you can easily see, the first (and so far the only) problem is
the split of the two roots (subrepo root vs. repo root) making the
build script fail when trying to locate "msbuild.ps1" which is now
under a repo-global eng folder. I can imagine it may be undesirable
to query GIT in the build script; in this particular case we have
multiple alternate ways to proceed (keep the duplicates of
msbuild.ps1? autodetecting the root based on some well-known file?).

Thanks

Tomas

@jkoritzinsky
Copy link
Member

I'm seeing a similar issue in core-setup.

@trylek trylek requested a review from dagood October 15, 2019 17:54
@ViktorHofer
Copy link
Member

Can you use the RepositoryEngineeringDir variable?

@jkoritzinsky
Copy link
Member

@ViktorHofer this is just to find the eng/common folder before even invoking the scripts.

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

Hmm, where is this supposed to be coming from? I'm normally building the repo from the Windows prompt - who is going to set the variable there? Or are you saying we should go through some central script that would set the variable? That would probably solve the problem but we're back to chicken-egg problem w.r.t. how to locate such a global script?

@ViktorHofer
Copy link
Member

Arcade defines that property, so you would get it if you import the Arcade SDK. Are you saying that at this point you don't have Arcade imported yet?

If that's the case we can probably manually define it in a runtime root's Directory.Build.props file which is then imported by the repos.

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

In fact, that might be one of the solutions - we would check in a special script at the root of each of the repos being merged, that would set the path to the repo root. In the source repos that would equal the repo root while we would massage it during the port to point to the global root.

@jkoritzinsky
Copy link
Member

@ViktorHofer we haven't even found MSBuild yet, let alone invoked a build with the Arcade SDK at this point.

I encountered the same issue in core-setup's build.cmd just to find Arcade's build.ps1.

@ViktorHofer
Copy link
Member

OK got it.

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

I am afraid that we're still on the chicken-egg plan as in CoreCLR running "msbuild.ps1" is exactly the thing that requires locating the repo root.

@ViktorHofer
Copy link
Member

OK there are multiple options here.

  1. Importing a cmd/sh script that defines those paths that lives either in the parent (src) directory or parent-parent (repo root) dir?
  2. Hardcoding the relative path to the repo root (../../)
  3. Using git to resolve to the repo-root

Other ideas?

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

Circling back to my suggestion to "check in a script at the root of each repo being merged", that may be also usable for other things like exporting an environment variable declaring whether we're in the split or merged repo.

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

[not sure what kind of script that should be, ideally something that doesn't require two versions for Windows vs. non-Windows]

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

I think I like Jared's suggestion to ideally make most of the changes in the original repos so that we can arbitrarily repeat this process. So I think hard-coding .... is not ideal unless it comes from a script the porting process will create. I leave it to others to chime in as to whether it's legal to query git or not. I don't think it's strictly illegal, I think that the BuildVersion script is doing that already, and both in the lab and locally we always have a legal GIT repository available whenever we run the build script. Considering the dichotomy is supposed to be temporary, using the GIT magic may be a reasonable workaround.

@ViktorHofer
Copy link
Member

I agree with you. We should make most changes in the original repository. Just as an FYI, in cases where this isn't possible, you can create a GIT patch and place it here: https://github.com/dotnet/consolidation/tree/master/patches/coreclr

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

Conceptually, if we decide to emit / special case some scripts in the subrepos to interface the subrepos to their "embedding" within either the split or consolidated repo, we might also want to introduce similar per-subrepo interfacing YAML files under eng somewhere, going back to Jared's proposal for encoding the script / source location details via variables.

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

Ahh, thanks Viktor for making me aware of the patch list, that's very nice!

@trylek
Copy link
Member Author

trylek commented Oct 15, 2019

OK, so if I may try to divide the problem space, I believe that we're basically in agreement that we need some environment variable representing the "main root repo" that translates to the normal thing for the existing repos like CoreCLR and to the root of the consolidated repo in the merged case. In such case I can continue making further progress using my temporary git query hack while we discuss how to tackle this cleanly as there seem to be several available options.

@crummel
Copy link
Contributor

crummel commented Oct 15, 2019

Source-build does have a requirement to be able to build without a valid Git repo since we package sources and tools into a tarball without the accompanying Git repo. We could work with an environment variable override or similar though.

NuGet.config Outdated
@@ -14,6 +14,7 @@
<add key="dotnet-core" value="https://dotnetfeed.blob.core.windows.net/dotnet-core/index.json" />
<add key="dotnet-coreclr" value="https://dotnetfeed.blob.core.windows.net/dotnet-coreclr/index.json" />
<add key="dotnet-windowsdesktop" value="https://dotnetfeed.blob.core.windows.net/dotnet-windowsdesktop/index.json" />
<add key="myget.org dotnet-core" value="https://dotnet.myget.org/F/dotnet-core/api/v3/index.json" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trylek I think we should try to get rid of this feed in coreclr before we consolidate as myget is unreliable. What assets are you currently pushing there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is only needed by crossgen2 and R2RDump - without the additional feed they cannot find the [old] System.CommandLine parser package. I have already switched SuperIlc over to use the new parser and it would be probably feasible to do the same for the two other apps if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC there was some work in the dotnet/jitutils repo that was waiting on R2RDump to move off the old System.CommandLine, so there'd be a number of people happy about removing old System.CommandLine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

myget is unreliable but I wouldn't block consolidation on it.

My mental model for should or shouldn't we require this before consolidation is roughly this (assuming it's not a functional blocker);

Could we fix this in the consolidated repository in less than one business day. If so don't block consolidation on it because it's something we could iterate on pretty fast.

eng/Tools.props Outdated
<?xml version="1.0" encoding="utf-8"?>
<Project>
<PropertyGroup>
<RestoreSources Condition="'$(AdditionalRestoreSources)' != ''">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, RestoreSources aren't honored by Arcade anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I actually don't know what exactly this file is needed for, I just see that in its absence CoreCLR build seems unable to install "dotnet.exe" (dotnet-install) and subsequently pull down some packages needed by the build like Microsoft.DotNet.Build.Tasks.Packaging. Having said that, I fully admit I've never been a Nuget expert so I may be easily misinterpreting the symptome.


# Always use the local repo packages directory instead of
# the user's NuGet cache
$script:useGlobalNuGetCache = $false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make sure that coreclr works with the global nuget cache enabled. This is the default for corefx and it would be really sad if we need to go back to the old world with the repo packages dir.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the repo package level directory has helped us achieve higher fidelity in our CI. At the moment I believe arcade still uses the global nuget cache as well, but our goal pre-consolidation was to move as much as we could to the repo level nuget cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have some data on that? We aren't hitting any issues with using the user global nuget cache locally and the repo package cache in CI (the default behavior in an arcade enabled repo) in corefx.

but our goal pre-consolidation was to move as much as we could to the repo level nuget cache

Can you please elaborate on that? Not sure what you mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that the default global cache is on the home drive. Which for vms is the most expensive and smallest size disk. This equates to a performance problem, for our arm hardware our home disks are extremely small. Helix also consumes most of the home disk, so accidentally filling the disk with the global nuget cache on the main disk used to happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the comment. It is more of less whet I said. Coreclr’s expectation was to continue to use the non global nuget cache. I think there are still blockers keeping us from migrating away from that plan.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, that makes sense. What I don't understand is why this causes a problem as on CI the user global cache isn't used by default (arcade convention).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# commands like init-tools so for now we need to disable
# using the globally installed dotnet

$script:useInstalledDotNetCli = $false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the SDK here. CoreClr isn't using buildtools anymore, why is still required?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is possibly an artifact of our older hack to support building with arcade and buildtools. I am not familiar exactly with what can be removed.

@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
kunalspathak referenced this pull request in kunalspathak/runtime Sep 18, 2020
Only build Windows_NT_x64, disable test builds and runs
rokonec added a commit to rokonec/runtime that referenced this pull request Oct 22, 2020
- removing var where unclear
- remove outdated comments
- using local variable as oppose to static
rokonec added a commit to rokonec/runtime that referenced this pull request Nov 8, 2020
- renaming LUT
- swapping leaf flag meaning (MSB)
- typos
rokonec added a commit to rokonec/runtime that referenced this pull request Nov 9, 2020
- fix shift count computation
- change acc size to 16 bits and append to it by byte
rokonec added a commit to rokonec/runtime that referenced this pull request Nov 10, 2020
- change local variables from ushort to int
kouvel referenced this pull request in kouvel/runtime Nov 11, 2020
…ding in some cases

1. When suspending for the debugger is in progress (the debugger is waiting for some threads to reach a safe point for suspension), a thread that is not yet suspended may trigger another runtime suspension. This is currently not allowed because the order of operations conflicts with requirements to send GC events for managed data breakpoints to work correctly when suspending for a GC. Instead, the thread suspends for the debugger first, and after the runtime is resumed, continues suspending for GC.
2. At the same time, if the thread that is not suspended yet is in a forbid-suspend-for-debugger region, it cannot suspend for the debugger, which conflicts with the above scenario, but is currently necessary for the issue fixed by dotnet#40060
3. The current plan is to change managed data breakpoints implementation to pin objects instead of using GC events to track object relocation, and to deprecate the GC events APIs
4. With that, the requirement in #1 goes away, so this change conditions the check to avoid suspending the runtime during a pending suspension for the debugger when GC events are not enabled

- Verified that the latest deadlock seen in dotnet#42375 manifests only when a data breakpoint set and not otherwise
- Combined with dotnet#44471 and a VS update to use that to switch to the pinning mechanism, the deadlock issue seen above should disappear completely
rokonec added a commit to rokonec/runtime that referenced this pull request Nov 29, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants