Rewrite the entire Azure DevOps build system #15808

Merged (130 commits) on Aug 11, 2023

DHowett (Member) commented Aug 8, 2023

This pull request rewrites the entire Azure DevOps build system.

The guiding principles behind this rewrite are:

  • No pipeline definitions should contain steps (or tasks) directly.
  • All jobs should be in template files.
  • Any set of steps that is reused across multiple jobs must be in
    template files.
  • All artifact names can be customized (via a property called
    artifactStem on all templates that produce or consume artifacts).
  • No compilation happens outside of the "Build" phase, to consolidate
    the production and indexing of PDBs.
  • Building the project produces a bin directory. That bin
    directory is therefore the primary currency of the build. Jobs will
    either produce or consume bin if they want to do anything with the
    build outputs.
  • All step and job templates are named with step or job first,
    which disambiguates them in the templates directory.
  • Most jobs can be run on different pools, so that we can put
    expensive jobs on expensive build agents and cheap jobs on cheap
    build agents. Some jobs handle pool selection on their own, however.

Our original build pipelines used the VSBuild task all over the
place. This resulted in Terminal being built in myriad ways, different
for every pipeline. There was an attempt at standardization early on,
where ci.yml consumed jobs and steps templates... but when
release.yml was added, all of that went out the window.

The new pipelines are consistent and focus on a small, well-defined set
of jobs:

  • job-build-project
    • This is the big one!
    • Takes a list of build configurations and platforms.
    • Produces an artifact named build-PLATFORM-CONFIG for the entire
      matrix of possibilities.
    • Optionally signs the output and produces a bill of materials.
    • Admittedly has a lot going on.
  • job-build-package-wpf
    • Takes a list of build configurations and platforms.
    • Consumes the build- artifact for every config/platform
      possibility, plus one for "Any CPU" (hardcoded; this is where the
      .NET code builds)
    • Produces one wpf-nupkg-CONFIG for each configuration, merging
      all platforms.
    • Optionally signs the output and produces a bill of materials.
  • job-merge-msix-into-bundle
    • Takes a list of build configurations and platforms.
    • Consumes the build- artifact for every config/platform
    • Produces one appxbundle-CONFIG for each configuration, merging
      all platforms for that config into one msixbundle.
    • Optionally signs the output and produces a bill of materials.
  • job-package-conpty
    • Takes a list of build configurations and platforms.
    • Consumes the build- artifact for every config/platform
    • Produces one conpty-nupkg-CONFIG for each configuration, merging
      all platforms.
    • Optionally signs the output and produces a bill of materials.
  • job-test-project
    • Takes one build config and one platform.
    • Consumes build-PLATFORM-CONFIG
    • Selects its own pools (hardcoded) because it knows about
      architectures and must choose the right agent arch.
    • Runs tests (directly on the build agent).
  • job-run-pgo-tests
    • Just like the above, but runs tests where IsPgo is true
    • Collects all of the PGO counts and publishes a pgc-intermediates
      artifact for that platform and configuration.
  • job-pgo-merge-pgd
    • Takes one build config and multiple platforms.
    • Consumes build-$platform-CONFIG for each platform.
    • Consumes pgc-intermediates-$platform-CONFIG for each platform.
    • Merges the pgc files into pgd files
    • Produces a new pgd- artifact.
  • job-pgo-build-nuget-and-publish
    • Consumes the pgd- artifact from above.
    • Packs it into a nupkg and publishes it.
  • job-submit-windows-vpack
    • Only expected to run against Release.
    • Consumes the appxbundle-CONFIG artifact.
    • Publishes it to a vpack for Windows to consume.
  • job-check-code-format
    • Does not use artifacts. Runs clang-format.
  • job-index-github-codenav
    • Does not use artifacts.
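
The artifact naming scheme in the job list above can be sketched in a few lines of Python. This is an illustrative model, not code from this pull request: the function names are hypothetical, and appending `artifactStem` to the end of the name is an assumption (the description only says that the stem customizes artifact names).

```python
from itertools import product


def artifact_name(prefix: str, platform: str, config: str, stem: str = "") -> str:
    """Return e.g. 'build-x64-Release'. Appending the stem last is an assumption."""
    return f"{prefix}-{platform}-{config}{stem}"


def build_matrix(platforms, configs, stem=""):
    """One 'build-PLATFORM-CONFIG' artifact per platform/config pair."""
    return [artifact_name("build", p, c, stem) for p, c in product(platforms, configs)]


def bundle_inputs(platforms, configs, stem=""):
    """Map each per-config bundle artifact to the build artifacts it would consume."""
    return {
        f"appxbundle-{c}{stem}": [artifact_name("build", p, c, stem) for p in platforms]
        for c in configs
    }


if __name__ == "__main__":
    # Hypothetical platform and configuration lists, for illustration only.
    for name in build_matrix(["x64", "arm64"], ["Release"]):
        print(name)
    print(bundle_inputs(["x64", "arm64"], ["Release"]))
```

The point of the model is that a consuming job only needs the same platform list, configuration list, and stem as the producing job in order to find its inputs.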

Fuzz submission is broken due to changes in the onefuzz client.

I have removed the compliance and security build because it is no longer
supported.

Finally, this pull request has some additional benefits:

  • I've expanded the PGO build phase to cover ARM64!
  • We can remove everything Helix-related except the WTT parser
    • We no longer depend on Helix submission or Helix pools
  • The WPF control's inner DLLs are now codesigned (#15404, "The WPF nuget package isn't signed by our CI")
  • Symbols for the WPF control, both .NET and C++, are published
    alongside all other symbols.
  • The files we submit to ESRP for signing are batched up into a single
    step (see footnote 1)

Closes #11874
Closes #11974
Closes #15404

Footnotes

  1. This will have to change if we want to sign the individual
    per-architecture .appx files before bundling so that they can be
    directly installed.

DHowett marked this pull request as ready for review on August 9, 2023 23:03

env:
  target_exe_path: $(Build.ArtifactStagingDirectory)/$(artifactName)/Fuzzing/x64/test/OpenConsoleFuzzer.exe
  test_name: WriteCharsLegacy
- job:

DHowett (Member, Author) commented:

ugh i said no pipeline yml file should ever have steps in it, but here I am violating that rule

@@ -1,5 +1,27 @@
trigger: none
pr: none
schedules:
- cron: "0 5 * * 2-6" # Run at 05:00 UTC Tuesday through Saturday (Even later than Localization, after the work day in Pacific, Mon-Fri)

DHowett (Member, Author) commented:

I've switched the PGO build to have its own built-in nightly schedule, rather than relying on the triggers from the Azure DevOps side.
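
As a sanity check on the schedule, here is a small Python sketch that expands the day-of-week field of the cron expression (`2-6`). It assumes the common cron convention in which 0 is Sunday; the helper name is hypothetical and this is not how Azure DevOps evaluates schedules internally.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def cron_weekdays(field: str) -> list:
    """Expand a cron day-of-week field such as '2-6' or '1,3' into day names."""
    days = set()
    for part in field.split(","):
        if part == "*":
            days.update(range(7))
        elif "-" in part:
            low, high = (int(x) for x in part.split("-"))
            days.update(d % 7 for d in range(low, high + 1))
        else:
            days.add(int(part) % 7)
    return [DAY_NAMES[d] for d in sorted(days)]


if __name__ == "__main__":
    # Day-of-week field taken from the schedule "0 5 * * 2-6".
    print(cron_weekdays("2-6"))
```

Because 05:00 UTC falls on the previous evening in Pacific time, a Tuesday through Saturday UTC schedule lines up with Monday through Friday evenings there, which is what the inline comment in the diff describes.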

- cron: "0 5 * * 2-6" # Run at 05:00 UTC Tuesday through Saturday (Even later than Localization, after the work day in Pacific, Mon-Fri)
  displayName: "Nightly Instrumentation Build"
  branches:
    include:

DHowett (Member, Author) commented:

In theory, we could automatically PGO release- branches if I had a way to set the branding correctly...

    - main
  always: false # only run if there's code changes!

parameters:

DHowett (Member, Author) commented:

reviewer note: this file is an excellent overview of how the simple rules fit together.

PGO does this:

  • builds for x64, arm64 at the same time
  • after BOTH of those are done, runs PGO on x64 and arm64 at the same time
  • after both of THOSE are done, runs a single job to take all pgc files for all architectures and merge them into pgd files per architecture
  • after that, takes all merged PGD files and slaps them into a nuget
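
The ordering in the list above is a small dependency graph. The following Python sketch models it with the standard library's `graphlib` and groups the jobs into waves that can run at the same time. The job labels are hypothetical shorthand, not the names used in the pipeline.

```python
from graphlib import TopologicalSorter

# Each key depends on every job in its value set. Labels are hypothetical.
PGO_DEPENDENCIES = {
    "build-x64": set(),
    "build-arm64": set(),
    "pgo-tests-x64": {"build-x64", "build-arm64"},
    "pgo-tests-arm64": {"build-x64", "build-arm64"},
    "merge-pgd": {"pgo-tests-x64", "pgo-tests-arm64"},
    "build-nuget-and-publish": {"merge-pgd"},
}


def waves(graph):
    """Group jobs into waves; jobs within one wave have no pending dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


if __name__ == "__main__":
    for number, jobs in enumerate(waves(PGO_DEPENDENCIES), start=1):
        print(number, jobs)
```

With these dependencies the sketch yields four waves: both builds, both PGO test runs, the single merge job, and finally the NuGet packaging job.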

buildConfigurations: [Release]
buildEverything: true
pgoBuildMode: Instrument
artifactStem: -instrumentation

DHowett (Member, Author) commented:

This is a great use of artifactStem. It looks like this in the backend:

[image]

includePseudoLoc: true

- ${{ if eq(parameters.buildWPF, true) }}:
# Add an Any CPU build flavor for the WPF control bits

DHowett (Member, Author) commented:

This one is interesting. It takes the normal build-project job and runs it AGAIN with a different set of parameters. The most important one is buildPlatforms = [Any CPU]

generateSbom: ${{ parameters.generateSbom }}
codeSign: ${{ parameters.codeSign }}
beforeBuildSteps: # Right before we build, lay down the universal package and localizations
- task: PkgESSetupBuild@12

DHowett (Member, Author) commented:

This stuff is specific to the release build, and so it gets inserted by release.yml ONLY

Comment on lines +49 to +65
- pwsh: |-
    $Arch = '${{ platform }}'
    $Conf = '${{ parameters.buildConfiguration }}'
    $PGCDir = '$(Build.SourcesDirectory)/pgc/${{ platform }}/${{ parameters.buildConfiguration }}'
    $PGDDir = '$(Build.SourcesDirectory)/pgd/${{ platform }}/${{ parameters.buildConfiguration }}'
    # Flatten the PGD directory
    Get-ChildItem $PGDDir -Recurse -Filter *.pgd | Move-Item -Destination $PGDDir -Verbose
    Get-ChildItem $PGCDir -Filter *.pgc |
      ForEach-Object {
        $Parts = $_.Name -Split "!";
        $_ | Add-Member Module $Parts[0] -PassThru
      } |
      Group-Object Module |
      ForEach-Object {
        & "$(VCToolsInstallDir)\bin\Hostx64\${{ platform }}\pgomgr.exe" /merge $_.Group.FullName "$PGDDir\$($_.Name).pgd"
      }
  displayName: Merge PGO Counts for ${{ platform }}

DHowett (Member, Author) commented:

this is the replacement for that weird nuget package called pgo-helpers
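
For readers less familiar with PowerShell, the grouping logic in the quoted step can be modeled in Python. This sketch only builds the `pgomgr /merge` argument lists and executes nothing; the file names in the example are hypothetical, and the function names are not from this pull request.

```python
from collections import defaultdict
from pathlib import PurePath


def group_pgc_by_module(pgc_paths):
    """Group .pgc paths by module name, i.e. the file name up to the first '!'."""
    groups = defaultdict(list)
    for path in pgc_paths:
        module = PurePath(path).name.split("!")[0]
        groups[module].append(path)
    return dict(groups)


def merge_commands(pgc_paths, pgd_dir, pgomgr="pgomgr.exe"):
    """Build one 'pgomgr /merge' argument list per module (nothing is executed)."""
    grouped = group_pgc_by_module(pgc_paths)
    return [
        [pgomgr, "/merge", *files, f"{pgd_dir}\\{module}.pgd"]
        for module, files in sorted(grouped.items())
    ]


if __name__ == "__main__":
    # Hypothetical count files from two test runs of two modules.
    example = [
        "pgc/OpenConsole!1.pgc",
        "pgc/OpenConsole!2.pgc",
        "pgc/WindowsTerminal!1.pgc",
    ]
    for command in merge_commands(example, "pgd"):
        print(command)
```

Each module ends up with exactly one merge invocation that receives all of its count files, which mirrors what `Group-Object Module` does in the pipeline step.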

Comment on lines 66 to 82
# - task: PowerShell@2
#   displayName: 'Convert Test Logs from WTL to xUnit format'
#   inputs:
#     targetType: filePath
#     filePath: build\Helix\ConvertWttLogToXUnit.ps1
#     arguments: -WttInputPath '${{ parameters.testLogPath }}' -WttSingleRerunInputPath 'unused.wtl' -WttMultipleRerunInputPath 'unused2.wtl' -XUnitOutputPath 'onBuildMachineResults.xml' -TestNamePrefix '$(BuildConfiguration).$(BuildPlatform)'
#   condition: ne(variables['PGOBuildMode'], 'Instrument')
#
# - task: PublishTestResults@2
#   displayName: 'Upload converted test logs'
#   condition: ne(variables['PGOBuildMode'], 'Instrument')
#   inputs:
#     testResultsFormat: 'xUnit' # Options: JUnit, NUnit, VSTest, xUnit, cTest
#     testResultsFiles: '**/onBuildMachineResults.xml'
#     testRunTitle: 'On Build Machine Tests' # Optional
#     buildPlatform: $(BuildPlatform) # Optional
#     buildConfiguration: $(BuildConfiguration) # Optional

DHowett (Member, Author) commented:

Oh, i've gotta delete these commented craps

microsoft-github-policy-service (bot) added the Issue-Bug, Issue-Task, Area-WPFControl, and Product-Terminal labels on Aug 10, 2023

DHowett (Member, Author) commented Aug 10, 2023

So @carlos-zamora, you were asking about running things on the build agents. I would direct you to the job-run-pgo-tests job template. It is very simple: download bin for the right architecture, and run stuff inside it.

DHowett (Member, Author) commented Aug 10, 2023

[image] Hey, it's a lot of builds when you do this kind of thing!

DHowett (Member, Author) commented Aug 10, 2023

@zadjii-msft your tests are failing on all architectures!

DHowett (Member, Author) commented Aug 10, 2023

PGO publish failed because there's already a package with that version number on the feed (this is expected)

lhecker (Member) left a comment:

Yep, yes, of course. Clearly.

carlos-zamora (Member) left a comment:

This is awesome! Thanks for doing this.

@@ -7,6 +7,7 @@ pr:
paths:
include:
- src/features.xml
- build/pipelines/feature-flag-ci.yml

A reviewer (Member) commented:

Why does this file have to include itself??

DHowett (Member, Author) replied:

So! This file isn't including itself in the sense that it will run yaml contained inside itself twice. This is part of the paths "included" in the trigger for this pipeline. i.e. this pipeline should run when either of these files is changed!
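
The idea behind the path filter can be illustrated with a deliberately simplified Python model. It assumes a changed file matches when it equals an included path or lies underneath it; wildcards and exclude lists are ignored, and the function name is hypothetical.

```python
def pipeline_should_run(changed_files, include_paths):
    """Return True if any changed file equals, or lies under, an included path."""

    def matches(changed, included):
        included = included.rstrip("/")
        return changed == included or changed.startswith(included + "/")

    return any(matches(c, i) for c in changed_files for i in include_paths)


if __name__ == "__main__":
    include = ["src/features.xml", "build/pipelines/feature-flag-ci.yml"]
    # Editing the pipeline definition itself re-runs the pipeline.
    print(pipeline_should_run(["build/pipelines/feature-flag-ci.yml"], include))
    # An unrelated change does not.
    print(pipeline_should_run(["README.md"], include))
```

Listing the pipeline's own file among the included paths means that a change to the pipeline definition is validated by a run of that same pipeline.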

DHowett merged commit 69eff7e into main on Aug 11, 2023 (19 checks passed)
DHowett deleted the dev/duhowett/try-to-merge-the-build-rules branch on August 11, 2023 19:06
DHowett added a commit that referenced this pull request Aug 11, 2023
(cherry picked from commit 69eff7e)
Service-Card-Id: 90183388
Service-Version: 1.18
DHowett added a commit that referenced this pull request Aug 11, 2023
(cherry picked from commit 69eff7e)
Service-Card-Id: 90183387
Service-Version: 1.17
crutkas pushed a commit to microsoft/PowerToys that referenced this pull request Sep 25, 2024
This pull request rewrites the entire Azure DevOps build system.

The guiding principles behind this rewrite are:

- No pipeline definitions should contain steps (or tasks) directly.
- All jobs should be in template files.
- Any set of steps that is reused across multiple jobs must be in
  template files.
- All artifact names can be customized (via a property called
  `artifactStem` on all templates that produce or consume artifacts).
- No compilation happens outside of the "Build" phase, to consolidate
  the production and indexing of PDBs.
- All step and job templates are named with `step` or `job` _first_,
  which disambiguates them in the templates directory.
- Most jobs can be run on different `pool`s, so that we can put
  expensive jobs on expensive build agents and cheap jobs on cheap
  build agents. Some jobs handle pool selection on their own, however.

Our original build pipelines used the `VSBuild` task _all over the
place._ This resulted in PowerToys being built in myriad ways, different
for every pipeline. There was an attempt at standardization early on,
where `ci.yml` consumed jobs and steps templates... but when
`release.yml` was added, all of that went out the window.

It's the same story as Terminal (microsoft/terminal#15808).

The new pipelines are consistent and focus on a small, well-defined set
of jobs:

- `job-build-project`
    - This is the big one!
    - Takes a list of build configurations and platforms.
    - Produces an artifact named `build-PLATFORM-CONFIG` for the entire
      matrix of possibilities.
    - Builds all of the installers.
    - Optionally signs the output (all of the output).
    - Admittedly has a lot going on.
- `job-test-project`
    - Takes **one** build config and **one** platform.
    - Consumes `build-PLATFORM-CONFIG`
    - Selects its own pools (hardcoded) because it knows about
      architectures and must choose the right agent arch.
    - Runs tests (directly on the build agent).
- `job-publish-symbols-using-symbolrequestprod-api`
    - Consumes `**/*.pdb` from all prior build phases.
    - Uploads all PDBs in one artifact to Azure DevOps
    - Uses Microsoft's internal symbol publication REST API to submit
      stripped symbols to MSDL for public consumption.

Finally, this pull request has some additional benefits:

- Symbols are published to the private and public feeds at the same
  time, in the same step. They should be available in the public symbol
  server for public folks to debug against!
- We have all the underpinnings necessary to run tests on ARM64 build
  agents.
    - Right now, `ScreenResolutionUtility` is broken
    - I had to introduce a custom version of `UseDotNet` which would
      install the right architecture (🤦); see microsoft/azure-pipelines-tasks#20300.
- All dotnet and nuget versioning is consolidated into a small set of
  step templates.
- This will provide a great place for us to handle versioning changes
  later, since all versioning happens in one place.
Labels
  • Area-Build: Issues pertaining to the build system, CI, infrastructure, meta
  • Area-WPFControl: Things related to the WPF version of the TermControl
  • Issue-Bug: It either shouldn't be doing this or needs an investigation.
  • Issue-Task: It's a feature request, but it doesn't really need a major design.
  • Product-Meta: The product is the management of the products.
  • Product-Terminal: The new Windows Terminal.
  • zBugBash-Consider
4 participants