Update concurrency rules and limit build queue to one #2932

Open
wants to merge 11 commits into
base: master

bmorelli25
Member

@bmorelli25 bmorelli25 commented Feb 9, 2024

Summary

This PR builds on #2931.

Goal 1: Set concurrency at the pipeline level, not the step level.

Pipeline-level concurrency isn't a feature of BK: https://forum.buildkite.community/t/pipeline-concurrency-limit/240. Instead, you have to set concurrency limits on each step in a build within a pipeline. In this PR, I use a concurrency gate, as it seems to be the cleanest approach.

In the following screenshot you can see how the concurrency gate prevents new builds from starting the "Build the docs" step and instead holds them at the "Start concurrency gate" step:

Screenshot 2024-02-09 at 2 19 21 PM
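
As a rough illustration of the gate behavior described above, here is a toy Python model. It is not Buildkite code and not this PR's pipeline config: the `ConcurrencyGate` class and its method names are hypothetical, and the sketch only assumes that one build at a time may be past the gate and that waiting builds are admitted in arrival order.

```python
from collections import deque


class ConcurrencyGate:
    """Toy model of a concurrency gate (hypothetical, for illustration only)."""

    def __init__(self):
        self.inside = None      # build currently past the gate, if any
        self.waiting = deque()  # builds held at the gate, in arrival order

    def enter(self, build):
        """A build reaches the gate; it either passes or is held."""
        if self.inside is None:
            self.inside = build
            return "building"
        self.waiting.append(build)
        return "waiting"

    def leave(self):
        """The build past the gate finishes; admit the next one, if any."""
        self.inside = self.waiting.popleft() if self.waiting else None
        return self.inside


if __name__ == "__main__":
    gate = ConcurrencyGate()
    print(gate.enter(1))  # building
    print(gate.enter(2))  # waiting
    print(gate.leave())   # 2
```

In this model, build 2 only starts building once build 1 has left the gate, which is the behavior the screenshot shows.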

Goal 2: Set a queue limit of one running build and one queued build.

The downside of concurrency limits paired with scheduled builds is that the queue for builds will grow if/when we have a rebuild. This happens because our rebuild build time is way longer (70 mins) than our scheduled build time (30 mins).
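
To make the queue growth concrete, here is a small back-of-the-envelope sketch. It reads the 70 and 30 minutes above as build durations and assumes a hypothetical schedule that triggers a new build every 30 minutes; the schedule interval and the `max_waiting` helper are assumptions for illustration, not values or code from this pipeline.

```python
def max_waiting(durations, interval):
    """Largest number of builds waiting at once when one build runs at a time.

    durations: run time in minutes of each build, in arrival order.
    interval: minutes between scheduled build triggers (assumed value).
    """
    free_at = 0  # minute at which the single build slot becomes free
    spans = []   # (arrival, start) for each build
    for i, duration in enumerate(durations):
        arrival = i * interval
        start = max(arrival, free_at)
        free_at = start + duration
        spans.append((arrival, start))
    # The waiting count only increases when a build arrives, so it is
    # enough to check each arrival time.
    return max(
        (sum(1 for a, s in spans if a <= arrival < s) for arrival, _ in spans),
        default=0,
    )


if __name__ == "__main__":
    print(max_waiting([30, 30, 30, 30, 30], 30))  # 0
    print(max_waiting([70, 30, 30, 30, 30], 30))  # 2
```

With these made-up numbers, a single 70-minute rebuild leaves two builds waiting, and that backlog does not drain while new builds keep arriving every 30 minutes.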

To fix this problem, this PR adds a new first BK build step. The step looks past the current build and the previous build to the third build back and checks whether it is still in a running state. If it is, we already have a BK build waiting in the queue, so we can safely cancel the current build to prevent the queue from growing. See this diagram for more on build states.
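
The decision made by this step can be sketched roughly as follows. This is a minimal Python illustration, not the script in this PR: the `should_cancel` function and the `get_state` callback are hypothetical, and the sketch assumes build numbers are consecutive and that a build waiting at the concurrency gate still reports a `running` state.

```python
from typing import Callable, Optional


def should_cancel(current_build: int,
                  get_state: Callable[[int], Optional[str]]) -> bool:
    """Return True if the current build should cancel itself.

    get_state is a hypothetical lookup that returns the state of a build
    number (for example "running" or "passed"), or None if unknown.
    """
    # Counting the current build as the first, the "third build back"
    # is two build numbers earlier (assumes consecutive numbering).
    third_back = current_build - 2
    if third_back < 1:
        return False  # not enough earlier builds to congest the pipeline
    # If that build is still running, the previous build is already
    # waiting behind it, so the queue is full.
    return get_state(third_back) == "running"


if __name__ == "__main__":
    states = {1: "running", 2: "running"}
    for number in (1, 2, 3):
        if should_cancel(number, states.get):
            print(number, "The pipeline is congested. Canceling this build.")
        else:
            print(number, "The pipeline is ready for a new build.")
```

Under these assumptions, builds 1 and 2 proceed and build 3 cancels itself, which matches the expected results in the Testing section.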

Because the "Full rebuild or incremental build?" step is concurrency gated, we will only determine the type of build after the previous build has been unblocked. This prevents this step from breaking the functionality added in #2931.

Testing

Testing this PR is fun and easy. Go here and create a new build: https://buildkite.com/elastic/docs-build/builds?branch=move-concurrency. Wait until an agent is assigned. Once an agent is assigned, create another new build. Repeat this process until you have three builds running.

Build 1

The first build should log the following in the Check for queue step:

2024-02-09 13:14:08 PST | Determining if there are multiple builds waiting.
2024-02-09 13:14:08 PST | The pipeline is ready for a new build.

The first build will then run the ✅ Build Docs step.

Build 2

The second build should log the following in the Check for queue step:

2024-02-09 13:16:08 PST | Determining if there are multiple builds waiting.
2024-02-09 13:16:08 PST | The pipeline is ready for a new build.

The second build should say "Waiting on concurrency group" in the Start of concurrency step:
Screenshot 2024-02-09 at 1 50 00 PM

Build 3

The third build should log the following in the Check for queue step:

2024-02-09 22:17:07 UTC | Determining if there are multiple builds waiting.
2024-02-09 22:17:07 UTC | The pipeline is congested. Canceling this build.

The third build will then cancel itself. Example: https://buildkite.com/elastic/docs-build/builds/3638

Screenshot 2024-02-09 at 2 17 35 PM

Build 2 (continued)

Eventually, Build 1 will complete. This will open the concurrency gate and allow Build 2 to enter the ✅ Build Docs step.

github-actions bot commented Feb 9, 2024

A documentation preview will be available soon.

Request a new doc build by commenting
  • Rebuild this PR: run docs-build
  • Rebuild this PR and all Elastic docs: run docs-build rebuild

run docs-build is much faster than run docs-build rebuild. A rebuild should only be needed in rare situations.

If your PR continues to fail for an unknown reason, the doc build pipeline may be broken. Elastic employees can check the pipeline status here.

Contributor

nkammah commented Feb 9, 2024

I could be wrong, but this still looks to be set at the step level, not the pipeline level. I'm not too sure how that's supposed to work, though, so the best thing would be to try it out.

@bmorelli25 bmorelli25 marked this pull request as draft February 9, 2024 21:18
@bmorelli25 bmorelli25 changed the title Update concurrency rules Update concurrency rules and limit build queue to one Feb 9, 2024
@bmorelli25 bmorelli25 marked this pull request as ready for review February 9, 2024 22:18
@bmorelli25 bmorelli25 requested review from colleenmcginnis, a team, scottybollinger and glitteringkatie and removed request for lcawl, leemthompo, KOTungseth and a team February 9, 2024 22:18
Comment on lines +2 to +27
# - input: "Build parameters"
#   if: build.source == "ui"
#   fields:
#     - select: "Rebuild?"
#       key: "REBUILD"
#       default: ""
#       required: false
#       options:
#         - label: "no"
#           value: ""
#         - label: "yes"
#           value: "rebuild"
#       hint: "Should all books be rebuilt, regardless of what has changed? Build once with this set to true after every release."
#     - select: "How should broken links be handled?"
#       key: "BROKEN_LINKS"
#       default: ""
#       required: false
#       options:
#         - label: "Continue without warning"
#           value: "skiplinkcheck"
#         - label: "Continue, but log a warning"
#           value: "warnlinkcheck"
#         - label: "Fail the build"
#           value: ""
#       hint: "Should we ignore checking broken links? Should we allow to run the build without failing if there's a broken link? Ignoring broken links is dangerous not just because bad links will leak into the public site but because subsequent builds and pull requests that do not fix the links fail."
# - wait
Member Author

Note for reviewer: All of this is commented out so we can test this PR. Once a reviewer has tested this PR, I'll uncomment it and we can merge.

Contributor

@nkammah nkammah left a comment

It seems to work as designed, but I have neither tested it nor thought deeply about edge cases.

@colleenmcginnis
Contributor

Testing this PR is fun and easy. Go here and create a new build: https://buildkite.com/elastic/docs-build/builds?branch=move-concurrency. Wait until an agent is assigned. Once an agent is assigned, create another new build. Repeat this process until you have three builds running.

@bmorelli25 when I tested this, both build 2 and build 3 were canceled:

Determining if there are multiple builds waiting.
The pipeline is congested. Canceling this build.

Build 1 started running as expected.
