Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: only run jobs when relevant files have been changed #12006

Merged
merged 25 commits into from
Oct 18, 2023

Conversation

agilgur5
Copy link
Contributor

@agilgur5 agilgur5 commented Oct 14, 2023

Fixes #10156
Fixes #10265

  • Note: these can be further segmented and improved as well, see "Future Work" below

Motivation

Make CI run faster for certain kinds of PRs, use less CI minutes (and just reduce unnecessary compute in general), and limit the blast radius of flaky tests as well

  • e.g. don't run E2E tests if only UI or only Docs have changed
    • or don't run UI CI when UI has not chaged

Modifications

Changed Files checks

Changes to needs for codegen and lint

  • both can run independently of each other and independently of tests

    • they should fail fast if the other one fails though, I suppose
  • without this change some of the checks malfunction, since if tests are skipped, then codegen and lint are skipped too (see needs docs)

Changes to Status Checks

Note that due to lack of GH features such as actions/runner#952 and https://github.com/orgs/community/discussions/9141, there is a workaround embedded here for the E2E test matrix

  • skipping the overall e2e-tests job will cause status checks for individual E2E tests to just hang indefinitely
    • so instead we can change the required status checks to check the new composite result
      • that or we'd have to put an if for every single step of the E2E tests and also check for changed files within each E2E test
  • the new e2e-tests-composite-result job aggregates all of the results as a workaround

Verification

I tested this quite extensively in my fork in agilgur5#1:

Notes to Reviewers

  • See the "Changes to Status Checks" section above that we will need to implement
  • We will also need to be a bit careful in making sure that any new code is properly listed in the files lists

Future Work

  1. Further segment E2E tests -- not all tests necessarily need to run for all back-end changes
  2. Add to other GH Workflows, like the Docs Workflow?
    • would not be a huge efficiency increase, but would limit some unnecessary runs
  3. Add a check for argoexec-image job as well
    • this one may require some more effort to get everything necessary, particularly with repetition from .dockerignore etc

- e.g. don't run E2E tests if only UI or only Docs have changed
  - or don't run UI CI when UI has not chaged

- use [`tj-actions/changed-files`](https://github.com/tj-actions/changed-files) action for this
  - the most popular and full featured one I could find

- run `changed-files` in its own job that must run before all other jobs
  - have it `output` booleans for specific changes -- limit all the `changed_files` nuances, naming, syntax, etc to that one job
    - job `outputs` are also needed for skipping other jobs, as step outputs can't be directly accessed
      - see https://docs.github.com/en/actions/using-jobs/defining-outputs-for-jobs and https://docs.github.com/en/actions/learn-github-actions/contexts#needs-context
- have other jobs specify it in their `needs` and then skip if not needed with `if`

- use multi-output variant of `changed-files` YAML directive
  - so can check e2e vs docs vs UI etc
- use `any_modified` as that includes added, copied, modified, renamed, and deleted (ACMRD)
  - `any_changed` does not include deletions

NOTE: I realized after that `docs` isn't a job, `lint` is, so there's gonna be some follow-up commits
- and well need to test anyway too
- will also include the `all` list _after_ testing, since it would make everything run

Signed-off-by: Anton Gilgur <[email protected]>
- also tiny renames

Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
- list merging is not supported in YAML natively, but `changed-files` appears to support it? https://github.com/tj-actions/changed-files/blob/2a10bef1b42044172f2e64d40beeb8fbad792438/test/changed-files.yml#L8-L11

Signed-off-by: Anton Gilgur <[email protected]>
- both can run independently of each other and independently of tests
  - they should fail fast if the other one fails though, I suppose

- otherwise this breaks some of the checks, since if tests are skipped, then codegen and lint are skipped too
  - or, well, that's my hypothesis at least -- will test by pushing this

Signed-off-by: Anton Gilgur <[email protected]>
- so now only lint should run...

Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
This reverts commit 34671d7.

Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]>
Copy link
Contributor Author

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments for some in-line clarity about some changes

.github/workflows/ci-build.yaml Show resolved Hide resolved
common: &common
- .github/workflows/ci-build.yaml
tests: &tests
- *common
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned in the PR description, tj-actions/changed-files supports a YAML variant that has array anchors & aliases

.github/workflows/ci-build.yaml Show resolved Hide resolved
.github/workflows/ci-build.yaml Show resolved Hide resolved
@isubasinghe
Copy link
Member

otherwise this breaks some of the checks, since if tests are skipped, then codegen and lint are skipped too

Can you elaborate on this?
If tests are skipped the lint is also skipped?
Would that be an issue for docs changes then? Those actually get linted.
Would we be not linting them in that case?

@agilgur5
Copy link
Contributor Author

agilgur5 commented Oct 15, 2023

Can you elaborate on this?

If one or more items of a needs array is skipped, any dependent jobs will also be skipped (see needs docs)
In the previous version of the code, if the tests job was skipped, the lint and codegen jobs would also be skipped. tests never had a conditional before though.

So now that tests does have a conditonal, I had to remove the needs from the lint and codegen jobs so that they run regardless if tests is skipped or not. (they will only be skipped if the relevant files haven't changed).
The 3 are independent jobs anyway, so the previous needs block seemed to be more for fast fail purposes; i.e. don't run lint or codegen if tests fail.
Right now they will run regardless of the other two jobs.

That is, I explicitly chose the safer option as it otherwise does not cover all cases (which I literally noticed while testing the "only run lint" scenario).

Would that be an issue for docs changes then?

In ci-build.yaml, which is the only GH workflow I changed here, the docs lint runs as part of the lint job mentioned above

There is also a separate docs.yaml GH workflow that builds the docs. That one I did not change in this PR; I left as "Future Work" if we want to tackle that. It's simple but duplicate and low value compared to ci-build.yaml. (it would have to be duplicative as GH workflows don't share any information with each other)

@agilgur5
Copy link
Contributor Author

agilgur5 commented Oct 15, 2023

The 3 are independent jobs anyway, so the previous needs block seemed to be more for fast fail purposes; i.e. don't run lint or codegen if tests fail.

Mmm I think there is a way to leave the needs in if I re-work the conditional a bit, per the needs docs. I think if I were to add a !failure() && to the conditional maybe.

GH Actions have some implicit conditionals that are very disorienting IMO. They're basically missing Argo's enhanced depends field for complex dependencies (i.e. GH Actions basically only have dependencies + when, but no enhanced depends).

Not a huge difference either way though with or without the needs.

@isubasinghe
Copy link
Member

Ah thanks @agilgur5, makes sense now, I misunderstood previously, in that case you have fixed an issue withe the CI process really.

Copy link
Member

@isubasinghe isubasinghe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd personally prefer if the Makefile was included, I think the tasks.yaml can probably be avoidede though

@agilgur5
Copy link
Contributor Author

agilgur5 commented Oct 15, 2023

in that case you have fixed an issue withe [sic] the CI process really.

Sort of; it would never occur previously because there wasn't a conditional on tests previously. That was very subtle / error-prone though; was previously very easy to add a conditional and not realize it would cause further skips

.github/workflows/ci-build.yaml Show resolved Hide resolved
.github/workflows/ci-build.yaml Outdated Show resolved Hide resolved
Signed-off-by: Anton Gilgur <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/build Build or GithubAction/CI issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better segmentation of tests Skip unnecessary CI workflows if only have documentary changes
4 participants