use GitHub Actions to cache environments and use dependent CI workflow structure #6867

zacharyburnett · 2022-05-31T16:42:14Z

Notable changes:

workflow jobs are defined in GitHub Actions to simplify the environment caching and construction
all environments are cached to speed up subsequent runs; cache key is set to hash the build files so that if a build requirement changes the environment is built from scratch
matrices are dependent in a workflow such that a commit that fails style checks will not run tests, and so on
matrices test a more comprehensive set of platforms and Python versions
newest CRDS operational context is cached to speed up testing

Checklist

zacharyburnett · 2022-06-07T14:54:19Z

waiting on #6871

…g environments directly with GitHub Actions bottleneck special-install tests behind the first test matrix

zacharyburnett · 2022-10-27T17:07:51Z

One hiccup I recall in setting up CRDS caching to actually work between builds where the default context hadn't change was to make sure that the CRDS cache action is only happening when pytest is actually running. If CRDS caching happens during code style checks or bandit, those don't actually pull any of the reffiles that the tests need, and because those jobs finish first, they produce the cache before the pytest runs do, leading to successful caching of precisely zero CRDS files. Hopefully you can avoid that here.

@jdavies-st Does this apply to running crds sync on the latest context, like the following?

crds sync --contexts jwst_1009.pmap

It appears to be caching 438591 bytes at the moment

zacharyburnett · 2022-10-27T17:08:49Z

I've also added a workflow_dispatch: that allows the unit testing workflow to be run from the web client using a specific CRDS context

jdavies-st · 2022-10-28T08:58:34Z

You definitely don't want to use crds sync, because it will pull all items for that context. And that's a lot of data. 100s of GB.

The way the CRDS caching should work is that workflow A runs the unit tests, pulling in all the CRDS reffiles it needs. At the end of the workflow, the state of the crds_cache directory should be cached, and that should be used for the next workflow.

The trick is to only github actions cache the workflows that actually run pytest on the whole suite of tests, otherwise you will cache nothing, and you will be restoring empty caches. And make sure the cache key is something that changes occasionally but not too often, or it defeats the purpose of the caching. You could have the key be the CRDS context, but I don't think that's a very good proxy for which CRDS reffiles change for the jwst unit tests. I.e. the unit test reffiles are pretty stable, whereas CRDS context can change almost every day. And even if the github actions cache is a bit out-of-date with the current state of the CRDS cache, that's not a problem as it will pull the new files over. You could just hash the reference files in that dir (not the pmaps or rmaps) for a key.

jdavies-st

I don't think this refactor is going to make the current problem of the failure of fetching CRDS reffiles, which then causes the workflows to fail. This PR is not attached to any issue, but I assume that is the motivating factor?

What needs fixing is the github cache action setup - it should only run on jobs that run pytest on the full unit test suite, it should only populate the cache when run on push (merge PR) against master, and its key should be hashed from the contents of $HOME/crds_cache/references, i.e. the files that get pulled from CRDS to run these tests.

Note that PRs are always from forks, so in order for the cache to be used, it needs to be generated from the default branch on spacetelescope/jwst. caches generated from branches on forks won't be used (security reasons).

All the other caching added in this PR on may speed up the build somewhat, but doesn't solve the above fundamental problem. And caching the build state of dependencies might have a bunch of unintended consequences that we don't understand as well. I would abide by the "if it's not broke, don't fix it" adage.

Finally, removing tox is actually a step backwards, as it is the only way that we locally can run tests exactly the same way they're run via Github actions. This is an important middle layer that would be a shame to remove with better motivation. I would highly recommend not removing tox.ini.

jdavies-st · 2022-10-28T09:11:22Z

.github/workflows/ci.yml

+  cache_crds:
+    name: cache current CRDS operational context
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+      - uses: actions/cache@v3
+        with:
+          path: ${{ env.pythonLocation }}
+          key: cache-crds-${{ runner.os }}-${{ env.pythonLocation }}
+      - run: pip install crds
+      - run: pip freeze
+      - run: echo "CRDS_CONTEXT=$(crds list --operational-context)" >> $GITHUB_ENV
+        if: github.event.inputs.crds_context == ''
+      - run: echo "CRDS_CONTEXT=${{ github.event.inputs.crds_context }}" >> $GITHUB_ENV
+        if: github.event.inputs.crds_context != ''
+      - uses: actions/cache@v3
+        if: env.CRDS_CONTEXT != ''
+        with:
+          path: ${{ env.CRDS_PATH }}
+          key: crds-${{ env.CRDS_CONTEXT }}
+      - run: crds sync --contexts ${{ env.CRDS_CONTEXT }}
+        if: env.CRDS_CONTEXT != ''


I don't think you want to do this job. Running crds sync will pull contexts over, but not actually do anything. What you want cached are the CRDS reference files in @HOME/crds_cache that are actually used by the unit tests, a very, very small subset of the ~300GB snapshot of all CRDS reffiles.

jdavies-st · 2022-10-28T09:57:35Z

The Github cache action was removed here

#6165

It should be turned on again, making sure it only runs on jobs that run pytest, and it gets restored based on a hash of $HOME/crds_cache/references. 👍

zacharyburnett · 2022-10-28T14:45:42Z

closing in favor of successor #7323

zacharyburnett self-assigned this May 31, 2022

zacharyburnett force-pushed the cache_workflow_envs branch 5 times, most recently from 14db8c2 to b42892e Compare June 7, 2022 13:09

zacharyburnett marked this pull request as ready for review June 7, 2022 13:42

zacharyburnett changed the title ~~separate and cache workflow jobs~~ use GitHub Actions to cache environments and use dependent CI workflow structure Jun 7, 2022

zacharyburnett force-pushed the cache_workflow_envs branch from 4f48d11 to 2b67cde Compare June 7, 2022 17:52

zacharyburnett marked this pull request as draft June 7, 2022 19:40

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from 768a4e5 to c3b74ad Compare June 9, 2022 13:53

zacharyburnett force-pushed the cache_workflow_envs branch from 15c38ca to 010c71c Compare June 16, 2022 14:28

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from 7813bf1 to 1957c80 Compare July 14, 2022 20:40

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from 4e555d0 to c85ca3f Compare July 19, 2022 14:56

zacharyburnett added testing automation Continuous Integration (CI) and testing automation tools and removed testing automation Continuous Integration (CI) and testing automation tools labels Jul 20, 2022

zacharyburnett force-pushed the cache_workflow_envs branch from bd62f67 to 1aa4219 Compare July 26, 2022 18:32

github-actions bot added automation Continuous Integration (CI) and testing automation tools testing labels Jul 26, 2022

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from 2777fdf to 66a7a16 Compare August 1, 2022 15:55

zacharyburnett force-pushed the cache_workflow_envs branch from dddaddc to 5e6c477 Compare August 16, 2022 19:13

zacharyburnett requested review from jdavies-st and hbushouse August 18, 2022 13:01

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from 5cb04d0 to 701c697 Compare August 29, 2022 13:33

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from c06c87a to 0b6c595 Compare September 19, 2022 13:16

zacharyburnett force-pushed the cache_workflow_envs branch from 4061d01 to 3e73b72 Compare October 10, 2022 19:21

zacharyburnett force-pushed the cache_workflow_envs branch 2 times, most recently from a4708e8 to 023f02d Compare October 27, 2022 16:51

zacharyburnett and others added 13 commits October 27, 2022 12:53

separate and cache workflow jobs, and remove tox in favor of cachin…

80e7699

…g environments directly with GitHub Actions bottleneck special-install tests behind the first test matrix

cache devdeps daily

6c3c9f9

add cython to oldestdeps test

8c56b12

change job dependencies

ec83056

checkout most recent tag in sdpdeps job

2507489

fetch all tags

e60e4bd

cache current CRDS operational context before testing

2023e1e

unfreeze action versions

3c69d6a

test Python 3.11

d56b54d

update actions versions

9b8b831

cache CRDS context in CI

4e603a0

update action version

241b67b

only install crds to cache latest context

3b5fa95

zacharyburnett force-pushed the cache_workflow_envs branch from 023f02d to 3b5fa95 Compare October 27, 2022 16:54

jdavies-st suggested changes Oct 28, 2022

View reviewed changes

zacharyburnett mentioned this pull request Oct 28, 2022

separate tox testenvs to cache CRDS reference files retrieved by initial test #7323

Closed

7 tasks

zacharyburnett closed this Oct 28, 2022

zacharyburnett deleted the cache_workflow_envs branch November 1, 2022 14:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use GitHub Actions to cache environments and use dependent CI workflow structure #6867

use GitHub Actions to cache environments and use dependent CI workflow structure #6867

zacharyburnett commented May 31, 2022 •

edited

Loading

zacharyburnett commented Jun 7, 2022

zacharyburnett commented Oct 27, 2022

zacharyburnett commented Oct 27, 2022

jdavies-st commented Oct 28, 2022

jdavies-st left a comment

jdavies-st Oct 28, 2022 •

edited

Loading

jdavies-st commented Oct 28, 2022

zacharyburnett commented Oct 28, 2022

use GitHub Actions to cache environments and use dependent CI workflow structure #6867

use GitHub Actions to cache environments and use dependent CI workflow structure #6867

Conversation

zacharyburnett commented May 31, 2022 • edited Loading

zacharyburnett commented Jun 7, 2022

zacharyburnett commented Oct 27, 2022

zacharyburnett commented Oct 27, 2022

jdavies-st commented Oct 28, 2022

jdavies-st left a comment

Choose a reason for hiding this comment

jdavies-st Oct 28, 2022 • edited Loading

Choose a reason for hiding this comment

jdavies-st commented Oct 28, 2022

zacharyburnett commented Oct 28, 2022

zacharyburnett commented May 31, 2022 •

edited

Loading

jdavies-st Oct 28, 2022 •

edited

Loading