Merge the CI workflows and stop using `setup-ocaml` #473

shym · 2024-08-30T17:50:49Z

This PR merges the two main CI workflows, namely common.yml and msvc-common.yml, into a single common.yml workflow that uses neither the setup-ocaml action nor OPAM so that:

- the workflow has reduced dependencies and a smaller footprint,
- the compiler configuration can be tuned (to disable features that are not relevant to the multicoretests, such as manpages),
- it gets easier to test any branch of any fork of the compiler,
- it brings the workflow really close to a version that can be pushed to the compiler repository.

The logic of the workflow is largely extracted to a runner.sh script so that:

- it is easier to maintain, as it is a regular script, instead of being embedded into a different syntax, especially with the issue of portability to Windows due to line endings,
- it is easy to reuse that logic for different setups than regular GH Action runners, such as VMs, other infrastructures, etc.

Extra notes:

- the actual test steps are currently not triggered through the runner.sh script, in the unlikely case this could make some difference on Windows to have an sh involved somewhere in the stack,
- the new common-workflow default is to test trunk rather than 5.0.0, which seems a better default.
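
To illustrate what extracting the workflow logic into a regular script can look like, here is a minimal sketch of a step dispatcher. It is not the actual runner.sh from this PR: the function names and the `build`/`test` step names are hypothetical placeholders.

```shell
# Hypothetical sketch, NOT the runner.sh from this PR.
# Each workflow step becomes a plain shell function that can be run
# from a GitHub Actions job, a VM, or a local shell alike.
step_build() { printf 'build step (placeholder)\n'; }
step_test()  { printf 'test step (placeholder)\n'; }

# run_step NAME: dispatch on the step name and reject unknown steps.
run_step() {
  case "$1" in
    build) step_build ;;
    test)  step_test ;;
    *) printf 'unknown step: %s\n' "$1" >&2; return 2 ;;
  esac
}
```

With a final `run_step "$@"` line, a workflow step would only need to invoke something like `sh runner.sh build`, keeping the YAML free of embedded shell logic.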

Unfortunately, most of this PR is a single large commit that rewrites the common workflows and pulls out the runner.sh script. I didn’t find good intermediate steps for this rewrite.

I’ve tested this PR by checking that the output of ocamlc -config is indeed the expected one in all runs. I hope I didn’t miss something on the way.
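
A check of this kind could be scripted roughly as follows. This is only a sketch: the helper names `config_value` and `expect_config` are hypothetical, and the expected values in the usage note are illustrative, not the ones verified for this PR.

```shell
# config_value KEY: read `ocamlc -config`-style "key: value" lines on
# stdin and print the value recorded for KEY (hypothetical helper).
config_value() {
  sed -n "s/^$1: //p"
}

# expect_config KEY EXPECTED: fail with a message on stderr when the
# value for KEY read from stdin differs from EXPECTED.
expect_config() {
  actual=$(config_value "$1")
  if [ "$actual" != "$2" ]; then
    printf 'config mismatch for %s: expected %s, got %s\n' \
      "$1" "$2" "$actual" >&2
    return 1
  fi
}
```

For instance, `ocamlc -config | expect_config flambda false` would fail the step if the compiler had unexpectedly been configured with flambda.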

Open question: I wondered whether we should use fixed versions of QCheck and dune rather than the latest commit. If I remember correctly, the MSVC workflows currently use the latest commits because a couple of fixes were required from both projects to get the build working. Now that those fixes have been released (at least in dune?), we could go back to the latest released versions.

jmid · 2024-09-02T16:34:15Z

This is very nice, thanks a ton! 🙏 😃

Uniformity is appreciated in itself, but the lack of it currently means that Linux, MinGW+Cygwin, and MSVC need three different changes to test a feature branch, and doing so for Cygwin is not possible without custom opam-files.

On top of that, this PR eliminates the need for the (otherwise nice) shym/custom-opam-repository, which means one less step when moving to testing, e.g., 5.3.0~alpha1 when it arrives.

Here are my initial impressions:

To try this out on a concrete use case, I created a branch based on it targeting 'MisterDA/ocaml' and the branch winpthreadsectomy by changing the settings as follows:

      COMPILER_REPO:            'MisterDA/ocaml'
      COMPILER_REF:             'refs/heads/winpthreadsectomy'

https://github.com/ocaml-multicore/multicoretests/tree/shym-new-ci-common-winpthreadsectomy
I then realized that the above env-variables are not used for that, but that checkout uses the inputs.-variables instead:

      - name: Fetch OCaml
        uses: actions/checkout@v4
        with:
          repository: ${{ inputs.compiler_repository }}
          ref: ${{ inputs.compiler_ref }}

One could of course target a feature branch by passing the appropriate repo and ref in each individual workflow, but it may be simpler to do so by 'overriding' an inputs-setting across the board as we can do now? 🤔
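
One way to picture the "across the board" override is a fallback chain: a global setting wins if present, otherwise the per-workflow input, otherwise a default. The sketch below only illustrates that precedence. The helper name `resolve_setting` and the `INPUT_COMPILER_REPOSITORY` variable are hypothetical, and this is not necessarily how the workflow resolves its settings.

```shell
# resolve_setting CANDIDATE...: print the first non-empty candidate.
# Hypothetical helper illustrating override precedence only.
resolve_setting() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

A script could then compute `repo=$(resolve_setting "${COMPILER_REPO:-}" "${INPUT_COMPILER_REPOSITORY:-}" 'ocaml/ocaml')`, so that setting COMPILER_REPO once redirects every workflow to the fork under test.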

It would be good to update the current README instructions to reflect the new changes required.

By chance I spotted that the Cygwin workflow is not running any multicoretests! (I realized because it completed before all other workflows and that has probably never happened before... 😅 )
It seems that the last two steps are invoked, but somehow the Cygwin shell is not waiting for the actual test execution?
https://github.com/ocaml-multicore/multicoretests/actions/runs/10637017068/job/29490050468?pr=473

shym · 2024-09-02T17:34:32Z

I then realized that the above env-variables are not used for that, but that checkout uses the inputs.-variables instead:

Oh! Sorry about that! You’re right, fixed in e9fa0da (which I pushed to your branch, to trigger the intended tests), to squash into this PR, when I figure out why:

By chance I spotted that the Cygwin workflow is not running any multicoretests!

?!?
But thank you for spotting that!

shym · 2024-09-03T11:33:43Z

By chance I spotted that the Cygwin workflow is not running any multicoretests!

I still don’t really understand why this is happening. In particular, I tried to launch D:\ocaml\bin\dune (so with the explicit full path, even if the directory is in PATH) from powershell and it doesn’t show anything (not even a failure).
So I just ended up dropping my initial desire to trigger the tests from the default shell and now trigger them through the runner script.

Another outcome of investigating this is that I realised that caching was not working on Cygwin (maybe on all Windows setups). This is hopefully fixed now.

shym · 2024-09-03T15:17:13Z

I just updated the branch because it seems the terrible trick with a pristine PATH is no longer necessary (it served to work around a bug where some binaries from Cygwin hid the versions provided by the image when running the cache action; maybe we are no longer installing those binaries here, so it gets much simpler). The update also fixes a bug: MSVC builds always need at least one script from the OCaml sources, so we now always fetch them in those runs.

jmid · 2024-09-05T08:29:10Z

CI summary for 41d79d5

- 32bit 5.0.0 crashed on the parallel STM Buffer stress test: [ocaml5-issue] Segfault in Buffer test (5.0, 32bit) #306
- MinGW bytecode 5.1 timed out in Mash up of threads and domains: [ocaml5-issue] Windows failures on threadomain #203 (our old friend)
- linux-s390x-5.2 timed out in STM BigArray test parallel after a slow run on s390x-worker-01.marist.ci.dev: s390x timeouts on s390x-worker-01 #421

All of these are known and hence unrelated to the PR's changes.
Overall, of the 63 workflows, 3 failed with 2 genuine issues and 1 CI issue.

shym · 2024-09-05T09:14:02Z

As it seems to be all good now, I’ve dropped the DROPME commit.
By the way this makes me wonder whether it’s time to drop testing completely on older releases, at least 5.0.0.

jmid

I've now been over this.
This is a very carefully prepared change, that I really appreciate and that I'd like to see merged ASAP. Kudos for writing such a readable shell script!

I've made a few notes inline as I read this. The only actual issue is what I believe is a missed option.

A few high-level remarks:

- Should we go for installing named versions of dune and qcheck (e.g., 3.16 and 0.22)? That way accidental changes on either of their main branches won't affect a test suite outcome. The flip side is of course that these will need an occasional, manual version bump.
- As for the workflow defaults, I'd rather just see an explicit compiler_ref mentioning trunk in the 5.4-trunk workflows (but that is a matter of taste).

.github/runner.sh

jmid · 2024-09-05T08:50:28Z

.github/runner.sh

+  case "$OCAML_PLATFORM,$OCAML_OPTIONS" in
+    msvc,*32bit*)
+      eval $(tools/msvs-promote-path)
+      printf 'Running: %s\n' "./configure --host=i686-pc-windows $opts"


Perhaps this is a combination (MSVC+32-bit) that is worth adding? (probably as a separate PR)
I deeply appreciate you keeping track of and supporting this as yet untested combination! 🙏

It’s a combination that’s tested in ocaml/ocaml CI; it helped iron out quite a few 64bit-isms while restoring MSVC 😅 I agree that it’d better be a separate PR.
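
The snippet under discussion dispatches on the pair "platform,options" with a single `case` pattern. The following sketch isolates that matching technique. The function name `host_flag` is hypothetical, and only the MSVC+32-bit branch is taken from the quoted diff; other combinations are deliberately left out.

```shell
# host_flag PLATFORM OPTIONS: print the --host flag for the one
# combination shown in the quoted diff, and nothing otherwise.
# Hypothetical helper; the real script handles more cases.
host_flag() {
  case "$1,$2" in
    # Joining both values with a comma lets one glob pattern
    # constrain the platform and the options at the same time.
    msvc,*32bit*) printf '%s\n' '--host=i686-pc-windows' ;;
    *) : ;;
  esac
}
```

Here `host_flag msvc 'bytecode 32bit'` prints the i686 flag, while `host_flag mingw 32bit` prints nothing because the platform part of the pattern does not match.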

.github/workflows/mingw-530-trunk-bytecode.yml

shym · 2024-09-05T10:43:58Z

I’ve just pushed all the fixes you suggested to a new new-ci-common-2 branch (the branch might make the diff nicer to review). I’ll let the CI round go through and then rebase them into the PR branch.

jmid · 2024-09-05T11:14:55Z

The patches on that branch look good to me!

Before merging I just remembered this comment from above:

It would be good to update the current README instructions to reflect the new changes required.

I was thinking of the last half of https://github.com/ocaml-multicore/multicoretests/blob/main/README.md#running-the-test-suite which could be changed to something like the following (feel free to reword/change as you see fit):

It is also possible to run the test suite in the CI, by altering
[.github/workflows/common.yml](.github/workflows/common.yml) to target
a particular `ocaml/ocaml` compiler PR:

  COMPILER_REF:   'refs/pull/12345/head'


a particular compiler tag:

  COMPILER_REF:   'refs/heads/some-compiler-tag'


or a particular branch:

  COMPILER_REPO:  'SomeUserId/ocaml'
  COMPILER_REF:   'refs/heads/my-feature-branch'

shym · 2024-09-05T11:41:14Z

Before merging I just remembered this comment from above:

I also remembered it and I had pushed a commit to that effect on the new branch 😄
Your suggestion makes me wonder whether we should add an explicit example targeting a tag, with refs/tags/mytag.

By the way, I went with the fully qualified name (refs/heads/trunk) rather than the short name so that git ls-remote will always give a single answer, even in the awkward case where the short name is ambiguous.
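
The ambiguity mentioned here can be observed by counting how many refs `git ls-remote` reports for a given pattern. The helper below is a small sketch; its name `count_matching_refs` is hypothetical.

```shell
# count_matching_refs REPO PATTERN: print how many refs of REPO
# `git ls-remote` lists for PATTERN (hypothetical helper).
count_matching_refs() {
  git ls-remote "$1" "$2" | wc -l | tr -d ' '
}
```

In a repository that has both a branch and a tag with the same short name, the short name matches both refs/heads/ and refs/tags/ entries, whereas a fully qualified pattern such as refs/heads/trunk selects only the branch.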

Merge the two CI main workflows, namely `common.yml` and `msvc-common.yml` into a single `common.yml` workflow that uses neither the `setup-ocaml` action nor `OPAM` so that:

- the workflow has reduced dependencies and a smaller footprint,
- the compiler configuration can be tuned (to disable features that are not relevant to the multicoretests, such as manpages),
- it gets easier to test any branch of any fork of the compiler,
- it brings the workflow really close to a version that can be pushed to the compiler repository.

The logic of the workflow is largely extracted to a `runner.sh` script so that:

- it is easier to maintain, as it is a regular script, instead of being embedded into a different syntax, especially with the issue of portability to Windows due to line endings,
- it is easy to reuse that logic for different setups than regular GH Action runners, such as VMs, other infrastructures, etc.

Extra notes:

- the actual test steps are currently not triggered through the `runner.sh` script, in the unlikely case this could make some difference on Windows to have an `sh` involved somewhere in the stack,
- the new common-workflow default is to test `trunk` rather than 5.0.0, which seems a better default.

shym · 2024-09-05T12:04:54Z

CI round all green, I’ve force-pushed the new version.

jmid · 2024-09-05T16:07:20Z

CI summary for 0a058f0: all 45 workflows succeeded!

jmid · 2024-09-08T18:19:43Z

CI summary for merge to main: all 46 workflows completed successfully 🎉

Ensure *.sh files are checked out using LF

a00db7e

shym force-pushed the new-ci-common branch from 84af4d1 to ddba01f Compare August 30, 2024 17:50

shym force-pushed the new-ci-common branch from ddba01f to 2181df9 Compare September 3, 2024 11:26

shym force-pushed the new-ci-common branch from 2181df9 to 41d79d5 Compare September 3, 2024 15:13

MisterDA mentioned this pull request Sep 4, 2024

Use WinAPI concurrency primitives on Windows ports (remove winpthreads) ocaml/ocaml#13416

Open

shym force-pushed the new-ci-common branch from 41d79d5 to 6364b54 Compare September 5, 2024 09:11

jmid approved these changes Sep 5, 2024

shym added 5 commits September 5, 2024 14:02

Translate the workflow arguments

13a1260

Remove the local OPAM repository

e6f1868

Fix up the QCheck and dune versions used in CI

b221c1a

Update instructions to test a particular PR or fork

0a058f0

shym force-pushed the new-ci-common branch from 6364b54 to 0a058f0 Compare September 5, 2024 12:04

jmid merged commit 5b33b02 into ocaml-multicore:main Sep 5, 2024
42 checks passed

shym deleted the new-ci-common branch September 5, 2024 16:08

jmid mentioned this pull request Sep 9, 2024

Experiment: Add MSVC 32bit workflows #476

Open
