Update/Add binfmt and qemu #2095
@umarcor, thank you for providing useful links. @igagis, thank you for your proposal. Thank you for understanding.
@maxim-lobanov I understand that you want to keep the image as minimal as possible, but consider the following arguments:
@igagis thank you for providing additional information!
@igagis, the important point to understand is that installing `qemu-user-static` and `binfmt-support` through apt is not a different solution. Precisely, dbhi/qus uses exactly the same binaries (from Debian repos) and a customised script (from QEMU). Therefore, the dbhi/qus Action is an optimised subset of that approach. In fact, the main advantage of dbhi/qus is that it allows registering the interpreter for a single target architecture, instead of installing and registering all of them. Furthermore, since statically built interpreters are portable, the download is limited to 2MB per target architecture. The purpose of dbhi/qus is to showcase over a dozen different approaches to using QEMU static: https://dbhi.github.io/qus/#tests. As you can see, a quarter of the examples do use the packages from the distribution.
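For reference, this is the kind of one-liner dbhi/qus is built around (a sketch based on the dbhi/qus docs; the `aptman/qus` image name and the `-p` flag for selecting a single target architecture come from that project):

```sh
# Register a statically linked interpreter for a single foreign architecture
# (aarch64 here), instead of installing and registering all of them.
docker run --rm --privileged aptman/qus -s -- -p aarch64
```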
This is not exclusive to GHA. On the one hand, it might be easily fixed if the packages installed the binaries without registering all the interpreters. On the other hand, registering the interpreters by default might prevent usage of the persistent mode.
Any non-trivial workflow will require running `docker run` for every step.
I would say that you can start a container at the beginning and then `docker exec` each step, similar to how you would use "service containers" (see the sketch below). Nonetheless, I suggest using a custom test script and a single `docker run` step.
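A minimal sketch of that exec-per-step pattern (the image name and the commands are placeholders):

```sh
# Start a long-lived container once, at the beginning of the job
docker run -d --name job -v "$PWD":/src -w /src debian:buster tail -f /dev/null

# Then run each workflow step inside it
docker exec job apt-get update
docker exec job make

# Tear it down at the end of the job
docker rm -f job
```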
Although I agree that the number of people willing to build tools for non-amd64 is increasing, IMHO it is still negligible in CI. Don't get me wrong: I think it is very important, and that's why I maintain dbhi/qus (and use it in dbhi/docker). However, I would say it is not crucial at all. Moreover, there is a 5-10x overhead when using QEMU. That's why the vast majority of builds for non-amd64 architectures are generated through cross-compilation. This is not exclusive to GHA. Almost all the Linux distributions build packages for all the supported architectures on amd64. Overall, I think that points 1 and 3 should be addressed. That would provide a better user experience not only for this use case, but for many others involving non-trivial and non-web-focused usage of containers.
@umarcor I've just had a thought: is it possible to run the qus container as a "service"?
@igagis, that's an interesting idea! I guess the only issue might be passing the required options to the service container.
@umarcor according to https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idoptions it is possible to pass options to service containers. So, what should I try? Would something like the following be the right thing?
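Presumably something along these lines (a sketch; the `aptman/qus` image is taken from the dbhi/qus project, and `options` is passed by the runner to `docker create`):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    services:
      qus:
        image: aptman/qus
        # Registering binfmt interpreters requires a privileged container
        options: --privileged
```

Note that `services.<id>` has no key for custom arguments, so the image's default entrypoint/command would have to perform the registration.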
I took the command line for the qus container from its README.
@igagis, "service" is normally used as a synonym of "daemon container". Therefore, we need to take two constraints into account:

- a service is expected to keep running while the job uses it, whereas qus does its work in a few seconds and then exits;
- the order in which the runner starts the main container and the service containers.
I will do some tests and I'll let you know.
Also, see the example for the redis service: https://docs.github.com/en/free-pro-team@latest/actions/guides/creating-redis-service-containers#configuring-the-container-job
@igagis, have a look at https://github.com/dbhi/qus/actions/runs/375520958 and dbhi/qus@56c0107.
Hence, using a service might be feasible if the startup order was changed from:

1. create/start the main (job) container,
2. create/start the service containers;

to:

1. create/start the service containers,
2. create/start the main (job) container.
The execution of qus takes a few seconds. It takes longer for GitHub to move from 'Starting service container' to 'Waiting for all services to be ready' than it takes for qus to execute. That's why https://github.com/dbhi/qus/runs/1434087443?check_suite_focus=true#step:2:78 failed. Therefore, although a corner case might need a health cmd, I don't think that's the main problem. Starting the containers in a different order is the actual fix needed here.
Did you contact the runner developers about the possibility of changing that container/services startup order? You mentioned that it is done deliberately, by some design decision, right? Or should we submit a feature request to them to change that order? To me it makes no sense to start the job container before the services are ready, but maybe I'm missing something...
I did not. In fact, I had not investigated this until I did the tests above.
I say it's a design decision because someone did need to decide the order. Submitting a feature request sounds sensible.
It seems that it's just because it's the first one added to the array: https://github.com/actions/runner/blob/c18c8746db0b7662a13da5596412c05c1ffb07dd/src/Runner.Worker/ContainerOperationProvider.cs#L143-L147. So, all the service containers are started one after the other, and the main container is considered the first service.
Thanks! I'll submit the feature request then.
I submitted actions/runner#816. Still, it looks like the best solution would be to just pre-install the packages.

Why do I need all that functionality? Well, I'm in the process of setting up a universal workflow for all my repositories. After I switch to GitHub Actions CI for all of my repos, I want to minimize the modifications I will have to do to each of the repos in the future: when this and that issue is resolved (revert workarounds), or when I drop in self-hosted runners to be used along with public ones (yes, I considered using self-hosted runners only for ARM).

I'm not sure that all those listed issues will be resolved in the near future. On the other hand, just pre-installing the packages would cover my use case right away.
Hello everyone, just an update: the qemu-user-static and binfmt-support packages are going to be pre-installed on the Ubuntu images.
@maxim-lobanov, I believe that those packages do register the interpreters. Therefore, users who expect those not to be registered will find issues when registering theirs. They'll need to remove the existing interpreter first. I suggest considering a rework of the container provisioning instead, since that will enable other use cases too.
@umarcor, oh, I didn't catch that it would be a breaking change for those customers.
@umarcor Why would someone expect the interpreters not to be installed? What is the example use case?
@igagis, any user who is currently using GHA for building/testing foreign applications/containers. Especially, the users that are loading the interpreters in persistent mode, in order to run foreign containers without contaminating them. I believe that the default installation would not load them in persistent mode.
@umarcor I'm not sure I fully understand that. What is the "persistent mode" of the interpreters? Though I still don't understand why someone would expect interpreters not to be registered in their build (and why would someone care at all about installed interpreters in case they don't use them, i.e. run only native amd64 containers?). In the worst case, it is possible to remove the packages before doing the rest of the stuff. While in case someone wants to use the interpreters, they will already be there.
Please, see dbhi.github.io/qus. The last paragraph of section Context references https://kernelnewbies.org/Linux_4.8?highlight=%28binfmt%29.
In a typical binfmt/qemu-user setup, whenever a foreign instruction/app needs to be interpreted, the interpreter needs to be executed. That is done implicitly by the kernel, but the interpreter needs to be available in the environment where the app is being executed. Hence, by default, binfmt will tell the foreign app inside the container to use an interpreter which does not exist in that environment. The path that the kernel passes corresponds to a path on the host, which was set when the interpreter was registered. Before the persistent mode was added, users did need to make qemu-user binaries available inside the containers. That could be done by copying them, or by binding a path from the host. That's what I mean by "contaminating the container". Copying the binaries and removing them is a requirement for building foreign containers, unless persistent mode or docker experimental features are used. See the table in dbhi.github.io/qus/#tests; it shows which tests rely on the persistent mode. The great advantage of loading interpreters persistently is that you don't need to care about exposing the binaries inside the containers. You run a single setup command and then you can use and build as many foreign containers/apps as you want, without tweaking each of them.
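For context, "persistent" corresponds to the fix-binary (`F`) flag of binfmt_misc (added in Linux 4.8, per the kernelnewbies link above): the kernel opens the interpreter at registration time instead of at invocation time, so the binary no longer needs to exist inside each container. A registration can be inspected on any Linux host (the `qemu-aarch64` entry name assumes a typical qemu-user-static setup):

```sh
# Each registered format is exposed as a file under binfmt_misc
cat /proc/sys/fs/binfmt_misc/qemu-aarch64
# enabled
# interpreter /usr/bin/qemu-aarch64-static
# flags: F        <- 'F' means fix-binary, i.e. persistent
# ...
```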
I would want to remove the non-persistently loaded interpreters and load them persistently instead. That is, I want to keep using dbhi/qus. I would need to change the procedure, because registering interpreters twice can fail. Is it a great problem if I need to remove them first? It's not. Is it worth the hassle? I believe it's not, because having qemu-user-static and binfmt-support installed by default would still not solve your use case.
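A sketch of what that removal would look like (the entry names depend on which interpreters the packages registered; `update-binfmts` is provided by binfmt-support, and the `aptman/qus` one-liner comes from the dbhi/qus docs):

```sh
# Either disable a format through binfmt-support...
sudo update-binfmts --disable qemu-aarch64

# ...or unregister it directly through the kernel interface
echo -1 | sudo tee /proc/sys/fs/binfmt_misc/qemu-aarch64

# Then register it again, persistently this time
docker run --rm --privileged aptman/qus -s -- -p aarch64
```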
Currently, all those who care are using qemu-user in GitHub Actions with a single one-liner. Therefore, this is not a stopper by any means. Instead of advocating for a not-so-easy patch that fixes your specific style preference, I believe it is better to address a fundamental enhancement that would allow not only your preferred style, but also other use cases which are currently not possible. The most obvious one is building an image in one step and using it in the next one (actions/runner#814). Don't get me wrong, I understand your frustration with GHA workflow syntax and how limiting it is. However, I believe that adding tools needs to be done carefully, and someone needs to look after them. @maxim-lobanov explicitly said that they were going to add them because they need no maintenance, which is not the case, as I have just explained.
Well, in that case, we need to pre-install the interpreters in persistent mode then, to ensure the widest compatibility.
But then, images built in such a way will always require interpreters in persistent mode when they are used. Doesn't that limit the usability of such images? Isn't it better to include statically linked qemu binaries in the images themselves?
Right, and that would be just the one simple step I wrote above. In contrast, consider the following workflow:
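A sketch of such a workflow, assuming a container job per matrix entry (the matrix matches the one in the rewrites further below; the install/make steps are placeholders):

```yaml
name: ci

on: [push, pull_request]

jobs:
  linux:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - { arch: amd64, image: 'debian:buster' }
          - { arch: i386,  image: 'i386/debian:buster' }    # i386 arch image
          - { arch: amd64, image: 'ubuntu:eoan' }
          - { arch: arm,   image: 'igagis/raspbian:buster' } # ARM arch image
    container: ${{ matrix.image }}
    steps:
      - uses: actions/checkout@main
      - run: apt install --assume-yes devscripts equivs
      - run: make
```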
As you can see, the build matrix lists 4 images to build on, some of which are foreign-architecture images.
See my example workflow above; it's not just a one-liner. And the workflow I used as an example seems pretty common to me.
I agree that it is always better to solve the problem in the right way, but in the real world it does not look like it will be solved in the foreseeable future, considering the number of issues in the runner repository.
Why do you think so? As I said, this approach worked for me on Travis-CI. Yes, in Travis maybe something else was set up in addition to installing the packages.
Now you are starting to grasp the complexity of this topic. Who is "we"? Someone needs to be willing to understand, implement and maintain this feature. Willingness is not enough: good knowledge of the virtual environment(s), the runner, self-hosted runners, etc. is required. The response from GitHub so far is that they are not putting any additional effort into it. I don't have the bandwidth for testing modifications to the virtual environments, and I don't have a testing environment where to deploy them. Are you up to the challenge? Note the three verbs, though: understand, implement and maintain.
Registering non-statically linked interpreters is a no-go. Binaries are loaded in memory and passed by the kernel to the container, but executed inside the container. Therefore, if you use dynamically linked binaries, you need to make all the dependencies (shared libs) available inside the container. The only exception is when the container is the same OS and arch as the host; yet, in that use case you don't need QEMU at all. Your guess is not wrong: something needs to be done after installing qemu-user-static and binfmt. But that something is not installing additional packages. Please, read dbhi.github.io/qus carefully and follow the references provided there. As I said above, the main purpose of dbhi/qus is didactic. You don't need to use dbhi/qus if you don't want to, but you should learn how it works if you want to implement any alternative solution. Very precisely:
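The difference is easy to see on a Debian/Ubuntu host with both variants installed (the paths assume the qemu-user and qemu-user-static packages):

```sh
# Dynamically linked: needs glibc, glib, etc. available inside the container
ldd /usr/bin/qemu-aarch64

# Statically linked: self-contained, so the kernel can hand it to any container
ldd /usr/bin/qemu-aarch64-static
#   not a dynamic executable
```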
Not at all! Containers are self-contained user-space environments that rely on an "external" kernel which can pass the instructions to a valid interpreter. Interpreters can be software, which is what we normally refer to when talking about QEMU user mode. However, interpreters can also be hardware, and that's what we call a CPU. For instance, arm32v7 containers can be executed on armv7 or armv8 devices without any software interpreter, because those devices understand the instruction sets natively. By the same token, i386 containers can be executed on amd64 hosts without software interpreters. Of course, there are aarch64-only and amd64-only devices too (especially in datacenters). Therefore, containers should be agnostic to the usage of QEMU. That belongs to the host, as a workaround for the inability of a given CPU to understand some specific instruction set.
That's a very short-sighted approach. You are assuming that you will always want to use foreign containers on amd64 hosts. There is a very straightforward counter-example: users building container images for Raspberry Pi, Rock960, Pine64, and other SBCs on GitHub Actions. For instance: build the arm images on an amd64 runner with QEMU, push them to a registry, and then pull and run them natively on the boards, where no interpreter is needed and a qemu binary baked into the image would be useless weight.
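As a sketch (the image names and Dockerfile are hypothetical):

```sh
# On the amd64 GitHub Actions runner, with interpreters registered persistently:
docker build -t example/app:arm64v8 -f Dockerfile.arm64v8 .
docker push example/app:arm64v8

# Later, on the Raspberry Pi / Pine64 itself, no QEMU is involved:
docker run --rm example/app:arm64v8
```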
The procedure above is how dbhi/docker works. See a GHA run and the resulting images/manifests. On the other hand, apart from amd64 as a host, aarch64, ppc64 and s390x are also widely used for running their "native" or amd64/arm containers. Hence, the corresponding interpreters for all those combinations might potentially need to be put into the images. Nevertheless, the procedure you are suggesting is also supported in dbhi/qus. Container images and releases provide statically compiled binaries (precisely, extracted from Debian packages), which you can add to the container through volumes. That is, you can use containers with qemu-user without persistent mode and without adding binaries to the images permanently. Naturally, for docker build you need experimental features to avoid contamination, but it's possible too.
As I said before, I agree that it's not a big problem and I could easily work around that. As I also said, my main concern is that I don't see a feasible proposal on the table yet. So, I'm not opposing to installing qemu through apt.
As I also said before, I understand why you like that style. Still, there is nothing preventing you from using GHA and achieving your targets. Therefore, this is not critical, but a matter of taste. Conversely, there are other use cases which are currently not possible, and which would benefit from looking at this issue with a wider perspective, instead of a short-term look. This is just one possible alternative style:

```yaml
name: ci

on: [push, pull_request]

jobs:
  linux:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - { arch: amd64, image: 'debian:buster' }
          - { arch: i386,  image: 'i386/debian:buster' }    # i386 arch image
          - { arch: amd64, image: 'ubuntu:eoan' }
          - { arch: arm,   image: 'igagis/raspbian:buster' } # ARM arch image
    steps:
      - uses: actions/checkout@main
      - run: register_interpreter_for ${{ matrix.arch }}
        if: matrix.arch != 'amd64'
      - run: |
          docker run --rm \
            -v $(pwd):/src -w /src \
            -e IS_TAGGED="${{ startsWith(github.ref, 'refs/tags/') }}" \
            ${{ matrix.image }} \
            ./.github/test.sh
```

with `.github/test.sh` being:

```sh
#!/usr/bin/env sh
apt install --assume-yes devscripts equivs bla-bla-bla
make
[ "$IS_TAGGED" = 'true' ] && deploy.sh || echo "Skip deploy"
```

The advantage of this style is that it is portable. That is, anyone can run the test script either on their host or inside a container or in some other CI service (by copying the command in the workflow).
Using a docker:// container step instead, it might look like this:

```yaml
name: ci

on: [push, pull_request]

jobs:
  linux:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - { arch: amd64, image: 'debian:buster' }
          - { arch: i386,  image: 'i386/debian:buster' }    # i386 arch image
          - { arch: amd64, image: 'ubuntu:eoan' }
          - { arch: arm,   image: 'igagis/raspbian:buster' } # ARM arch image
    steps:
      - uses: actions/checkout@main
      - run: register_interpreter_for ${{ matrix.arch }}
        if: matrix.arch != 'amd64'
      - uses: docker://${{ matrix.image }}
        with:
          args: ./.github/test.sh
        env:
          IS_TAGGED: ${{ startsWith(github.ref, 'refs/tags/') }}
```

Unfortunately, this would not work, because the runner will try to start the docker step at the beginning. That's what actions/runner#814 is about. And that's something that would allow you to have a clean enough workflow, while enabling other use cases which are currently not possible.
See #2095 (comment) for a detailed analysis of those specific problems. See also #2095 (comment). I think that reordering the service initialisation is the easiest solution to this problem.
Your example workflow above is not a valid reference, because it is a synthetic representation of how you envision that GHA workflows should work. For that same reason, that style is very far from being common, because it never worked. Conversely, the pattern I find most is the "portable" style I showed above. That's probably because many projects were migrated from Travis and other services where "actions" as JavaScript/Docker modules did not exist.

NOTE: actions/checkout behaves differently depending on the git version. Be careful when using it inside a container. That is, when using a container job instead of container step(s).

Anyway, once again, don't get me wrong: I agree with your ideal style preference. I'd love it if that worked and we could have foreign container support off the shelf. Going back to the beginning, my concern is who has the resources and willingness for implementing and maintaining the feature in GHA.

NOTE: In fact, all Windows 10 users of Docker Desktop have had persistent qemu-user support for ARM by default for a couple of years now, and I really like that. However, they don't use these Debian packages for it.
Honestly, in the real world we will revisit this in some weeks, and months, and maybe years. Human resources for all actions repos are similar, if not the same.
Should you achieve pre-installing the interpreters in persistent mode, I believe that'd be acceptable. Otherwise, it should be very clearly documented which one-liner users need to run, since this breaks backwards behaviour.
I assumed that you didn't want to contaminate the containers, since that's what most users pursue. If you are ok with being forced to put the binary in the same location that GHA does (and to use that same location on any future host), then installing those packages alone might work.
Hi there, I am closing this issue. Please, feel free to create a new one in case of any other questions. Thanks.
Tool information
Area for Triage:
C/C++
Question, Bug, or Feature?:
Feature
Virtual environments affected
Can this tool be installed during the build?
```sh
sudo apt install --assume-yes binfmt-support qemu-user-static
```
Tool installation time in runtime
5-10 seconds
Are you willing to submit a PR?
Possibly
Why is this needed?
This will allow running jobs inside of docker images built for the ARM architecture, i.e. when all the binaries inside of the image are for ARM.