ARM support #264

Closed
davidbarratt opened this issue Aug 6, 2021 · 48 comments
Labels
enhancement New feature or request

Comments

@davidbarratt

Use-cases

I would like to run Pelias on a Raspberry Pi, but that would require ARM support. It doesn't look like all of the docker images support arm64. :/

Attempted Solutions

Proposal

Add ARM builds for all Docker images. :)

References

@davidbarratt davidbarratt added the enhancement New feature or request label Aug 6, 2021
@dimaryaz

Just finished setting up an AWS EC2 t4g instance - one of their latest and cheapest instance types - only to run into the same problem.

@missinglink
Member

missinglink commented Oct 26, 2021

We'd be happy to accept a PR for this. I don't have an ARM Mac, so unfortunately I can't test it, but we should be able to set it up in the CI env.

All Pelias Docker images extend from the baseimage, so once that is either configured for multiarch or a second baseimage is generated for ARM, we can configure the CI to build them and begin testing the application components.

I suspect that major software vendors such as nodejs/elasticsearch with large teams will already support ARM but things like libpostal might require some work.

Tim Cook himself said it would take some time for ARM to be supported everywhere so we'll need some help from the Pelias community to get it working and tested!

@orangejulius
Member

orangejulius commented Nov 2, 2021

Hey folks,

As a Macbook M1 owner, I wouldn't mind spending a little bit of my personal time making Pelias work on ARM :)

There's going to be a bunch of aspects to this work, and I'll try to list a general overview here. A big thanks to @jlowe000 for his work apparently getting all of Pelias to work on ARM, and his accompanying blog post https://redthunder.blog/2021/07/04/daysofarm-12-of-x/. Definitely points us in the right direction.

Elasticsearch version

As mentioned in #263, we'll want to upgrade the default Elasticsearch image to 7.10.2 or newer to get ARM compatibility. It sounds like this alone might be enough to get Pelias to at least work on an M1 Mac. Lots of things will still use Rosetta 2 emulation and be slow and battery draining, but it's a start.

Go binaries and other hardcoding of amd64

We hardcode amd64 download links for various dependencies across the project, for example the Polylines importer Dockerfile:

https://github.com/pelias/polylines/blob/cb0b382af7c5bd8cfe3b607c6b35f6f7417b24bc/Dockerfile#L8

I'd love to hear what best practices there are for this. I think buildx (more on that below) will provide an ARCH environment variable, can we use that?

pbf2json

We'll either need to add ARM binaries to anything that uses pbf2json (interpolation, openstreetmap), or start compiling from source during Docker builds on ARM.

Docker buildx

If we want Docker images to be built for the project with arm support by default then we might want to look at Docker multi-arch builds, presumably using buildx. I haven't had a ton of luck with this yet, but I'm sure it can be done.

Let me know if I'm missing anything!

@jlowe000

jlowe000 commented Nov 3, 2021

Hi @orangejulius, I've been able to get a "version" of this up and going.

I wouldn't say it's been fully tested but it's been very stable for the queries and workloads that I've been working with. The repos are checked in and forked from the pelias versions. The last time that I did a fetch was working on the issue with the interpolation.

A couple of things that I had to update.

  • libpostal needed to be compiled from src.
  • some changes to the nodejs based upon the isaacs/nave repo

There was a comment here that pelias on arm64 worked fine - isaacs/nave#111 (comment). I'm not sure whether they were referring to their repo or pelias.

There are definitely some areas of concern that I hadn't had time to look into deeply. The main one is the Valhalla stack.

@missinglink
Member

regarding pbf2json, we are already building pbf2json.linux-arm and bundling it in the npm module:

path.join(__dirname, 'build', util.format( 'pbf2json.%s-%s', os.platform(), os.arch() ) )
/tmp ❯ ls -lah node_modules/pbf2json/build
total 22792
drwxr-xr-x  6 peter  wheel   192B Nov  4 10:02 .
drwxr-xr-x  7 peter  wheel   224B Nov  4 10:02 ..
-rwxr-xr-x  1 peter  wheel   5.8M Oct 26  1985 pbf2json.darwin-x64
-rwxr-xr-x  1 peter  wheel   1.7M Oct 26  1985 pbf2json.linux-arm
-rwxr-xr-x  1 peter  wheel   1.9M Oct 26  1985 pbf2json.linux-x64
-rwxr-xr-x  1 peter  wheel   1.8M Oct 26  1985 pbf2json.win32-x64

do we need pbf2json.darwin-arm? I'm assuming not since it will run in a linux docker container? if so, please open an issue on that repo and I'll have a look at how much work it is.

ref: https://github.com/pelias/pbf2json/tree/master/build

@jlowe000

jlowe000 commented Nov 4, 2021

I'm not 100% sure. But I think I had issues with the prebuilt arm version as I am running on 64bit arm.
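
A possible explanation, offered as a sketch rather than a confirmed diagnosis: Node.js reports os.arch() as 'arm64' on 64-bit ARM and 'arm' on 32-bit ARM, so the 'pbf2json.%s-%s' pattern quoted above would look for a linux-arm64 file that is not in the listing. The helper names below are hypothetical and only mimic that lookup from the shell.

```shell
#!/usr/bin/env bash
# Map a `uname -m` machine name to the value Node.js reports from os.arch().
node_arch() {
  case "$1" in
    x86_64) echo "x64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv6l|armv7l) echo "arm" ;;
    *) echo "$1" ;;  # pass anything else through unchanged
  esac
}

# Mirrors the 'pbf2json.%s-%s' pattern for a Linux host (hypothetical helper).
pbf2json_binary_name() {
  echo "pbf2json.linux-$(node_arch "$1")"
}

# Show which binary name the current machine would resolve to.
pbf2json_binary_name "$(uname -m)"
```

On a 64-bit ARM Linux host this resolves to a different file name than the bundled pbf2json.linux-arm.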

@missinglink
Member

It was simple enough: pelias/pbf2json#107

@ddelange

ddelange commented May 6, 2022

> Add ARM builds for all Docker images.

Does the scope of this issue include derivatives/auxiliaries like pelias/libpostal-service?

@missinglink
Member

> Does the scope of this issue include derivatives/auxiliaries like pelias/libpostal-service?

Yes, I think it's all-or-nothing, partial support isn't particularly helpful.

We would be happy to accept contributions, I doubt this will be picked up by the core team as we don't have a need for it.

@ddelange

ddelange commented Jul 18, 2022

> We would be happy to accept contributions

@missinglink could you point me to the CI that builds and pushes to dockerhub? I have some experience running multi-arch builds from raw dockerfile (buildx build) or from docker compose definition (buildx bake), particularly using QEMU emulator on GitHub Actions CI

@missinglink
Member

Each repo has its own CI script such as this:

https://github.com/pelias/api/blob/master/.github/workflows/push.yml#L42

Rather than repeat everything per-repo it just executes this script:

https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh

@ddelange

ddelange commented Jul 18, 2022

Thanks, so each repo would need:

     steps:
       - uses: actions/checkout@v2
+      - uses: docker/setup-qemu-action@v1
+      - uses: docker/setup-buildx-action@v1
       - name: Build Docker images
       ...

and the script would need:

-  docker build -t $tag .
-  docker push $tag
+  docker buildx build --push --platform=linux/amd64,linux/arm64,linux/arm/v7 -t $tag .

Do you expect build failures in some repos? getting dependencies from official apt repositories will generally be available for arm64 (not so sure about arm v7 but probably also ok), and so adding this platform should work ootb 🤔

Is there a complete list of repos somewhere that would need a PR? 36 of them? You could also move the workflow itself to pelias/ci-tools and call it from the child repos (ref: reusable workflows).

orangejulius added a commit to pelias/api that referenced this issue Jul 18, 2022
@orangejulius
Member

orangejulius commented Jul 18, 2022

Thanks @ddelange that was super helpful.

I tested that out with https://github.com/pelias/api/tree/arm-build and it worked just fine.

It is very likely there will be some build failures on some repos, so we'll have to start sorting through that now.

orangejulius added a commit to pelias/placeholder that referenced this issue Jul 18, 2022
@ddelange

Awesome, let me know if I can provide any further support!

@ddelange

I would be happy to contribute and get this through: would you prefer a core contributor to take care of it, or should I just open 36 PRs? If yes, would the PRs use the feature branch from ci-tools until further notice, like your PoC @orangejulius? Or what is your preferred order of things?

@ddelange

A second option is to pin to a specific commit of ci-tools. That way the PRs could be tested, but you'd be left having to do 36 PRs every time something changes in ci-tools, something I guess you wanted to avoid.

A third option: the PRs could also leave the link untouched (pointing to master), as my diff above is a non-breaking change. The potentially breaking change would then only come once ci-tools feature branch merges into master.

A fourth option: temporarily point to the feature-branch, test the PRs, once green, point back to master, merge it, and it only goes live once ci-tools feature branch merges into master.

I tend towards number 4 but I'm curious about your thoughts!

@ddelange

ddelange commented Aug 11, 2022

Opening 34 PRs under option 1 or 4. Probably, pelias/docker-baseimage#26 needs to merge (and release?) before most of them can be tested.

There's some repos that don't use github actions, I won't touch them for now (they'll break once ci-tools feature branch merges):

@ddelange

ddelange commented Aug 11, 2022

Realised the CI is not being triggered on my PRs because they're coming from a fork, so it's not technically a push to the repo.

Adding the pull_request trigger on top of the push trigger will work. It should error for me on the pushing to dockerhub phase, because PRs coming from a fork won't have access to the base repo's github secrets.

There is also the (dangerous) pull_request_target trigger where you'd need to bar the permissions carefully.

For PRs not from a fork, the push event will run in parallel to the pull_request triggered job, doing double minutes. To avoid that, a branch filter can be added to the push trigger to only run on master commits.

The triggering setup could be:

on:
  pull_request:
  push:
    branches: master

  # optional
  release:
    types: [released, prereleased]  # triggers workflow using the release tag as ref
  workflow_dispatch:  # allows running workflow manually from the Actions tab

edit: and I think it doesn't make sense for me to push this snippet, because I'm pretty sure the modified event triggers won't go into effect when coming from a fork :')

@ddelange

> Hi @ddelange, we are planning to discuss this in a team meeting today, can you please hold off any more PRs until we chat about what we'd like to do.

Hi @missinglink 👋

Any updates thus far?

@missinglink
Member

missinglink commented Sep 12, 2022

Agh sorry I forgot to write back, the pull requests are a pain to deal with since there's so many repos, so we'd like to avoid having to do them multiple times.

As they stand they are targeting ci-tools/buildx (a branch of ci-tools) and I'm guessing we'd have to do them all again to switch them back to master?

I think this is the right direction to go but I'm still a little concerned that maybe one or two repos might end up being more difficult (likely anything to do with libpostal).

So what I think is a good solution is to merge a ci-tools PR first which contains the change but also has the ability to disable ARM support via env var.

That would allow us to roll it out across the board and also have the ability to disable it for any repos where there are issues.

It seems to make sense to have multi-arch builds enabled by default and optionally disabled via an env var.

Does that sound like a plan?

@missinglink
Member

Looking at the code again, it might make sense to have a default "platforms" string and then allow it to be overridden by an env var, as this gives much more granular control.

pelias/ci-tools@master...buildx

@missinglink
Member

Question: is there any functional difference between docker build and docker buildx build --platform=linux/amd64?

@ddelange

ddelange commented Sep 12, 2022

the env var idea with multi-arch enabled by default sounds like a good plan! I recently took a similar approach here

> I'm guessing we'd have to do them all again to switch them back to master

yeah, the rollout here is a bit iffy. I would tend to option 4 but maybe your env var suggestion opens the door to even more possibilities? 🤔

> Question: is there any functional difference between docker build and docker buildx build --platform=linux/amd64?

short answer: no! buildx does couple the build with pushing the multi-arch manifest (so separating build and push, instead of build --push, would be a pain; for that you'd need regctl I think, but I haven't tried), but here that's no problem

@missinglink
Member

In CI there is no need for separation of build and push so we can move forward on this path, where we migrate to buildx and include the required build dependencies without any negative impact.

Of the two options, multi-arch enabled by default or multi-arch enabled via config, I'm confident the former is preferable, so I would advocate for it being enabled by default and disabled/adapted via config.

@missinglink
Member

So the immediate next step is opening a PR on ci-tools which defines a variable with the default --platform flags and makes that variable overridable via the :- bash convention.
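
A minimal sketch of that :- convention, under stated assumptions: the variable name DOCKER_PLATFORMS and the default platform list are hypothetical and may not match what ci-tools actually uses. The build command is printed rather than executed so the sketch runs without Docker.

```shell
#!/usr/bin/env bash
# Hypothetical default; the real ci-tools script may use different platforms.
DEFAULT_PLATFORMS="linux/amd64,linux/arm64"

# ${VAR:-default} expands to the default when VAR is unset or empty.
# DOCKER_PLATFORMS is an assumed name, not taken from ci-tools.
resolve_platforms() {
  echo "${DOCKER_PLATFORMS:-$DEFAULT_PLATFORMS}"
}

# Print (rather than run) the buildx command for a given image tag.
print_build_command() {
  local tag="$1"
  echo "docker buildx build --push --platform=$(resolve_platforms) -t ${tag} ."
}

print_build_command "pelias/api:master"
```

With this shape, a repo that hits ARM build problems could set DOCKER_PLATFORMS=linux/amd64 in its workflow env to opt out without any change to the shared script.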

Once that is merged to master we can go ahead and merge these PRs once they point to the master branch and we're done for now, with the option of testing and reconfiguring in the future with minimal effort

cc/ @orangejulius

@missinglink
Member

I'm off to bed now but I can open that PR tomorrow

@ddelange

Hi @missinglink 👋 was there any decision about the order in which to roll this out?

@ddelange

ddelange commented Nov 8, 2022

> I think it doesn't make sense for me to push this snippet, because I'm pretty sure the modified event triggers won't go into effect when coming from a fork :')

Does it make sense to keep all my PRs open?

Or should I close them and leave it up to the maintainers when and in which order to get this done?

@schmidp

schmidp commented Jul 31, 2023

Hey, is there any plan to enable multiarch soon? I'm trying to decide whether I should build the images myself, or whether I can help somehow to get the multiarch build merged.

@amuedespacher

ARM support would be greatly appreciated!

@missinglink
Member

Hi all,

I had a look over the outstanding ARM tickets today and there's a path forward with pelias/ci-tools#11 but also a significant amount of testing and possibly some dev work involved to fully support ARM and have it stable enough to use in a production environment.

It's unfortunately not as simple as just using buildx as the docker images contain a bunch of different tools and software out of our control which all needs to be multiarch capable and compiled accordingly.

Due to the amount of time required to set up and maintain ARM builds, I'm sorry to say Julian & I likely won't be able to take this on any time soon. I recently got a new M2 Mac so I can slowly chip away at these issues.

If you're in a position to sponsor some developer time to work on this, please email us [email protected]

@missinglink
Member

Hi all, I merged pelias/ci-tools#11 today; this is the first step to getting multi-arch builds working.

From now on, when PRs are merged in the Pelias repos it will produce a multiarch docker image which will run on both amd64 and arm64 🎉

The exceptions are the following repos, which I believe are still not capable of running on ARM because they depend on libpostal and there are downstream issues blocking it. Once those are resolved we should be able to get full ARM support.

@ddelange

Nice! And what about all my closed PRs linked above? Mainly the

- uses: docker/setup-qemu-action@v1
- uses: docker/setup-buildx-action@v1

addition which makes docker buildx build --platform work?

@missinglink
Member

@ddelange those don't seem to be required, we handle it all here:
https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh#L61-L68

@missinglink
Member

I've updated the pelias/baseimage to be multiarch, so as of today any derivative images can also be multiarch.

As a test I've managed to produce multiarch builds of both pelias/elasticsearch and pelias/polylines.

The latter required a conditional statement in the Dockerfile to detect which file to download, an example of this can be found in pelias/polylines#273
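
For readers who don't want to open that PR, here is a rough sketch of what such a conditional can look like inside a RUN step. It is not the code from pelias/polylines#273; the file naming scheme is hypothetical. TARGETARCH is set by BuildKit/buildx per target platform (for example amd64 or arm64) and has to be declared with ARG TARGETARCH in the build stage before a RUN command can read it.

```shell
#!/usr/bin/env bash
# Pick a release file name for the given architecture.
# "example-tool-linux-<arch>.tar.gz" is a made-up naming scheme for illustration.
release_filename() {
  local arch="$1"
  case "$arch" in
    amd64|arm64) echo "example-tool-linux-${arch}.tar.gz" ;;
    *) echo "unsupported architecture: ${arch}" >&2; return 1 ;;
  esac
}

# Fall back to amd64 when TARGETARCH is not set (e.g. a plain `docker build`
# without BuildKit).
release_filename "${TARGETARCH:-amd64}"
```

Failing loudly on an unknown architecture is a deliberate choice here, so a new platform added to the build matrix does not silently download an amd64 binary.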

@missinglink
Member

I think the final hurdle is going to be libpostal, there was some progress in supporting the Mac M1 chipset but it caused a regression and IIRC was removed.

It might be as simple as detecting the architecture and using the --disable-sse2 build flag.

@ddelange

for libpostal that flag should indeed work, we are producing (OpenBLAS accelerated) x86_64/arm64 images for libpostal here using TARGETARCH which is exposed by default by buildx.
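
A small sketch of the idea, assuming the only per-architecture difference is the --disable-sse2 flag; the function name and the /data path are hypothetical. The configure command is printed rather than run.

```shell
#!/usr/bin/env bash
# Build the ./configure argument list for libpostal from the target architecture.
# Hypothetical helper; --datadir and --disable-sse2 are the flags discussed here.
libpostal_configure_args() {
  local arch="$1" datadir="$2"
  local args="--datadir=${datadir}"
  case "$arch" in
    arm64) args="${args} --disable-sse2" ;;  # SSE2 is an x86-only instruction set
  esac
  echo "$args"
}

# Print the command that would run for the current build target.
echo "./configure $(libpostal_configure_args "${TARGETARCH:-amd64}" /data)"
```

A case statement is used instead of a [ "$x" == "y" ] test because == inside [ ] is a bash extension and is not accepted by every /bin/sh.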

> @ddelange those don't seem to be required, we handle it all here: https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh#L61-L68

that's good to know, seems github is now bundling qemu and buildx in the default runner. nice!

@missinglink
Member

I just published pelias/elasticsearch:7.17.15 which supports multiarch.
We can optionally rebuild older versions for multiarch if requested in the future.

@missinglink
Member

I managed to complete a portland-metro build on my Macbook M2 this morning with a couple of small changes to the docker-compose.yml file 🎉

It seems that Docker is now able to emulate AMD64 on ARM64 using QEMU (not Rosetta unfortunately), so a bunch of the images seem to 'just work', although I bet they are much slower under emulation.

In order to fully support ARM64 (for Graviton servers, Raspberry Pi etc.) we will need native ARM64 support for all the images; some just need to be rebuilt in CI, some need adjustments to work.

For reference, this command is handy, after pulling down the latest images (to grab the multiarch versions I've been building) you can run this to list what architecture docker pulled down for you:

docker image inspect --format "{{.Architecture}} {{.RepoTags}}" $(docker image ls -q -f 'reference=pelias/*')

If you're on an ARM64 machine you should see some of them prefixed with arm64.
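
To spot the images that are still running under emulation, the output of that command can be filtered. This is only a sketch: the function name is hypothetical, and the sample input (including pelias/example) stands in for real docker image inspect output.

```shell
#!/usr/bin/env bash
# Reads lines like "arm64 [pelias/api:master]" on stdin and prints the ones
# whose architecture column differs from the expected architecture.
list_non_native() {
  local expected="$1"
  awk -v want="$expected" '$1 != want { print }'
}

# Sample input in the same "<arch> [<tags>]" shape as the inspect format string.
printf '%s\n' \
  'arm64 [pelias/api:master]' \
  'amd64 [pelias/example:master]' | list_non_native arm64
```

In practice the docker image inspect command above would be piped straight into list_non_native arm64.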

@ddelange

ddelange commented Dec 1, 2023

I also wrote a one-liner to check out available architectures and sizes on the registry before pull: https://stackoverflow.com/a/73108928/5511061

@missinglink
Member

@ddelange I'm trying the workflow you have to disable SSE as per your Dockerfile but it results in an error, maybe a more recent commit of libpostal broke it?

Does the master branch of libpostal work for your build?

#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 make[2]: *** [Makefile:3956: libpostal-strndup.o] Error 1
#17 140.4 make[2]: *** Waiting for unfinished jobs....
#17 140.4 make[2]: *** [Makefile:3970: libpostal-main.o] Error 1
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 make[2]: *** [Makefile:3998: libpostal-file_utils.o] Error 1
#17 140.4 make[2]: *** [Makefile:3984: libpostal-json_encode.o] Error 1
#17 140.4 make[2]: Leaving directory '/code/libpostal/src'
#17 140.4 make[1]: *** [Makefile:464: all-recursive] Error 1
#17 140.4 make[1]: Leaving directory '/code/libpostal'
#17 140.4 make: *** [Makefile:373: all] Error 2
#17 ERROR: process "/dev/.buildkit_qemu_emulator /bin/sh -c ./bootstrap.sh &&     ([ \"$TARGETARCH\" == \"arm64\" ] && ./configure --datadir=\"$DATADIR\" --disable-sse2 || ./configure --datadir=\"$DATADIR\") &&     make -j4 &&     make install &&     ldconfig" did not complete successfully: exit code: 2

@ddelange

ddelange commented Dec 1, 2023

all green on our side... fwiw running on a ubuntu jammy image with latest build-essential installed

@missinglink
Member

missinglink commented Dec 1, 2023

ok cool, it took a couple of days dev work but now all images are available on ARM 🎉

I'm going to close this issue, please open individual bug reports if you have any issues.

arm64 [pelias/fuzzy-tester:master]
arm64 [pelias/transit:master]
arm64 [pelias/pip-service:master]
arm64 [pelias/placeholder:master]
arm64 [pelias/csv-importer:master]
arm64 [pelias/whosonfirst:master]
arm64 [pelias/elasticsearch:7.16.1]
arm64 [pelias/openaddresses:master]
arm64 [pelias/openstreetmap:master]
arm64 [pelias/api:master]
arm64 [pelias/interpolation:master]
arm64 [pelias/libpostal-service:latest]
arm64 [pelias/schema:master]
arm64 [pelias/elasticsearch:7.17.15]
arm64 [pelias/polylines:master]

@ddelange

ddelange commented Dec 1, 2023

awesome work! 💥

8 participants