Enhancement Request: CPU architecture aware box handling #12610

Closed
fatso83 opened this issue Dec 7, 2021 · 16 comments · Fixed by #13239

Comments

@fatso83

fatso83 commented Dec 7, 2021

Is your feature request related to a problem? Please describe.
A lot of developers are now on ARM architectures. Even if they can get VMware Fusion playing nicely with Vagrant, the problem of finding boxes that run on the right provider remains unsolved. You cannot filter boxes on Vagrant Cloud by architecture, nor will you get a good error message if a box does not run on your provider due to an architecture mismatch.

Describe the solution you'd like
Two things:

  • Vagrant Cloud should require an architecture field for new boxes and let users filter on amd64 (x86?), arm64, etc. (from a list of enums) when searching for images.
  • When specifying bionic64 in your Vagrantfile, and that box does not exist for your architecture, you should be told through an error message.

Describe alternatives you've considered
Clear documentation mentioning the architecture issues.

Additional context

@fatso83 fatso83 changed the title Enhancement Request: Metadata support to differentiate boxes on CPU architecture Enhancement Request: Built-in support for handling CPU architecture Dec 9, 2021
@fatso83 fatso83 changed the title Enhancement Request: Built-in support for handling CPU architecture Enhancement Request: Explicit support for handling boxes with incompatible CPU architecture Dec 9, 2021
@fatso83 fatso83 changed the title Enhancement Request: Explicit support for handling boxes with incompatible CPU architecture Enhancement Request: Handling boxes with incompatible CPU architecture Dec 9, 2021
@jcsalem

jcsalem commented Dec 19, 2021

Yes, I need this feature as well.
I got Vagrant working well with Parallels but I have to create a separate box name just for the arm version.
Particularly since Vagrant runs under Rosetta on the Apple M1, it was challenging to make my Vagrantfile select the appropriate box name based on the underlying CPU architecture.

@brycekahle

it was challenging to make my Vagrantfile select the appropriate box name based on the underlying CPU architecture.

@jcsalem Mind sharing how you did this? I'm running into the same issue.

@fatso83
Author

fatso83 commented Feb 3, 2022

@brycekahle Since the Vagrantfile is just a Ruby script, you just need to switch on architecture to set the box name.

The pre-requisite is of course that you create differently named boxes for each architecture. Like @brycekahle/ubuntu-focal-amd64 and @brycekahle/ubuntu-focal-arm64.

# one alternative: `uname -m`
architecture = do_something_to_get_some_architecture_abbreviation()

Vagrant.configure("2") do |config|
  config.vm.box = "@brycekahle/ubuntu-focal-#{architecture}" 
  # the rest
  # ...
end

All that is left is then to implement do_something_to_get_some_architecture_abbreviation() or use some built-in. I am no Ruby programmer, so calling out to shell utils like uname -m could be one way of achieving this, but there are probably native built-ins that are preferable 😃
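One way to fill in that placeholder without shelling out, as a minimal sketch: Ruby's standard `Etc.uname` exposes the same machine string that `uname -m` prints. The helper name `box_arch_suffix` and the suffix mapping are assumptions for illustration only. The result describes the architecture the Ruby process sees, which may differ from the actual hardware if Ruby itself runs under a translation layer.

```ruby
require 'etc'

# Hypothetical helper (not part of Vagrant): map a `uname -m` style machine
# string to a box-name suffix. The amd64/arm64 suffixes are an assumed
# naming convention for your own boxes.
def box_arch_suffix(machine = Etc.uname[:machine])
  case machine
  when 'x86_64', 'amd64' then 'amd64'
  when 'arm64', 'aarch64' then 'arm64'
  else raise ArgumentError, "unrecognized architecture: #{machine}"
  end
end
```

In the Vagrantfile above, `architecture = box_arch_suffix` would then replace the placeholder call.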

@brycekahle

@fatso83 Yeah, the challenge is that Vagrant still runs under Rosetta, so builtins often return the wrong result. I ended up reading the sysctl that indicates whether you are running under Rosetta as a signal that it was an Apple Silicon box, but that is not perfect.

@fredngo

fredngo commented Mar 8, 2022

I ended up reading the sysctl

@brycekahle Mind posting your workaround? Thank you!

@fredngo

fredngo commented Mar 8, 2022

All that is left is then to implement do_something_to_get_some_architecture_abbreviation() or use some built-in. I am no Ruby programmer, so calling out to shell utils like uname -m could be one way of achieving this, but there are probably native built-ins that are preferable 😃

@fatso83 I tried to do this in my Vagrantfile:

Vagrant.configure('2') do |config|

  # Backticks keep the trailing newline, so strip it before comparing
  arch = `arch`.strip

  config.vm.box = if arch == 'arm64'
    'whichever/ubuntu-server-20.04-arm64'
  elsif arch == 'i386'
    'whichever/ubuntu-server-20.04-i386'
  end

  ...
end

But it turns out that since Vagrant is running under Rosetta, the arch command always returns i386 even when it's running on an Apple Silicon Mac, so I can't decide on the box at runtime!

If anyone has a workaround, please let me know.

@glynnforrest

But it turns out that since Vagrant is running under Rosetta, the arch command always returns i386 even when it's running on an Apple Silicon Mac, so I can't decide on the box at runtime!

I haven't had time to test this yet, but I believe sysctl -in sysctl.proc_translated will give different exit codes on the host and Rosetta 🤞

@fatso83
Author

fatso83 commented Mar 8, 2022

@glynnforrest That seems very promising as it seems to align well with this accepted answer that lists a small C snippet one could use, which checks just that.


EDIT
Ooh, lots of expanded information on what this is doing here in this SO answer, along with optional detection mechanisms.


P.S. At first I was a bit at a loss on how this could work, since no sysctl.* keys were listed in the main sysctl man page, but at the very bottom it says

More variables than these exist, and the best and likely only place to search for their deeper
meaning is undoubtedly the source where they are defined.

P.P.S. Why do you need to pass -i? Seems to work fine without.

 sysctl -n sysctl.proc_translated
0

@stephenreay

(For context: I both use vagrant as a developer, and build various boxes)

As much as there are workarounds with naming, I think the original request still has a lot of merit. We define the provider for a box (rather than having say foo/debian11-parallels and foo/debian11-vmware etc).

The arch suffix thing is already kind of common (i.e. the 64 suffix in some, or e.g. I use -i386 and -amd64 suffixes) but unlike the i386 issue which is likely to just go away as non-64bit PCs fade into insignificance, we're going to have Arm and x86 with significant numbers each for a significant period of time.

@fredngo

fredngo commented Apr 26, 2022

P.P.S. Why do you need to pass -i? Seems to work fine without.

Seems like -i is needed to prevent error output when sysctl.proc_translated doesn't exist.

On my Intel box running Big Sur:

% sysctl -n sysctl.proc_translated
sysctl: unknown oid 'sysctl.proc_translated'
% sysctl -in sysctl.proc_translated
% 

On my M1 box running Monterey:

% sysctl -n sysctl.proc_translated
0
% sysctl -in sysctl.proc_translated
0
% 
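The transcripts above suggest three cases a Vagrantfile may want to tell apart. A minimal sketch, assuming the output is empty when the key does not exist, `0` for a native process on Apple Silicon, and `1` for a process translated by Rosetta (the `1` case is an assumption here, and the helper name and symbols are hypothetical):

```ruby
# Hypothetical helper: interpret the output of
# `sysctl -in sysctl.proc_translated` passed in as a string.
def classify_proc_translated(output)
  case output.strip
  when '0' then :native_arm64        # Apple Silicon, not translated
  when '1' then :rosetta             # Apple Silicon, translated (assumed)
  else :not_apple_silicon            # key missing, e.g. an Intel Mac
  end
end

# Possible use inside a Vagrantfile:
#   state = classify_proc_translated(`sysctl -in sysctl.proc_translated`)
```

Keeping the parsing separate from the shell call makes the logic easy to check without access to both kinds of machine.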

@fredngo

fredngo commented Apr 26, 2022

So this is what I ended up with in my Vagrantfile based on @glynnforrest and @brycekahle 's suggestion:

# Tells us if the host system is an Apple Silicon Mac running Rosetta
def running_rosetta()
  !`sysctl -in sysctl.proc_translated`.strip().to_i.zero?
end

Vagrant.configure('2') do |config|
  arch = `arch`.strip()
  if arch == 'arm64' || (arch == 'i386' && running_rosetta())
    config.vm.box = 'bytesguy/ubuntu-server-20.04-arm64'
    config.vm.box_version = '1.0.0'
  else
    puts "This appears to be an Intel machine! ... Not supported just yet."
    exit
  end

  ...
end

It seems to work fine! On my M1 Mac (running Monterey) the proper box is selected, and on an Intel Mac (running Big Sur) vagrant up just exits out with the "Not supported just yet" message.

This unblocks me for now while I work on the M1 version of the VM, and eventually I'll develop the Intel version as well.

Thanks everyone!

Edit: I added a check for arch to future-proof a bit; assuming that eventually arch will return the right architecture someday

@fatso83 fatso83 changed the title Enhancement Request: Handling boxes with incompatible CPU architecture Enhancement Request: CPU architecture aware box handling Apr 27, 2022
@jhgorse

jhgorse commented May 4, 2022

@fredngo Excellent work! I've extended it and simplified.

# Tells us if the host system is an Apple Silicon Mac, even when running under Rosetta
def is_arm64()
  # Backtick output ends with a newline, so strip it before comparing
  `uname -m`.strip() == "arm64" || `/usr/bin/arch -64 sh -c "sysctl -in sysctl.proc_translated"`.strip() == "0"
end

Vagrant.configure('2') do |config|
  if is_arm64()
    config.vm.box = 'bytesguy/ubuntu-server-20.04-arm64'
    config.vm.box_version = '1.0.0'
  elsif `uname -m`.strip() == "x86_64"
    config.vm.box = 'bento/ubuntu-20.04'
  end

  ...
end

Unit testing below.

on m1 mac within rosetta:

arch -arch x86_64 bash
sysctl -in sysctl.proc_translated
1
arch -64 /bin/sh -c "sysctl -in sysctl.proc_translated"
0
arch -arch x86_64 /bin/sh -c "sysctl -in sysctl.proc_translated"
1

on x86 mac returns no output:

arch -64 sh -c "sysctl -in sysctl.proc_translated"

Edit: zero? gave a false positive on x86, where the empty output string converts to the integer 0. Just compare the string instead.

@ladar
Contributor

ladar commented Nov 8, 2022

I'm the guy making the Roboxes. Those are the generic base boxes on Vagrant Cloud. At the moment we're only building x64 boxes (plus a handful of x32 boxes), but that will (hopefully) be changing soon.

I'm still waiting on the hardware to arrive, but if everything goes according to plan, we'll also be generating a64 versions of our most popular boxes by the end of the year.

The build server we hope to get will be running Linux, so at least initially, we'll only be generating a64 boxes for the libvirt provider.

For the VMware and VirtualBox providers, we'll need to wait until VMware/Oracle add support for a64 Linux hosts. As of now, those vendors only offer a64 releases for MacOS (and its M1/M2 chips).

As of now, our Windows and MacOS build robots all have Intel chips, so I don't expect a64 versions for those providers for a while.

The status update above was just the preface to my feature request. When the a64 boxes start arriving, I would like to be able to upload them to our existing box repos, so they can be hosted alongside their x64 counterparts. What I'm trying to say is that a64/x64 boxes for a given provider should be able to coexist the same way VMware and libvirt versions of boxes coexist today. When a user requests a given box, they are presented with a list to pick from, showing all of the different providers available. CPU arches should work the same way. If the user picks libvirt, they would see a list of the different arches available for that provider, and get to pick.

Presumably a global config flag could be used to configure the default behavior, like so!

If arch is set to ask, then Vagrant should always ask the user to pick when multiple arches are available. Likewise, the value host-only should automatically pick boxes which match the host architecture. If no exact match is available, then Vagrant should present a warning and ask the user to choose from what is available.

Of course arch should also support specific arch values, like x32, x64, a32, a64 (my preferred naming convention, but x86, x86_64, armv7a, armv8a also works. Likewise you could also use amd64 and aarch64).

When set to use a specific arch, Vagrant would only continue if a given repo+arch match is available. Otherwise the request should fail.

A command line flag, --arch=, should be added which lets a user override this global default and dictate a given arch value. Of course, if the value provided to --arch= isn't available for the given provider, then the up / box add request should fail, the same way a global arch value would also fail.

In the nice-to-have category, it should be possible to specify an array of arches when setting the global default. When set up this way, Vagrant should use the first arch value in the array that matches what a repo provides. This would allow someone to prefer a64, then a32, and finally x64. But such a setup would still reject x32, r64 (RISC-V) and p64 / p64le (OpenPower) boxes (for example).

Note, the --arch= option should probably be restricted to accepting a single value, and not an array, so it works the same way --provider= does now, i.e. if a user tries to pass Vagrant the CLI option --provider=awesomesauce, it fails because no awesomesauce boxes exist (yet).

The final configuration to discuss is what happens when the arch config value is explicitly set to auto. Presumably this option would also be used as the implied/default value, i.e. when arch isn't explicitly configured.

@ladar
Contributor

ladar commented Nov 8, 2022

I submitted early.

For auto, Vagrant should match boxes in the same architecture family automatically, with a preference order that picks the box closest to the host. So x64 would prefer x64, then x32, while x32 would only accept x32.

If no boxes are available in the same arch family, then Vagrant should present a list of options that are available, (with a possible warning as someone else suggested), and allow the user to pick....
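To make the proposed semantics concrete, here is a rough sketch of the selection rules described so far. Everything in it is hypothetical: `pick_arch`, `ARCH_FALLBACKS`, and the `:ask` return value do not exist in Vagrant, the fallback table only covers the x64/x32 family mentioned above, and the list of available arches is assumed to be non-empty.

```ruby
# Hypothetical fallback table: arches a host may accept, closest match first.
# Hosts not listed here only accept their own arch.
ARCH_FALLBACKS = {
  'x64' => %w[x64 x32],
  'x32' => %w[x32]
}.freeze

# Returns the chosen arch, or :ask when the user would be prompted to pick.
# Raises when a specific arch (or list of arches) was requested and none matches.
def pick_arch(setting, host_arch, available)
  case setting
  when 'ask'
    available.size > 1 ? :ask : available.first
  when 'host-only'
    available.include?(host_arch) ? host_arch : :ask
  when 'auto', nil
    candidates = ARCH_FALLBACKS.fetch(host_arch, [host_arch])
    candidates.find { |a| available.include?(a) } || :ask
  else
    wanted = Array(setting) # a single arch or an ordered preference list
    wanted.find { |a| available.include?(a) } ||
      raise(ArgumentError, "no box for arch #{wanted.join(', ')}")
  end
end
```

For example, `pick_arch('auto', 'x64', %w[x32 a64])` falls back to the x32 box, while `pick_arch('a64', 'x64', %w[x64])` raises because a specific arch was demanded and is missing.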

Would anyone else like to offer thoughts on how this should work?

Personally, I believe this issue is critical and time sensitive. Vagrant and Vagrant Cloud both need to be improved so they can handle different CPU arches logically, or we will end up with a mess of technical debt.

If we can't store a64 and x64 boxes in the same Vagrant Cloud repo, like we currently do with different providers, then alt arch boxes will end up in distinct repos. For the Roboxes project, I went with a different org for the x32 boxes, so I could use consistent repo names for both x32 and x64.

This may not sound like much, but critically it means users won't be able to use/distribute a common Vagrantfile that will work, unmodified, regardless of the CPU arch.

@chrisroberts
Member

Hi everyone,

Thanks for all the information and suggestions provided in here! I am currently working on identifying all the changes required to make this work while also preventing those changes from breaking existing Vagrant releases.

@websafe

websafe commented Jun 8, 2023

Yes, please :-) I'm working on Windows and MacOS M1 on the same projects and it's problematic 8-)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 28, 2023
10 participants