Intel macOS qemu_podman-machine-default.sock: connect: no such file or directory #13609

Closed
fithisux opened this issue Mar 23, 2022 · 20 comments · Fixed by #13750
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine

Comments

@fithisux

fithisux commented Mar 23, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I upgraded my podman with brew to 4.0.2. I tried to use it but it does not work.

Steps to reproduce the issue:

brew uninstall podman
brew install podman
podman machine rm
podman machine init
podman machine start

Describe the results you received:

Starting machine "podman-machine-default"
INFO[0000] waiting for clients...
ERRO[0000] Error listening on socket: /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default/podman.sock: listen unix /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default/podman.sock: bind: invalid argument
Error: dial unix /var/folders/70/tpb1l8jd45s8zj742ljrmbs80000gp/T/podman/qemu_podman-machine-default.sock: connect: no such file or directory

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: Connection to bastion host (ssh://core@localhost:51528/run/user/502/podman/podman.sock) failed.: dial tcp [::1]:51528: connect: connection refused

Output of podman info --debug:

Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: Connection to bastion host (ssh://core@localhost:51528/run/user/502/podman/podman.sock) failed.: dial tcp [::1]:51528: connect: connection refused

Package info (e.g. output of rpm -q podman or apt list podman):

(paste your output here)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
Darwin C02CH2W2MD6R.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64 x86_64

@openshift-ci openshift-ci bot added kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. labels Mar 23, 2022
@fithisux fithisux changed the title Intel Macos qemu_podman-machine-default.sock: connect: no such file or directory Intel macOS qemu_podman-machine-default.sock: connect: no such file or directory Mar 23, 2022
@Luap99
Member

Luap99 commented Mar 23, 2022

On darwin and other BSD operating systems the maximum path length for the socket path seems to be 104 chars. On linux it is 108 chars. Your path is 105 chars long.

Can you try to create a user with a shorter name and see if this works?
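
The limit described above can be checked before trying to bind. The following is a minimal sketch, not podman code: the helper names `sunPathCapacity` and `fitsSunPath` and the example path are made up for illustration, and it assumes that one byte of the field is reserved for the terminating NUL (which, as far as I can tell, is how Go's `syscall` package treats non-abstract socket paths).

```go
package main

import (
	"fmt"
	"runtime"
)

// sunPathCapacity returns the size in bytes of the sun_path field for the
// given GOOS value: 104 on Darwin and the BSDs, 108 on Linux. Treating every
// other platform like Linux is an assumption made for this sketch.
func sunPathCapacity(goos string) int {
	switch goos {
	case "darwin", "freebsd", "netbsd", "openbsd", "dragonfly":
		return 104
	default:
		return 108
	}
}

// fitsSunPath reports whether path can be used as a Unix socket address.
// One byte is assumed to be reserved for the terminating NUL, so the path
// itself must be strictly shorter than the field.
func fitsSunPath(path, goos string) bool {
	return len(path) > 0 && len(path) < sunPathCapacity(goos)
}

func main() {
	// Hypothetical path, modelled on the layout shown in the error output.
	path := "/Users/example/.local/share/containers/podman/machine/podman-machine-default/podman.sock"
	fmt.Printf("%d bytes, fits on %s: %v\n", len(path), runtime.GOOS, fitsSunPath(path, runtime.GOOS))
}
```

Passing the GOOS value as a parameter keeps the check testable on any host; a real caller would pass `runtime.GOOS`.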

@baude
Member

baude commented Mar 23, 2022

consider the following:

  1. Run podman system connection ls. How many listings are there? Are you certain the default listing is correct for the default machine? If you don't know and don't care, rm the machine, then double-check the system connections. If some still exist, remove them and start over at init.
  2. Do the directories being cited exist? Do you have write permissions to them?
  3. Are there any security policies on your Mac that are blocking things?
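
The second check in the list above (do the directories exist, and are they writable) could be scripted roughly as follows. This is only a sketch: `checkSocketDir` is a hypothetical helper, and the directories in `main` are guesses modelled on the paths in the error output, not locations podman is guaranteed to use.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkSocketDir verifies that dir exists, is a directory, and that the
// current user can create files in it. Writability is probed by creating
// and removing a temporary file rather than by inspecting mode bits.
func checkSocketDir(dir string) error {
	info, err := os.Stat(dir)
	if err != nil {
		return fmt.Errorf("stat %s: %w", dir, err)
	}
	if !info.IsDir() {
		return fmt.Errorf("%s is not a directory", dir)
	}
	probe, err := os.CreateTemp(dir, ".write-probe-*")
	if err != nil {
		return fmt.Errorf("cannot write to %s: %w", dir, err)
	}
	probe.Close()
	return os.Remove(probe.Name())
}

func main() {
	// Hypothetical locations, modelled on the paths in the error output.
	dirs := []string{filepath.Join(os.TempDir(), "podman")}
	if home, err := os.UserHomeDir(); err == nil {
		dirs = append(dirs, filepath.Join(home, ".local", "share", "containers", "podman", "machine"))
	}
	for _, d := range dirs {
		if err := checkSocketDir(d); err != nil {
			fmt.Println("problem:", err)
			continue
		}
		fmt.Println("ok:", d)
	}
}
```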

@baude
Member

baude commented Mar 23, 2022

wooh, @Luap99 nice find.

@Luap99
Member

Luap99 commented Mar 23, 2022

@baude Maybe this workaround helps:

// workaround to bypass the 108 char socket path limit
// open the fd and use the path to the fd as bind argument
fd, err := unix.Open(socketDir, unix.O_PATH, 0)
if err != nil {
	return err
}
socket, err := net.ListenUnix("unixpacket", &net.UnixAddr{Name: fmt.Sprintf("/proc/self/fd/%d/%s", fd, cfg.ContainerID), Net: "unixpacket"})
if err != nil {
	return err
}
err = unix.Close(fd)
// remove the socket file on exit
defer os.Remove(socketfile)
if err != nil {
	logrus.Warnf("Failed to close the socketDir fd: %v", err)
}
defer socket.Close()

But I have no idea if this would work on darwin.

@baude
Member

baude commented Mar 23, 2022

@Luap99 or @fithisux you could also create a shortened machine name. i.e. podman init --now foo

@baude
Member

baude commented Mar 23, 2022

verified on my m1 that indeed the long name throws the same errors.

@Luap99 Luap99 removed the kind/feature Categorizes issue or PR as related to a new feature. label Mar 23, 2022
@baude baude self-assigned this Mar 23, 2022
@baude
Member

baude commented Mar 23, 2022

I've got this one.

@fithisux
Author

> On darwin and other BSD operating systems the maximum path length for the socket path seems to be 104 chars. On linux it is 108 chars. Your path is 105 chars long.
>
> Can you try to create a user with a shorter name and see if this works?

Unfortunately this is not possible. But my issue is that the files / sockets are not there at all:

Last login: Wed Mar 23 15:14:37 on ttys001
➜ ~ ls /var/folders/70/tpb1l8jd45s8zj742ljrmbs80000gp/T/podman

➜ ~ ls -l /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default

➜ ~

@fithisux
Author

> podman init --now foo

This does not seem to work

➜ ~ podman init --now foo
Error: unknown flag: --now
See 'podman init --help'
➜ ~

@Luap99
Member

Luap99 commented Mar 23, 2022

use podman machine init ...

@fithisux
Author

> use podman machine init ...

Works now. Thank you.

@Luap99 Luap99 added the machine label Mar 23, 2022
@baude baude added the 4.1 label Mar 23, 2022
baude added a commit to baude/podman that referenced this issue Mar 24, 2022
If the machine name is long and/or the username is long, it is possible to exceed the 104-byte limit for socket paths on macOS. This PR shortcuts things to $HOME/.podman instead.

Fixes: containers#13609

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <[email protected]>
@afbjorklund
Contributor

You should still be able to use ssh sockets just fine, it's just the legacy unix sockets that have this limit.

baude added a commit to baude/podman that referenced this issue Apr 5, 2022
to avoid errors on macos, we use symlinks to long socket names.

Fixes: containers#12751
Fixes: containers#13609

Signed-off-by: Brent Baude <[email protected]>

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <[email protected]>
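
The commit message above only says that symlinks to long socket names are used. As an illustration of why that helps, here is a minimal sketch under my own assumptions: `listenViaSymlink`, `demo`, and the directory layout are hypothetical and are not taken from the podman source. The idea is that only the string handed to bind(2) has to fit into `sun_path`, so binding through a short symlink works even when the real directory path is too long.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// listenViaSymlink binds a Unix socket named sockName inside longDir, but
// passes the kernel a short path that goes through the symlink shortLink.
func listenViaSymlink(longDir, shortLink, sockName string) (net.Listener, error) {
	// Replace a stale link from an earlier run, if any.
	if err := os.Remove(shortLink); err != nil && !os.IsNotExist(err) {
		return nil, err
	}
	if err := os.Symlink(longDir, shortLink); err != nil {
		return nil, err
	}
	return net.Listen("unix", filepath.Join(shortLink, sockName))
}

// demo creates a directory whose socket path would exceed the limit, listens
// through a short symlink, and reports whether a socket file then exists at
// the long path.
func demo() (bool, error) {
	base, err := os.MkdirTemp("", "sock")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(base)

	// A 120-character directory name pushes the full path past 108 bytes.
	longDir := filepath.Join(base, strings.Repeat("d", 120))
	if err := os.Mkdir(longDir, 0o700); err != nil {
		return false, err
	}
	ln, err := listenViaSymlink(longDir, filepath.Join(base, "l"), "podman.sock")
	if err != nil {
		return false, err
	}
	defer ln.Close()

	info, err := os.Stat(filepath.Join(longDir, "podman.sock"))
	if err != nil {
		return false, err
	}
	return info.Mode()&os.ModeSocket != 0, nil
}

func main() {
	found, err := demo()
	fmt.Println("socket found at long path:", found, "error:", err)
}
```

Clients have to dial the short path as well, since the same length limit applies to connect(2).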
@pigping88

mac apple A1 , podman 4.3.1
podman machine init
podman machine start

Starting machine "podman-machine-default"
Waiting for VM ...
Error: dial unix /var/folders/f3/v585c0593yg1v4qqs5tjnw600000gn/T/podman/podman-machine-default_ready.sock: connect: no such file or directory

podman machine ssh

Connecting to vm podman-machine-default. To close connection, use `~.` or `exit`
Fedora CoreOS 37.20221211.2.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos

@ssbarnea
Collaborator

Reopening because this issue is still happening with GHA runners as of today; see https://github.com/ansible/vscode-ansible/actions/runs/4005037985/jobs/6874912745

2023-01-25T10:34:30.5842600Z Downloading VM image: fedora-coreos-37.20230110.2.0-qemu.x8…
2023-01-25T10:34:30.7628400Z Downloading VM image: fedora-coreos-37.20230110.2.0-qemu.x8…
2023-01-25T10:34:37.9151470Z Extracting compressed file
2023-01-25T10:35:09.9899520Z Image resized.
2023-01-25T10:35:09.9923940Z Machine init complete
2023-01-25T10:35:09.9935290Z Starting machine "podman-machine-default"
2023-01-25T10:35:10.5067840Z Waiting for VM ...
2023-01-25T10:35:13.5424550Z Error: dial unix /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/podman/podman-machine-default_ready.sock: connect: no such file or directory
2023-01-25T10:35:13.5717740Z task: Failed to run task "setup": exit status 125
2023-01-25T10:35:13.5751490Z ##[error]Process completed with exit code 1.

I think that the command that caused it to fail was podman machine init --now.

@ssbarnea
Collaborator

@baude Any idea what could have caused this regression? What can we do to make the podman initialization reliable on GHA?

@baude
Member

baude commented Jan 25, 2023

this is sort of a general error for "something went wrong". it is entirely possible that it is different than this issue .... does removing the machine and recreating it help?

@ssbarnea
Collaborator

While that is OK locally, turn-it-off-and-on does not really work with GHA. Still, I observed that while it does reproduce, it does not always reproduce, so there is a level of randomness in it.

If you could provide some hints regarding how we can get extra logs when it happens, I might be able to alter the GHA pipelines to collect the extra information.

If we manage to get podman to be reliable on GHA, we might have a chance of convincing GitHub to add it to the default runner image.
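
One possible way to collect more information in CI, offered only as a sketch: poll the socket for a bounded time and report how many attempts were made and what the last dial error was. `waitForSocket` is a hypothetical helper and the socket path in `main` is made up; none of this reflects how podman itself waits for the VM.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net"
	"time"
)

// waitForSocket dials the Unix socket at path until it accepts a connection
// or the timeout expires. It returns the number of attempts made and, on
// failure, an error wrapping the last dial error.
func waitForSocket(path string, timeout, interval time.Duration) (int, error) {
	deadline := time.Now().Add(timeout)
	for attempt := 1; ; attempt++ {
		conn, err := net.DialTimeout("unix", path, interval)
		if err == nil {
			conn.Close()
			return attempt, nil
		}
		if time.Now().After(deadline) {
			return attempt, fmt.Errorf("socket %s not ready after %d attempts: %w", path, attempt, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Hypothetical path; a CI job would pass the socket named in the error.
	attempts, err := waitForSocket("/nonexistent/ready.sock", 300*time.Millisecond, 100*time.Millisecond)
	fmt.Println("attempts:", attempts)
	fmt.Println("error:", err)
	// Distinguishes a socket file that never appeared from other failures.
	fmt.Println("missing file:", errors.Is(err, fs.ErrNotExist))
}
```

Separating "no such file or directory" from other dial errors such as "connection refused" may help tell a VM that never created its socket from one that is up but not yet listening.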

ssbarnea added a commit to ansible/vscode-ansible that referenced this issue Mar 10, 2023
Due to podman bootstrapping flakiness on GitHub runners, we make
the macOS testing non-voting for the final outcome. We continue to
run it in order to see how it evolves over time and hopefully find a
permanent fix for it.

Related: containers/podman#13609
@ctrought

ctrought commented Mar 12, 2023

My system (M1 Mac) hit an out-of-memory condition while podman was running, which led to issue #16945 on startup. I rebooted, and now podman won't start up and hits the error in this issue :/ Is there any workaround that does not involve wiping the original podman machine?

$ podman machine start  --log-level debug
INFO[0000] podman filtering at log level debug
Starting machine "podman-machine-default"
[/opt/podman/qemu/bin/gvproxy -listen-qemu unix:///var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/qmp_podman-machine-default.sock -pid-file /var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/podman-machine-default_proxy.pid -ssh-port 50044 -forward-sock /Users/XXXXXXXXXXXX/.local/share/containers/podman/machine/podman-machine-default/podman.sock -forward-dest /run/user/502/podman/podman.sock -forward-user core -forward-identity /Users/XXXXXXXXXXXX/.ssh/podman-machine-default --debug
Error: dial unix /var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/qmp_podman-machine-default.sock: connect: no such file or directory

@spencerrung

spencerrung commented Mar 16, 2023

Agreed, this should be fixed, but here is an easy workaround for M1 Mac users:

podman machine stop
podman machine rm
podman machine init
podman machine start

@vrothberg
Member

That has been fixed in the meantime, closing.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Jan 17, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 17, 2024
9 participants