Systemctl with user flag don't work #375

Open
aikooo7 opened this issue Dec 30, 2023 · 16 comments
Labels
bug (Something isn't working), wontfix (This will not be worked on)

Comments

@aikooo7

aikooo7 commented Dec 30, 2023

Bug description

Running any systemctl command with the --user flag, such as systemctl list-unit-files --user --state=enabled, with native systemd support activated gives the error: Failed to connect to bus: No such file or directory

To Reproduce

Steps to reproduce the behavior:

  1. Activate native systemd by adding wsl.nativeSystemd = true; to /etc/nixos/configuration.nix
  2. Run systemctl list-unit-files --user --state=enabled

Expected behavior

The command executes without errors.

With the example command, it should display all enabled unit files.

Logs

Similar to #165, running systemctl list-unit-files --user --state=enabled with native systemd support turned off works perfectly:

UNIT FILE                STATE   PRESET
emacs.service            enabled enabled
nixos-activation.service enabled enabled
dbus.socket              enabled enabled
gpg-agent-ssh.socket     enabled enabled
gpg-agent.socket         enabled enabled

5 unit files listed.

While with native systemd support turned on, it errors:

Failed to connect to bus: No such file or directory
aikooo7 added the bug (Something isn't working) label Dec 30, 2023
aikooo7 changed the title from "Systemd with user flag don't work" to "Systemctl with user flag don't work" Dec 30, 2023
@SuperSandro2000
Member

Are you using WSL2?

SuperSandro2000 added the question (Further information is requested) label and removed the bug (Something isn't working) label Dec 31, 2023
@aikooo7
Author

aikooo7 commented Dec 31, 2023

I am

@nzbr
Member

nzbr commented Jan 5, 2024

This is a bug/missing feature in WSL. My Ubuntu distro with systemd enabled, for example, behaves exactly the same.
syschdemd included a workaround that made this work, but Microsoft's systemd implementation (which we call native) does not have that.

nzbr added the bug (Something isn't working) and wontfix (This will not be worked on) labels and removed the question (Further information is requested) label Jan 5, 2024
@nzbr
Member

nzbr commented Jan 5, 2024

I'll leave this open, because I do in fact consider this a bug, but in WSL, not here.
There might be possible workarounds, but I'd much rather see Microsoft fix it

@aikooo7
Author

aikooo7 commented Jan 6, 2024

This is a bug/missing feature in WSL. My Ubuntu distro with systemd enabled for example behaves exactly the same. syschdemd included a workaround that made this work, but Microsoft's systemd implementation (which we call native) does not have that

Alright, I will open an issue in the WSL repo and keep you/this issue updated.

@wyndon

wyndon commented Aug 7, 2024

The issue isn't always present. Right now I do not have the issue, whereas some days ago I was encountering it.

It's worth noting, though, that services and so on take a while to start up (at least for me, even nix-daemon), so if you try to reproduce the issue directly after the shell becomes available, you'll 100% encounter it. Check that startup has actually finished, using htop or similar, before attempting to reproduce; you should see a bunch of stuff starting.
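
A quick way to tell "startup still in progress" apart from "the user manager never came up" is to look for the user manager's sockets before running systemctl --user. The following is only a minimal sketch: it assumes the sockets live in the conventional places under the runtime directory (bus for the session D-Bus, systemd/private for the manager itself), and user_bus_state is a hypothetical helper name, not something NixOS-WSL or WSL ships.

```shell
# Hypothetical helper: report whether the per-user systemd sockets exist yet.
# Assumption: the conventional layout under the runtime directory
# (normally /run/user/<uid>); pass a different directory as $1 to override.
user_bus_state() {
    runtime_dir="${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}}"
    if [ ! -d "$runtime_dir" ]; then
        # The runtime directory itself has not been created yet.
        echo "no-runtime-dir"
    elif [ -S "$runtime_dir/bus" ] || [ -S "$runtime_dir/systemd/private" ]; then
        # At least one socket that systemctl --user could connect to exists.
        echo "ready"
    else
        echo "not-ready"
    fi
}
```

Calling user_bus_state right after the shell opens and again a few seconds later shows whether the sockets simply arrive late. systemctl is-system-running --wait may also help here, since it blocks until the system manager has finished booting.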

@paperdev-code

paperdev-code commented Aug 14, 2024

Experiencing this too: programs.ssh.startAgent wasn't working for me. Using nativeSystemd = false; solves the issue, but really isn't ideal...

$ wsl --version
WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.22631.4037

Is there an active issue for this on the WSL(g?) repo?

@nialov

nialov commented Aug 16, 2024

This is an issue for me as well with, e.g., ssh-agent from home-manager. The user systemd unit is not enabled/working with native systemd enabled.

systemctl --user
# Failed to connect to bus: No such file or directory
WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.19045.4651

@Prince213
Contributor

Related WSL2 issues:

@xieve

xieve commented Sep 12, 2024

My workaround is to set this as the shell command in my terminal (run as root):

/run/current-system/sw/bin/zsh -c \
"until [ -S /run/dbus/system_bus_socket ]; \
 do sleep 1; \
done; \
systemctl restart user@1000; \
export DBUS_SESSION_BUS_ADDRESS='unix:path=/run/user/1000/bus'; \
exec sudo --preserve-env=DBUS_SESSION_BUS_ADDRESS --user xieve zsh"

It should be possible to incorporate a similar workaround into NixOS-WSL, if that is deemed appropriate. I also tried using login, which would be the "cleaner" version in my eyes, but it resets PATH to include only Unix binaries.
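
The until loop in the workaround above waits forever if the system bus socket never appears. A variation with an upper bound could look like the sketch below. wait_for_socket is a hypothetical helper name and the 30 second default is an arbitrary assumption, not a value taken from this thread.

```shell
# Hypothetical helper: wait until a Unix socket appears, giving up after a timeout.
# Usage: wait_for_socket PATH [TIMEOUT_SECONDS]
wait_for_socket() {
    sock="$1"
    timeout="${2:-30}"   # assumed default, adjust to taste
    elapsed=0
    until [ -S "$sock" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out waiting for $sock" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

In the workaround, wait_for_socket /run/dbus/system_bus_socket 60 || exit 1 could stand in for the until loop, so that the terminal does not hang indefinitely when the socket never shows up.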

@antoineco
Contributor

Is this still relevant? I just performed a fresh installation of NixOS-WSL with all the default settings (in particular the native systemd integration), and user scope is working without issues:

$ systemctl list-unit-files --user --state=enabled
UNIT FILE                STATE   PRESET
nixos-activation.service enabled enabled
ssh-agent.service        enabled enabled
dbus.socket              enabled enabled
$ systemctl status --user ssh-agent.service
● ssh-agent.service - SSH Agent
     Loaded: loaded (/etc/systemd/user/ssh-agent.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-09-13 13:31:15 UTC; 5min ago
    Process: 288 ExecStartPre=/nix/store/k71apxkm38m3g34k01sb6zhysi0y7gph-coreutils-9.5/bin/rm -f /run/user/1000/ssh-agent (code=exited, status=0/SUCCESS)
    Process: 290 ExecStart=/nix/store/78mv13w9mgh0s0rd7rnr6ff4d7a39bpd-openssh-9.7p1/bin/ssh-agent -a /run/user/1000/ssh-agent (code=exited, status=0/SUCCESS)
   Main PID: 297 (ssh-agent)
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/ssh-agent.service
             └─297 /nix/store/78mv13w9mgh0s0rd7rnr6ff4d7a39bpd-openssh-9.7p1/bin/ssh-agent -a /run/user/1000/ssh-agent

Sep 13 13:31:15 calavera systemd[272]: Starting SSH Agent...
Sep 13 13:31:15 calavera systemd[272]: Started SSH Agent.
> wsl --version
WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.22631.4112

@xieve

xieve commented Sep 14, 2024

Is this still relevant? I just performed a fresh installation of NixOS-WSL with all the default settings (in particular the native systemd integration), and user scope is working without issues: [...]

Yes, this is still relevant (see open issues on WSL). This is a race condition that depends on how long systemd takes to fully start up, which is shorter on fast, lightweight systems.

xieve added a commit to xieve/dotfiles that referenced this issue Sep 24, 2024
nix-community/NixOS-WSL#375
present in upstream wsl
when start wsl and immediately dropping into a shell
- sudo doesn't work
- systemctl --user doesn't work

possibly more. these are "fixed"
@go-colin

go-colin commented Oct 11, 2024

My workaround is to set this as the shell command in my terminal (run as root): [...]

Thank you for this. I guess this was the root of the problem I've been running into since I updated WSL. I was 95% of the way to restoring my environment, and this was the missing link! It was annoying to have to sudo things that I didn't need to, and it was breaking a lot of integrations with VS Code, dev flakes, Docker, etc.

A little annoying to have to run those 3 commands every time I fire up wsl, but it's better than the alternative of disrupting my overall workflow.

🙏

Now if only I could resolve this read-only file system error when running nixos-rebuild switch. But that's fine; nixos-rebuild boot and terminating WSL isn't too painful, since I'm not updating often.

------- edit:

I've resorted to disabling nativeSystemd for now as it's just less painful at the moment.
This does seem to likely be an upstream issue with the latest WSL updates.

@lucdew

lucdew commented Oct 14, 2024

I also face the same issue on WSL.
It disappears if I disable native systemd (wsl.nativeSystemd = false), but then startup takes 40 s on my 13th-gen i7 with an SSD, 32 GB of RAM, and a .wslconfig reserving half of the cores and memory.

My wsl version:

WSL version: 2.3.24.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.65
MSRDC version: 1.2.5620
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22631.4317

xieve's workaround works for me, except that it slows down WSL startup while it waits for the system D-Bus socket to be created. It takes as long as starting with native systemd support disabled.
I ran the script with:

wsl -d NixOS -u root  /run/current-system/sw/bin/zsh ... exec sudo --preserve-env=DBUS_SESSION_BUS_ADDRESS --user mysuser zsh"

Another way to work around the issue is to enable native systemd and also enable systemd lingering for the user in the NixOS machine configuration:

systemd.tmpfiles.rules = [
  "f /var/lib/systemd/linger/myusername"
];

Then, when I log in quickly as the user, the user's systemd services are not started yet, but if I wait a couple of seconds more, the user systemd instance eventually starts.

systemctl --user status
nixos
    State: running 
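
To confirm that the tmpfiles rule above actually took effect, one option is to check for the marker file it creates. This is a minimal sketch assuming the /var/lib/systemd/linger path from the rule; linger_enabled is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a linger marker file exists for USER.
# Usage: linger_enabled [USER] [LINGER_DIR]
# Assumption: markers live in /var/lib/systemd/linger, as in the rule above.
linger_enabled() {
    linger_user="${1:-$(id -un)}"
    linger_dir="${2:-/var/lib/systemd/linger}"
    [ -e "$linger_dir/$linger_user" ]
}
```

For example, linger_enabled myusername && echo "lingering is on". loginctl show-user myusername --property=Linger should report the same state as seen by logind, provided logind is already reachable at that point.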

Edit1:
Another drawback of enabling systemd lingering for the user is that, when logged in as the user, the sudo command is broken:

sudo: effective uid is not 0, is /run/wrappers/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?

This is because the symlink /run/wrappers/bin, which is in the PATH, points to a directory that has not been created yet.
That directory contains the setuid executables, such as sudo.
In my case it is a matter of 1 to 3 seconds. So I either have to start another shell (zsh in my case) or run the rehash command in my zsh shell. It is also possible to configure zsh to always rehash on completion.
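
Before starting a new shell or rehashing, it can help to check whether the wrapper is in place yet. The sketch below assumes the /run/wrappers/bin/sudo path from the error message and that a working wrapper is an executable file with the setuid bit set; sudo_wrapper_ready is a hypothetical helper name.

```shell
# Hypothetical helper: succeed once the sudo wrapper exists, is executable,
# and carries the setuid bit. Usage: sudo_wrapper_ready [PATH]
sudo_wrapper_ready() {
    wrapper="${1:-/run/wrappers/bin/sudo}"
    # -u is true only for an existing file with the setuid bit set.
    [ -u "$wrapper" ] && [ -x "$wrapper" ]
}
```

For example, sudo_wrapper_ready || echo "wrappers not ready yet, retry in a few seconds".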

@doronbehar

Thanks to everyone for the investigation efforts. I too encountered the mentioned issue:

sudo: effective uid is not 0, is /run/wrappers/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?

Just wanted to share my TL;DR, which is to rehash (using zsh). Most of the time, though, WSL will already be running in the background, and this won't be needed. On top of that, I usually get into a tmux session right afterwards, which gives the wrappers enough time to load, so I shouldn't notice this issue.

@ruizlenato

Same here
