Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use XDG_RUNTIME_DIR #1129

Closed
totaam opened this issue Feb 18, 2016 · 51 comments
Closed

Use XDG_RUNTIME_DIR #1129

totaam opened this issue Feb 18, 2016 · 51 comments

Comments

@totaam
Copy link
Collaborator

totaam commented Feb 18, 2016

Issue migrated from trac ticket # 1129

component: platforms | priority: blocker | resolution: fixed

2016-02-18 20:59:12: urzds created the issue


Currently, xpra defaults to ~/.xpra for several paths related to sockets, logfiles (1) and temporary scripts (2).

The XDG specification (3) says:
$XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.

This seems to be the ideal location for these files.

In fact, Fedora already bends the paths to point to that location (4), (5), though currently literally instead of by using $XDG_RUNTIME_DIR (6).

  • (1): [http://xpra.org/trac/browser/xpra/tags/v0.16.x/src/xpra/platform/paths.py#L62]
  • (2): [http://xpra.org/trac/browser/xpra/tags/v0.16.x/src/xpra/scripts/server.py#L150]
  • (3): [https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html]
  • (4): [https://apps.fedoraproject.org/packages/xpra/sources/]
  • (5): [https://pkgs.fedoraproject.org/cgit/rpms/xpra.git/tree/xpra-0.16.0-move-to-var-run.patch]
  • (6): [https://bugzilla.redhat.com/show_bug.cgi?id=1309872]
@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 03:01:51: antoine changed status from new to assigned

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 03:01:51: antoine edited the issue description

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 03:01:51: antoine commented


The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963.
The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 03:03:32: antoine edited the issue description

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 08:55:11: jonathan.underwood commented


Replying to [comment:1 antoine]:

The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963.

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.

The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

OK, this was only done on the rawhide branch to see what broke :). However, I think this does need some consideration moving forward - having the run script in the home directory is fragile on NFS home directory mounted systems.

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.

Hm. OK, this could be achieved with a bit of seddism after building the config file, or some judicious use of quoting during building. Will look into it.

The reporter (urzds) seems to be doing some intensive testing using xpra in minimal containers (judging by the Fedora bugs filed), which is really great - this is a great opportunity to make this use case solid.

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 09:48:51: antoine changed status from assigned to new

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 09:48:51: antoine changed owner from antoine to jonathan.underwood

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 09:48:51: antoine commented


Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.
[[BR]]
None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though.
BTW: you own #963 ;)
[[BR]]
having the run script in the home directory is fragile on NFS home directory mounted systems.
[[BR]]
Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server)
If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere.
Another difficulty here is that this needs to work with all shells. Any ideas?

[[BR]]
r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs.
Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves)
We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.

[[BR]]

... xpra in minimal containers..
[[BR]]
Sounds like a great use case, send bug reports this way!

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 11:59:18: jonathan.underwood commented


Replying to [comment:4 antoine]:

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.
[[BR]]
None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though.
BTW: you own #963 ;)

Ah, yes. Have been bogged down with $dayjob, but will get to that, thansk for the poke.

[[BR]]

having the run script in the home directory is fragile on NFS home directory mounted systems.
[[BR]]
Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server)
If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere.
Another difficulty here is that this needs to work with all shells. Any ideas?

Something like this?

xpra initenv;~/.xpra/run-xpra _proxy || $XDG_RUNTIME_DIR/run-xpra _proxy

I think conditional execution is posix compliant, so I think that's portable.

So, in paths.py we could have do_get_script_bin_dir() return a list of places to look for the run_xpra script, and then compose the ssh line from that list, with !|| as the separator.

This seems too simple, I am sure I am missing a bunch of complexity :)

[[BR]]
r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs.
Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves)
We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.

[[BR]]

... xpra in minimal containers..
[[BR]]
Sounds like a great use case, send bug reports this way!

So far, most of the bugs have been me missing some Requires for the Fedora package :).

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 16:32:55: antoine changed status from new to assigned

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 16:32:55: antoine changed owner from jonathan.underwood to antoine

@totaam
Copy link
Collaborator Author

totaam commented Feb 19, 2016

2016-02-19 16:32:55: antoine commented


I think conditional execution is posix compliant, so I think that's portable.
[[BR]]
I think I had considered that and decided against it. But now that I look at it again, I can't see why this wouldn't work!

Let me play with it.

@totaam
Copy link
Collaborator Author

totaam commented Feb 21, 2016

2016-02-21 07:28:13: antoine changed status from assigned to new

@totaam
Copy link
Collaborator Author

totaam commented Feb 21, 2016

2016-02-21 07:28:13: antoine changed owner from antoine to urzds

@totaam
Copy link
Collaborator Author

totaam commented Feb 21, 2016

2016-02-21 07:28:13: antoine commented


Done in r12005 (+fixup in r12006) - please see commit message for more details. I've tried to do this as cleanly as I can, which also means this is not 0.16 backport material..

It would be a little bit difficult to test because by default we run "xpra initenv", which re-creates all the scripts we may need. To workaround that, use XPRA_INITENV_COMMAND="" xpra attach ssh:... to skip this step.

If the scripts we try to run on the remote end do not exist (intentionally removed for testing, or just not present when connecting to an older xpra version), you get this sort of message on stderr (slightly ugly, but we can live with it):

bash: /run/user/1000/username/run-xpra: No such file or directory
bash: /home/username/.xpra/run-xpra: No such file or directory

At some point in the future, we can drop compatibility with older versions and remove "~/.xpra" from the list of directories where we save the script / try to run from.
Until then, failure to create or run a script from this location is no longer fatal.
(the same approach applies to socket-dir, see #963)

Note: we could actually drop the call to "xpra initenv" at the same time: the new remote command line will also try to run just a plain "xpra" (from the PATH), so this is now a little bit redundant.

@urzds: does that work for you?

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 02:52:04: antoine changed priority from major to blocker

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 02:52:04: antoine commented


We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do?
If this directory cannot be relied upon, maybe I need to undo some of those changes?

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: [https://bugzilla.redhat.com/show_bug.cgi?id=961235] and [https://bugzilla.redhat.com/show_bug.cgi?id=967509].
The claim is that su -l should create the directory, but this does not work on Fedora 23.

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 12:58:03: jonathan.underwood commented


Well, this problem is actually more general, and related to a problem I was trying to understand with xpra and Fedora: if I start an xpra session from within a gnome session, putting the socket etc under /run/user/$UID, and then log out of the gnome session, the xpra Xdummy process ends up bring killed (though the xpra process itself remains, oddly). The reason for that is that logging into gnome creates a new cgroup, and process started within gnome are started within that cgroup. On logging out, the cgroup is torn down. By default, it shouldn't really be killing off the Xdummy process, but for some reason it does. I haven't got to the bottom of what's going on here. But it's related to this issue of what to do if /run/user/$UID isn't there - that directory is created when systemd is asked to fire up a new cgroup.

What I think xpra should really be doing is starting itself as a systemd managed service. That would create the directory, and also mean the process could be nicely isolated in its own cgroup. On systems where systemd isn't being used, libcg and related tools could be used instead.

I realize this is a very incomplete picture... I am still reading about all this to try and understand it, and where the integration points lie. But, this would be a really nice thing to get working.

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 22:38:59: jonathan.underwood commented


Replying to [comment:9 antoine]:

We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

Just to clarify: /run/user/$UID is created by the pam_systemd PAM module on user login. So, the only case where that directory wouldn't exist would be when the user trying to start xpra isn't logged in, and so doesn't have an active login session. I am struggling a bit to work out when that could actually happen.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do?
If this directory cannot be relied upon, maybe I need to undo some of those changes?

That does seem to be a su bug - I am not sure why that's going around the pam stack.

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: [https://bugzilla.redhat.com/show_bug.cgi?id=961235] and [https://bugzilla.redhat.com/show_bug.cgi?id=967509].
The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 22:51:21: jonathan.underwood commented


Replying to [comment:11 jonathan.underwood]:
[snip]

The only slightly related ticket that I found are: [https://bugzilla.redhat.com/show_bug.cgi?id=961235] and [https://bugzilla.redhat.com/show_bug.cgi?id=967509].
The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...

Addendum: actually no it's not working on F23. This is a nasty bug of su.

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 22:57:57: jonathan.underwood commented


Bugzilla entry opened against su:

https://bugzilla.redhat.com/show_bug.cgi?id=1317304

@totaam
Copy link
Collaborator Author

totaam commented Mar 13, 2016

2016-03-13 23:10:55: jonathan.underwood commented


Actually, after reading this:

https://bugzilla.redhat.com/show_bug.cgi?id=753882#c29

I suspect this won't get fixed. Bottom line is that su --login (or su -l or su -) isn't actually designed to behave as though the new user was logging in. The recommended way would be ssh localhost or some variant thereof.

I suppose xpra could grow a command line option to specify the user to run as. But then xpra would also have to get into the business of logging a user in, and starting a session etc.

@totaam
Copy link
Collaborator Author

totaam commented Mar 14, 2016

2016-03-14 13:15:04: antoine commented


.. isn't actually designed to behave as though the new user was logging in...
[[BR]]
What a mess. The man pages reads Start the shell as a login shell with an environment similar to a real login. One would expect this environment to actually be usable...
Requiring an ssh server to get a directory created is beyond bizarre.

Whatever the justification is, this means that the new XDG_RUNTIME_DIR is useless for a large number of valid use cases.
End users will not care that $XYZ is the right way of doing something, they will just rightfully complain that something that used to work just fine no longer does.

So I am tempted to undo this change in the default config, those who really insist on using this location can make the change themselves and deal with the fallout.

@totaam
Copy link
Collaborator Author

totaam commented Mar 14, 2016

2016-03-14 13:34:22: jonathan.underwood commented


Yeah. It is a mess. Why not check to see if $XDG_RUNTIME_DIR is set, and if not, fall back to using the home directory?

Of relevance to this whole issue:

https://tlhp.cf/lennart-poettering-su/

@totaam
Copy link
Collaborator Author

totaam commented Mar 14, 2016

2016-03-14 13:40:43: antoine commented


Why not check to see if $XDG_RUNTIME_DIR is set
[[BR]]
I believe we now continue after printing a warning when we fail to create a socket in one of the locations specified in the config file (or command line), handling for the server log files would require more ugly code.

And we can't even do the fallback for the Xorg command line, because it is generated in the config file, and the vfb will just fail to start.
That is, unless we substitute the log location in that string at runtime. More code, more chances of breakage. More documentation to update, more bug reports to deal with. Sigh.

@totaam
Copy link
Collaborator Author

totaam commented Mar 31, 2016

2016-03-31 16:09:17: antoine changed status from new to closed

@totaam
Copy link
Collaborator Author

totaam commented Mar 31, 2016

2016-03-31 16:09:17: antoine set resolution to wontfix

@totaam
Copy link
Collaborator Author

totaam commented Mar 31, 2016

2016-03-31 16:09:17: antoine commented


As per the comments above, this directory is not usable so we won't try to use it.

r12297 disables all this code, nice waste of my time writing it.

@totaam
Copy link
Collaborator Author

totaam commented Apr 22, 2016

2016-04-22 19:22:50: urzds commented


@urzds: does that work for you?

This appears to work with 12469 on Fedora Rawhide, as can be tested with the Docker image quay.io/urzds/xpra-test:v0.17-pre12469-fedora-1, which is built from these layered repositories:

As a hack I made use of Xpra automatically creating the ~/.xpra directory and set XDG_RUNTIME_DIR=$HOME/.xpra, to make sure everything ends up in the same spot without relying on any system directories.

@totaam
Copy link
Collaborator Author

totaam commented May 2, 2016

2016-05-02 09:03:21: antoine changed status from closed to reopened

@totaam
Copy link
Collaborator Author

totaam commented Aug 14, 2016

2016-08-14 15:01:06: antoine commented


  • as of r13343, we enable /var/run/user/$UID/xpra for storing the unix domain sockets (as well as the other legacy paths, see r13342)
  • r13344 does the same for the run-xpra script and log files (both Xorg's and xpra's)
    This will help with the selinux policy, see SELinux policy for printing #815.

@totaam
Copy link
Collaborator Author

totaam commented Aug 16, 2016

2016-08-16 14:12:15: antoine changed status from reopened to new

@totaam
Copy link
Collaborator Author

totaam commented Aug 16, 2016

2016-08-16 14:12:15: antoine changed owner from urzds to smo

@totaam
Copy link
Collaborator Author

totaam commented Aug 16, 2016

2016-08-16 14:12:15: antoine commented


Gah this is painful.

Platform testing notes:

  • centos<7, debian jessie, ubuntu 14.04 and earlier, etc don't have /var/run/user so we use the old defaults (".xpra")
  • centos 7, debian stretch, fedora 23, ubuntu 16.04, etc have /var/run/user but fail to create the $UID directory if you use "su --login", it does get created with an ssh login or regular interactive login
  • fedora>=24 is the only distro that works OK (directory is created with su as well as interactive / ssh logins)

So:

  • r13367 adds patches to disable XDG_RUNTIME_DIR and tweak the selinux policy to allow access to the socket in "~/.xpra"
  • r13369 enables those 2 patches for all distros except for Fedora >= 24

@smo: ready for testing.

@totaam
Copy link
Collaborator Author

totaam commented Aug 17, 2016

2016-08-17 10:15:18: antoine commented


See also #1105#comment:8 about pam_systemd: If it does not exist yet, the user runtime directory /run/user/$USER is created and its ownership changed to the user that is logging in

@totaam
Copy link
Collaborator Author

totaam commented Sep 13, 2016

2016-09-13 04:45:31: antoine changed status from new to assigned

@totaam
Copy link
Collaborator Author

totaam commented Sep 13, 2016

2016-09-13 04:45:31: antoine changed owner from smo to antoine

@totaam
Copy link
Collaborator Author

totaam commented Sep 13, 2016

2016-09-13 04:45:31: antoine commented


As reported in #1303#comment:2, XDG_RUNTIME_DIR is not set and not created if you su - USERNAME from root... but it is set if you sudo su - THATSAMEUSERNAME.
What an awful mess.

@totaam
Copy link
Collaborator Author

totaam commented Sep 13, 2016

2016-09-13 12:16:28: antoine commented


And it just keeps getting worse: XAUTHORITY now lives in /run/user/$UID/gdm/Xauthority if you login via gdm... which means that we cannot access a session started from a GUI session (I assume other dm place their xauth file in similar locations) without overriding the xauth settings since we use the "standard" xauth file $HOME/.Xauthority, ie just to launch an xterm:

DISPLAY=:100 XAUTHORITY=$HOME/.Xauthority xterm

And since we have no way of knowing which xauthority path is the right one to use, this looks unfixable. What an epic clusterfsck.

@totaam
Copy link
Collaborator Author

totaam commented Oct 9, 2016

2016-10-09 15:54:10: antoine changed status from assigned to closed

@totaam
Copy link
Collaborator Author

totaam commented Oct 9, 2016

2016-10-09 15:54:10: antoine set resolution to fixed

@totaam
Copy link
Collaborator Author

totaam commented Oct 9, 2016

2016-10-09 15:54:10: antoine commented


Some updates: #888#comment:60.

We now use XDG_RUNTIME_DIR if it is present for sockets, log files and the "run-xpra" script, and we still use the older default as fallback.

@totaam totaam closed this as completed Oct 9, 2016
@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 08:56:36: antoine commented


More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments
The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.

@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 09:35:07: jonathan.underwood commented


Replying to [comment:28 Antoine Martin]:

More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments
The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.

Don't have a centos7 box to hand right now, but is it possible to do a "machinectl shell"? Upstream systemd proposes that as the preferred approach to getting a new shell with the correct environment. At this point there seems to be little hope of su being fixed to work as you/I would have expected.

@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 09:36:33: antoine commented


but is it possible to do a "machinectl shell"?
This is not supported on centos7 either! (no such subcommand)

@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 09:43:02: jonathan.underwood commented


Replying to [comment:30 Antoine Martin]:

but is it possible to do a "machinectl shell"?
This is not supported on centos7 either! (no such subcommand)

That is miserable. Little option but to have a fullback code path for when the XDG_* directories are absent.

I really wish the systemd devs had consulted more thoroughly before implementing the current design, and ensured tools like su - continued to work.

@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 10:10:01: antoine commented


Little option but to have a fullback code path for when the XDG_* directories are absent.
We have this already: in this case, we log a warning and we create in the legacy locations only (ie: $HOME/.xpra)

@totaam
Copy link
Collaborator Author

totaam commented Jun 27, 2018

2018-06-27 10:27:45: antoine commented


See also #1723

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant