Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with run files for nginx start with system start #1372

Closed
trenta opened this issue Feb 5, 2020 · 17 comments · Fixed by #1478
Closed

Issue with run files for nginx start with system start #1372

trenta opened this issue Feb 5, 2020 · 17 comments · Fixed by #1478

Comments

@trenta
Copy link

trenta commented Feb 5, 2020

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 6.12.0
  • Ruby:
  • Distribution: Ubuntu 18.04.4
  • Module version: 1.1.0

What are you seeing

The latest upgrade to this module may have broken the nginx service startup process when rebooting server. After a server reboot the nginx status reads as
nginx.service - nginx - high performance web server Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled) Active: failed (Result: exit-code) since Wed 2020-02-05 15:17:34 AEDT; 7min ago Docs: http://nginx.org/en/docs/ Process: 664 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=1/FAILURE)

Feb 05 15:17:33 jamfp02 systemd[1]: Starting nginx - high performance web server... Feb 05 15:17:34 jamfp02 nginx[664]: nginx: [emerg] mkdir() "/run/nginx/client_body_temp" failed (2: No such file or directory) Feb 05 15:17:34 jamfp02 systemd[1]: nginx.service: Control process exited, code=exited status=1 Feb 05 15:17:34 jamfp02 systemd[1]: nginx.service: Failed with result 'exit-code'. Feb 05 15:17:34 jamfp02 systemd[1]: Failed to start nginx - high performance web server.

When puppet runs it fixes things and this is the output.
Info: Using configured environment 'production' Info: Retrieving pluginfacts Info: Retrieving plugin Info: Retrieving locales Info: Loading facts Info: Caching catalog for jamfp02.svi.edu.au Info: Applying configuration version '1580876853' Info: /Stage[main]/Ntp::Service/Service[ntp]: Unscheduling refresh on Service[ntp] Notice: /Stage[main]/Nginx::Config/File[/run/nginx]/ensure: created (corrective) Notice: /Stage[main]/Nginx::Config/File[/run/nginx/client_body_temp]/ensure: created (corrective) Notice: /Stage[main]/Nginx::Config/File[/run/nginx/proxy_temp]/ensure: created (corrective) Info: Class[Nginx::Config]: Scheduling refresh of Class[Nginx::Service] Info: Class[Nginx::Service]: Scheduling refresh of Service[nginx] Notice: /Stage[main]/Nginx::Service/Service[nginx]/ensure: ensure changed 'stopped' to 'running' (corrective) Info: /Stage[main]/Nginx::Service/Service[nginx]: Unscheduling refresh on Service[nginx]

What behaviour did you expect instead

I expected the nginx process to start at startup before a puppet run corrects things.

@lampapetrol
Copy link

Related PR #1352 (comment)

@jonemo
Copy link

jonemo commented Apr 9, 2020

The discussion of this seems to have mostly happened over in #1352. I think it belongs here so I'll try to summarize the points from there.

Prior Discussion

  1. Debian run_dir should be in /var/run/nginx #1352 moved the "run dir" from /var/nginx to /run/nginx. The author @anarcat points out that "this is important because /var is a real, on-disk, filesystem while [...] /run [...] is a tmpfs, and therefore much faster." This change got released in version 1.1.0 in January 2020. (Note that the PR description in Debian run_dir should be in /var/run/nginx #1352 is outdated and references /var/run/nginx.)
  2. @anarcat was the first to report the error on system (re)start "because /run/nginx doesn't exist on boot" in Debian run_dir should be in /var/run/nginx #1352 (comment) and filed a corresponding bug report against the Nginx Debian package. Arguably, it is correct that Nginx should take care of creating its own folders, but that doesn't help anyone who finds themselves with a broken server after Debian run_dir should be in /var/run/nginx #1352.
  3. The suggested workaround got some discussion, see below.
  4. @trenta opened this issue, and myself and @mgurjanov-cti also reported observing the issue over in Debian run_dir should be in /var/run/nginx #1352.

Workaround 1

#1352 (comment) contains a suggested workaround, namely to add two lines to /etc/systemd/system/nginx.service.d/runtime.conf:

[Service]
RuntimeDirectory=nginx

In #1352 (comment), @smortex points out that this will result in a non-converging configuration where Puppet will "fight" the system.

I would add that the existence and location of /etc/systemd/system/nginx.service.d/runtime.conf cannot be assumed because its existence and location depend on the specific version and configuration of the system. For example, I am currently looking at a Ubuntu server where the equivalent path is /lib/systemd/system/nginx.service. I haven't look deeply into this, but I assume that this file gets created as a result of https://github.com/voxpupuli/puppet-nginx/blob/master/manifests/service.pp#L39-L44, and it's not obvious to me how to update the manifest in order to add those extra lines.

Workaround 2

Use version 1.1.0 and undo the effect of #1352:

  file { '/var/nginx' :
    ensure => 'directory',
    group  => 'root',
    mode   => '0644',
  }

class { '::nginx':
    client_body_temp_path => '/var/nginx/client_body_temp',
    proxy_temp_path       => '/var/nginx/proxy_temp',
    require               => File['/var/nginx'],
  }

Possible Solutions

  1. Translate the workaround suggested by @anarcat in Debian run_dir should be in /var/run/nginx #1352 (comment) into a PR.
  2. Wait for the Debian Nginx package to be updated in response to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945551.
  3. Revert Debian run_dir should be in /var/run/nginx #1352.
  4. Other ideas?

I'd be happy to help with a solution but I could use some guidance from maintainers as to what's the preferred path forward. Personally, I don't know how to do (1), (2) means this package will be broken until an unknown date, and (3) would mean stepping on @anarcat's toes pretty hard which I with my grand total of zero commits to this project certainly have no authority or desire to do.

@anarcat
Copy link
Contributor

anarcat commented Apr 9, 2020

awesome, thanks for summarizing this so accurately! :) i do agree it would be too bad to revert #1352...

we don't we just a systemd override for now (1), that seems the simplest... we don't depend on the systemd module yet, but that would not be a big hurdle to cross.

i don't think (2) is really an option in the short term - it will take time to update the debian package...

@jonemo
Copy link

jonemo commented Apr 19, 2020

[why] don't we just a systemd override for now (1), that seems the simplest... we don't depend on the systemd module yet, but that would not be a big hurdle to cross.

@anarcat Can you elaborate a little more on how this would work? A few specific questions I was wondering about while thinking this through:

  • I assume you are referring to camptocamp/systemd? Add that to metadata.json's dependencies? Then overwrite the entire default config file with something like this?
    systemd::unit_file { $service_name:
      source => "puppet:///modules/${module_name}/nginx.service",
    }
    ~> service { $service_name:
      ensure => 'running',
    }
  • Since this package advertises more supported operating systems than camptocamp/systemd, we can't just use it globally. Is your plan to add a special case for Debian/Ubuntu in this case statement: https://github.com/voxpupuli/puppet-nginx/blob/master/manifests/service.pp#L28
  • I know that some users are very thorough with vetting dependencies, is there any additional work involved in documenting this additional dependency? What kind of version bump would this require?

@anarcat
Copy link
Contributor

anarcat commented Apr 22, 2020

I know that some users are very thorough with vetting dependencies, is there any additional work involved in documenting this additional dependency? What kind of version bump would this require?

You're right this might be too invasive. I'm starting to be convinced this could be rolled back and pushed into my profile instead.

Feel free to revert, but i will note that the previous value for this was /var/nginx and i doubt that would have worked any better in the long term. :p

@janit42
Copy link

janit42 commented Jun 7, 2020

[why] don't we just a systemd override for now (1), that seems the simplest... we don't depend on the systemd module yet, but that would not be a big hurdle to cross.

Not sure whether I'm telling things everybody is already familiar with, but in case not:

systemd scans a lot of directories for unit configurations, see man systemd.unit.

a systemd override in a directory /etc/systemd/system/<unit-name>.<unit-type>.d/<meaningful_name>.conf is a feature of systemd resulting in the settings inside <meaningful_name>.conf overriding values living in other systemd config files for <unit-name>.<unit-type>, e.g. in the os distributed startup files. A directory and file like this would be created by the command systemctl edit <unit-name>.<unit-type>

If the meaningful_name is chosen carefully, this is an as-subtile-as-possible fix imho with a low chance to mess up existing configurations.

I think (but am not completely sure) /etc/systemd/system can be assumed to exist on every systemd installation. <unit-name>.<unit-type>.d/<meaningfule_name>.conf would NOT be distributed by the os but may have been created by the admin of the system.

So the module needs to ensure the directory nginx.service.d in addition to a file with a meaningful name like puppet_nginx_fix_runtime.conf. The file could even be prefixed with a number resulting in something like nginx.service.d/30_puppet_nginx_fix_runtime.conf which would allow a local admin to overwrite the changes done by the module with a fix of her own in a file nginx.service.d/31_undo_puppet_nginx_nonsense.conf.

@anarcat Can you elaborate a little more on how this would work? A few specific questions I was wondering about while thinking this through:

* I assume you are referring to [camptocamp/systemd](https://forge.puppet.com/camptocamp/systemd)? Add that to `metadata.json`'s `dependencies`? Then overwrite the entire default config file with something like this?
  ```
  systemd::unit_file { $service_name:
    source => "puppet:///modules/${module_name}/nginx.service",
  }
  ~> service { $service_name:
    ensure => 'running',
  }
  ```

* Since this package advertises more supported operating systems than camptocamp/systemd, we can't just use it globally. Is your plan to add a special case for Debian/Ubuntu in this `case` statement: https://github.com/voxpupuli/puppet-nginx/blob/master/manifests/service.pp#L28

* I know that some users are very thorough with vetting dependencies, is there any additional work involved in documenting this additional dependency? What kind of version bump would this require?

I don't think it's necessary to use the camptocamp/systemd module (and risk dependency problems) for this as we would only ensure one directory and one file (you could even make the filename (not the dirname) configurable via a param, or allow to disable creating the systemd override). I agree overwriting the whole systemd config for nginx might be a bit invasive, but doing it by the override as proposed by @anarcat would leave the rest of the ngix config alone and fix only the reboot bug.

It might be a good idea to avoid the fight over file permissions pointed out by @smortex in his 1352 comment by checking whether the module could switch to a 0700 umask for the /run/nginx/client_body_temp and /run/nginx/proxy_temp directories.

@danmanners
Copy link

Running into this too, adding the RuntimeDirectory=nginx worked perfectly.

At least in my situation, I'm running Bolt not full blown puppet, so I'm not worried about Puppet 'fighting' the config.

darkstar42 added a commit to gernox/puppet-profile that referenced this issue Jul 14, 2020
rmc47 added a commit to Cambridge105/puppet that referenced this issue Aug 15, 2020
mergwyn added a commit to mergwyn/control-repo that referenced this issue Dec 9, 2020
@nvanheuverzwijn
Copy link

nvanheuverzwijn commented Dec 11, 2020

This also work if you still want to be in tmpfs /run and not change systemd or go back to /var.

client_body_temp_path     => "/run/nginx_client_body_temp",
proxy_temp_path           => "/run/nginx_proxy_temp",

@SebastianJ91
Copy link

I would propose to revert the change #1352 because of following reasons:

  • Nginx is already caching client bodys which doens't exceed client_body_buffer_size in RAM
  • Writing to disk is the second stage if the configured limit is exceeded.
  • Putting both cache stages together bypasses the client_body_buffer_size as everything is in RAM now.
  • This makes DDoS attacks a lot easier, as normaly RAM size is a lot less then disk space.

@anarcat
Copy link
Contributor

anarcat commented Mar 1, 2021

FWIW, #1352 didn't just move from /var/run to /run, it moved from /var/ to /run!! That is a significant difference, as this module was totally broken on Debian before that patch. So just don't revert #1352 blindly and at least move to /var/run, not /var.

So I think that, after @SebastianJ91's comment, i'd be okay with a patch changing the path (again), but please don't put it in /var/nginx, that would be silly.

@jeffmccune
Copy link
Contributor

@anarcat FYI, /var/run -> /run

❯ cat /etc/debian_version
10.9
❯ ls -l /var/run
lrwxrwxrwx 1 root root 4 Mar 31 18:43 /var/run -> /run

A systemd drop-in is a clean way to work around the problem:

  file { '/etc/systemd/system/nginx.service.d/override.conf':
    ensure  => file,
    owner   => 0,
    group   => 0,
    mode    => '0644',
    content => "[Service]\nRuntimeDirectory=nginx\n",
    notify  => [
      Exec['daemon-reload'],
      Service['nginx'],
    ]
  }

@tobixen
Copy link
Contributor

tobixen commented May 12, 2021

I should have looked into the open issues first ... anyway, here is my workaround, in hiera:

nginx::run_dir: /tmp/
nginx::client_body_temp_path: /tmp/nginx-client_body_temp
nginx::proxy_temp_path: /tmp/nginx-proxy_temp_path

Because temp data should probably be in /tmp?

It would be nicer to have /tmp/nginx as the run-dir instead of /tmp - but as some systems have /tmp set up as a tmpfs, it would be the same problem as with /run.

The run-dir construct is a bit weird, it is only used for setting the default values for nginx::client_body_temp_path and nginx::proxy_temp_path (and to me those looks more fitting to belong under a tmp_dir than a run_dir), but setting nginx::run_dir does not help, those two variables have the default value rooted in nginx::params::run_dir rather than nginx::run_dir.

@kenyon
Copy link
Member

kenyon commented Oct 27, 2021

I just ran into this too. 😒

This is a particularly nasty bug because you don’t know that there’s a problem until your first reboot, and your nginx doesn’t start.

I'm using the systemd dropin file workaround in my nginx profile:

systemd::dropin_file { 'runtimedirectory.conf':
  unit    => 'nginx.service',
  content => @(EOT),
    # Managed by Puppet
    [Service]
    RuntimeDirectory=nginx
    | EOT
}

Can we agree on this as a fix?

@tobixen
Copy link
Contributor

tobixen commented Oct 27, 2021

I have eventually concluded that the best fix (for this and multiple other problems) is to use package/service pattern and avoid using voxpupuli-nginx. :-(

b4ldr added a commit that referenced this issue Oct 27, 2021
…tional

The [general description of `/run/`](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html) is fairly vague and not in the original spec however my high level reading is that it relates more to small files which are used for service orchestration and not data (temporary or otherwise) which is specific to the programs operations for the later /var/lib which states the following seems more appropriate:

> Furthermore, this hierarchy holds state information pertaining to an application ....  State information is data that programs modify while they run, and that pertains to one specific host.

https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#var

And as we see from the [debian packages](https://salsa.debian.org/nginx-team/nginx/-/blob/master/debian/rules#L61-63) this is the route they have gone (and likely don't consider this a bug).  although it is worth noting they only override this only for the full package, the light package keeps the default which ends up being `--prefix` and thus `/usr/share/nginx/client_temp/`.

Of course there is a performance argument to make that says theses files should be in a ramdisk, however its also possible that this dir could become very large and putting it in a ramdisk could cause swapping or other system issues as such i think this decision should be left to the operator with sane defaults provided by the package maintain

As run_dir is only ever used for setting `proxy_temp_path` and `client_body_temp_path` i suggest that we completely drop run_dir and make `client_body_temp_path` and `proxy_temp_path` optional, thus ensuring the configuration uses the distro configured defaults.  if we don't want to use the package defaults then we should ensure we ensure things work correctly annd in this case i think using the systemd dropin would make the most senses

Fixes #1372
@b4ldr
Copy link
Member

b4ldr commented Oct 27, 2021

tl;dr i think by default we should set proxy_temp_path and client_body_temp_path to undef and let the defaults chosen by the package maintainers take precedence #1478

i just came across this and i think the main issue here is the assumption that proxy_temp_path and client_body_temp_path should be in some directory under /run/ however i don't think this is a valid assumption and it is not the decision that the upstream debian package has made.

now the general description of /run/ is fairly vague and not in the original spec however my high level reading is that it relates more to small files which are used for service orchestration and not data (temporary or otherwise) which is specific to the programs operations for the later /var/lib which states the following seems more appropriate:

Furthermore, this hierarchy holds state information pertaining to an application .... State information is data that programs modify while they run, and that pertains to one specific host.

https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#var

And as we see from the debian packages this is the route they have gone (and likely don't consider this a bug). although it is worth noting they only override this only for the full package, the light package keeps the default which ends up being --prefix and thus /usr/share/nginx/client_temp/.

Of course there is a performance argument to make that says theses files should be in a ramdisk, however its also possible that this dir could become very large and putting it in a ramdisk could cause swapping or other system issues as such i think this decision should be left to the operator with sane defaults provided by the package maintain

As run_dir is only ever used for setting proxy_temp_path and client_body_temp_path i suggest that we completely drop run_dir and make client_body_temp_path and proxy_temp_path optional, thus ensuring the configuration uses the distro configured defaults. if we don't want to use the package defaults then we should ensure we ensure things work correctly annd in this case i think using the systemd dropin would make the most senses

@ekohl
Copy link
Member

ekohl commented Oct 27, 2021

I would be in favor of that. At least in the default nginx config on Debian 10 & 11 and CentOS 7 & 8 (via EPEL) these options are not present at all. Making them optional would move it closer to the distro default packaging which I generally feel like is a sane default.

b4ldr added a commit that referenced this issue Oct 27, 2021
…tional

The [general description of `/run/`](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html) is fairly vague and not in the original spec however my high level reading is that it relates more to small files which are used for service orchestration and not data (temporary or otherwise) which is specific to the programs operations for the later /var/lib which states the following seems more appropriate:

> Furthermore, this hierarchy holds state information pertaining to an application ....  State information is data that programs modify while they run, and that pertains to one specific host.

https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#var

And as we see from the [debian packages](https://salsa.debian.org/nginx-team/nginx/-/blob/master/debian/rules#L61-63) this is the route they have gone (and likely don't consider this a bug).  although it is worth noting they only override this only for the full package, the light package keeps the default which ends up being `--prefix` and thus `/usr/share/nginx/client_temp/`.

Of course there is a performance argument to make that says theses files should be in a ramdisk, however its also possible that this dir could become very large and putting it in a ramdisk could cause swapping or other system issues as such i think this decision should be left to the operator with sane defaults provided by the package maintain

As run_dir is only ever used for setting `proxy_temp_path` and `client_body_temp_path` i suggest that we completely drop run_dir and make `client_body_temp_path` and `proxy_temp_path` optional, thus ensuring the configuration uses the distro configured defaults.  if we don't want to use the package defaults then we should ensure we ensure things work correctly annd in this case i think using the systemd dropin would make the most senses

This CR also adds resources to manage the creation of proxy_cache_path
and fastcgi_cache_path directories

Fixes #1372
b4ldr added a commit that referenced this issue Oct 28, 2021
…tional

This is the same as #1478 however changes the spec tests to check for
 is.expect.not_to instead of removing

Fixes: #1372
b4ldr added a commit that referenced this issue Oct 28, 2021
…tional

This is the same as #1478 however changes the spec tests to check for
 is.expect.not_to instead of removing

Fixes: #1372
b4ldr added a commit that referenced this issue Oct 28, 2021
…tional

This is the same as #1478 however changes the spec tests to check for
 is.expect.not_to instead of removing

Fixes: #1372
@elofu17
Copy link

elofu17 commented Nov 26, 2021

I just ran into this too. 😒

This is a particularly nasty bug because you don’t know that there’s a problem until your first reboot, and your nginx doesn’t start.

I'm using the systemd dropin file workaround in my nginx profile:

systemd::dropin_file { 'runtimedirectory.conf':
  unit    => 'nginx.service',
  content => @(EOT),
    # Managed by Puppet
    [Service]
    RuntimeDirectory=nginx
    | EOT
}

Can we agree on this as a fix?

I too ran into this when upgrading from a very old version of the puppetforge module.

Regarding all different workarounds, I dislike the one above, since it adds a dependency to another puppet module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet