Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

re-implement daemonize #8011

Merged
merged 3 commits into from
Aug 4, 2020
Merged

re-implement daemonize #8011

merged 3 commits into from
Aug 4, 2020

Conversation

richvdh
Copy link
Member

@richvdh richvdh commented Jul 31, 2020

This has long been something I've wanted to do. Basically the Daemonize code
is both too flexible and not flexible enough, in that it offers a bunch of
features that we don't use (changing UID, closing FDs in the child, logging to
syslog) and doesn't offer a bunch that we could do with (redirecting stdout/err
to a file instead of /dev/null; having the parent not exit until the child is
running).

As a first step, I've lifted the Daemonize code and removed the bits we don't
use. This should be a non-functional change. Fixing everything else will come
later.

The reveiewer is invited to compare with https://github.com/thesharp/daemonize/blob/master/daemonize.py.

This has long been something I've wanted to do. Basically the `Daemonize` code
is both too flexible and not flexible enough, in that it offers a bunch of
features that we don't use (changing UID, closing FDs in the child, logging to
syslog) and doesn't offer a bunch that we could do with (redirecting stdout/err
to a file instead of /dev/null; having the parent not exit until the child is
running).

As a first step, I've lifted the Daemonize code and removed the bits we don't
use. This should be a non-functional change. Fixing everything else will come
later.
@clokep
Copy link
Member

clokep commented Aug 3, 2020

The reveiewer is invited to compare with thesharp/daemonize@master/daemonize.py.

See a diff
--- daemonize.py	2020-08-03 10:19:48.000000000 -0400
+++ synapse/util/daemonize.py	2020-08-03 10:19:39.000000000 -0400
@@ -1,251 +1,120 @@
-# #!/usr/bin/python
+# -*- coding: utf-8 -*-
+# Copyright (c) 2012, 2013, 2014 Ilya Otyutskiy <[email protected]>
+# Copyright 2020 The Matrix.org Foundation C.I.C.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
 
+import atexit
 import fcntl
+import logging
 import os
-import pwd
-import grp
-import sys
 import signal
-import resource
-import logging
-import atexit
-from logging import handlers
-import traceback
-
+import sys
 
-__version__ = "2.5.0"
 
+def daemonize_process(pid_file: str, logger: logging.Logger, chdir: str = "/") -> None:
+    """daemonize the current process
 
-class Daemonize(object):
+    This calls fork(), and has the main process exit. When it returns we will be
+    running in the child process.
     """
-    Daemonize object.
 
-    Object constructor expects three arguments.
+    # If pidfile already exists, we should read pid from there; to overwrite it, if
+    # locking will fail, because locking attempt somehow purges the file contents.
+    if os.path.isfile(pid_file):
+        with open(pid_file, "r") as pid_fh:
+            old_pid = pid_fh.read()
+
+    # Create a lockfile so that only one instance of this daemon is running at any time.
+    try:
+        lock_fh = open(pid_file, "w")
+    except IOError:
+        print("Unable to create the pidfile.")
+        sys.exit(1)
+
+    try:
+        # Try to get an exclusive lock on the file. This will fail if another process
+        # has the file locked.
+        fcntl.flock(lock_fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
+    except IOError:
+        print("Unable to lock on the pidfile.")
+        # We need to overwrite the pidfile if we got here.
+        #
+        # XXX better to avoid overwriting it, surely. this looks racey as the pid file
+        # could be created between us trying to read it and us trying to lock it.
+        with open(pid_file, "w") as pid_fh:
+            pid_fh.write(old_pid)
+        sys.exit(1)
 
-    :param app: contains the application name which will be sent to syslog.
-    :param pid: path to the pidfile.
-    :param action: your custom function which will be executed after daemonization.
-    :param keep_fds: optional list of fds which should not be closed.
-    :param auto_close_fds: optional parameter to not close opened fds.
-    :param privileged_action: action that will be executed before drop privileges if user or
-                              group parameter is provided.
-                              If you want to transfer anything from privileged_action to action, such as
-                              opened privileged file descriptor, you should return it from
-                              privileged_action function and catch it inside action function.
-    :param user: drop privileges to this user if provided.
-    :param group: drop privileges to this group if provided.
-    :param verbose: send debug messages to logger if provided.
-    :param logger: use this logger object instead of creating new one, if provided.
-    :param foreground: stay in foreground; do not fork (for debugging)
-    :param chdir: change working directory if provided or /
-    """
-    def __init__(self, app, pid, action,
-                 keep_fds=None, auto_close_fds=True, privileged_action=None,
-                 user=None, group=None, verbose=False, logger=None,
-                 foreground=False, chdir="/"):
-        self.app = app
-        self.pid = os.path.abspath(pid)
-        self.action = action
-        self.keep_fds = keep_fds or []
-        self.privileged_action = privileged_action or (lambda: ())
-        self.user = user
-        self.group = group
-        self.logger = logger
-        self.verbose = verbose
-        self.auto_close_fds = auto_close_fds
-        self.foreground = foreground
-        self.chdir = chdir
-
-    def sigterm(self, signum, frame):
-        """
-        These actions will be done after SIGTERM.
-        """
-        self.logger.warning("Caught signal %s. Stopping daemon." % signum)
+    # Fork, creating a new process for the child.
+    process_id = os.fork()
+
+    if process_id != 0:
+        # parent process
         sys.exit(0)
 
-    def exit(self):
-        """
-        Cleanup pid file at exit.
-        """
-        self.logger.warning("Stopping daemon.")
-        os.remove(self.pid)
+    # This is the child process. Continue.
+
+    # Stop listening for signals that the parent process receives.
+    # This is done by getting a new process id.
+    # setpgrp() is an alternative to setsid().
+    # setsid puts the process in a new parent group and detaches its controlling
+    # terminal.
+
+    os.setsid()
+
+    # point stdin, stdout, stderr at /dev/null
+    devnull = "/dev/null"
+    if hasattr(os, "devnull"):
+        # Python has set os.devnull on this system, use it instead as it might be
+        # different than /dev/null.
+        devnull = os.devnull
+
+    devnull_fd = os.open(devnull, os.O_RDWR)
+    os.dup2(devnull_fd, 0)
+    os.dup2(devnull_fd, 1)
+    os.dup2(devnull_fd, 2)
+    os.close(devnull_fd)
+
+    # Set umask to default to safe file permissions when running as a root daemon. 027
+    # is an octal number which we are typing as 0o27 for Python3 compatibility.
+    os.umask(0o27)
+
+    # Change to a known directory. If this isn't done, starting a daemon in a
+    # subdirectory that needs to be deleted results in "directory busy" errors.
+    os.chdir(chdir)
+
+    try:
+        lock_fh.write("%s" % (os.getpid()))
+        lock_fh.flush()
+    except IOError:
+        logger.error("Unable to write pid to the pidfile.")
+        print("Unable to write pid to the pidfile.")
+        sys.exit(1)
+
+    # write a log line on SIGTERM.
+    def sigterm(signum, frame):
+        logger.warning("Caught signal %s. Stopping daemon." % signum)
         sys.exit(0)
 
-    def start(self):
-        """
-        Start daemonization process.
-        """
-        # If pidfile already exists, we should read pid from there; to overwrite it, if locking
-        # will fail, because locking attempt somehow purges the file contents.
-        if os.path.isfile(self.pid):
-            with open(self.pid, "r") as old_pidfile:
-                old_pid = old_pidfile.read()
-        # Create a lockfile so that only one instance of this daemon is running at any time.
-        try:
-            lockfile = open(self.pid, "w")
-        except IOError:
-            print("Unable to create the pidfile.")
-            sys.exit(1)
-        try:
-            # Try to get an exclusive lock on the file. This will fail if another process has the file
-            # locked.
-            fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
-        except IOError:
-            print("Unable to lock on the pidfile.")
-            # We need to overwrite the pidfile if we got here.
-            with open(self.pid, "w") as pidfile:
-                pidfile.write(old_pid)
-            sys.exit(1)
-
-        # skip fork if foreground is specified
-        if not self.foreground:
-            # Fork, creating a new process for the child.
-            try:
-                process_id = os.fork()
-            except OSError as e:
-                self.logger.error("Unable to fork, errno: {0}".format(e.errno))
-                sys.exit(1)
-            if process_id != 0:
-                if self.keep_fds:
-                    # This is the parent process. Exit without cleanup,
-                    # see https://github.com/thesharp/daemonize/issues/46
-                    os._exit(0)
-                else:
-                    sys.exit(0)
-            # This is the child process. Continue.
-
-            # Stop listening for signals that the parent process receives.
-            # This is done by getting a new process id.
-            # setpgrp() is an alternative to setsid().
-            # setsid puts the process in a new parent group and detaches its controlling terminal.
-            process_id = os.setsid()
-            if process_id == -1:
-                # Uh oh, there was a problem.
-                sys.exit(1)
-
-            # Add lockfile to self.keep_fds.
-            self.keep_fds.append(lockfile.fileno())
-
-            # Close all file descriptors, except the ones mentioned in self.keep_fds.
-            devnull = "/dev/null"
-            if hasattr(os, "devnull"):
-                # Python has set os.devnull on this system, use it instead as it might be different
-                # than /dev/null.
-                devnull = os.devnull
-
-            if self.auto_close_fds:
-                for fd in range(3, resource.getrlimit(resource.RLIMIT_NOFILE)[0]):
-                    if fd not in self.keep_fds:
-                        try:
-                            os.close(fd)
-                        except OSError:
-                            pass
-
-            devnull_fd = os.open(devnull, os.O_RDWR)
-            os.dup2(devnull_fd, 0)
-            os.dup2(devnull_fd, 1)
-            os.dup2(devnull_fd, 2)
-            os.close(devnull_fd)
-
-        if self.logger is None:
-            # Initialize logging.
-            self.logger = logging.getLogger(self.app)
-            self.logger.setLevel(logging.DEBUG)
-            # Display log messages only on defined handlers.
-            self.logger.propagate = False
-
-            # Initialize syslog.
-            # It will correctly work on OS X, Linux and FreeBSD.
-            if sys.platform == "darwin":
-                syslog_address = "/var/run/syslog"
-            else:
-                syslog_address = "/dev/log"
-
-            # We will continue with syslog initialization only if actually have such capabilities
-            # on the machine we are running this.
-            if os.path.exists(syslog_address):
-                syslog = handlers.SysLogHandler(syslog_address)
-                if self.verbose:
-                    syslog.setLevel(logging.DEBUG)
-                else:
-                    syslog.setLevel(logging.INFO)
-                # Try to mimic to normal syslog messages.
-                formatter = logging.Formatter("%(asctime)s %(name)s: %(message)s",
-                                              "%b %e %H:%M:%S")
-                syslog.setFormatter(formatter)
-
-                self.logger.addHandler(syslog)
-
-        # Set umask to default to safe file permissions when running as a root daemon. 027 is an
-        # octal number which we are typing as 0o27 for Python3 compatibility.
-        os.umask(0o27)
-
-        # Change to a known directory. If this isn't done, starting a daemon in a subdirectory that
-        # needs to be deleted results in "directory busy" errors.
-        os.chdir(self.chdir)
-
-        # Execute privileged action
-        privileged_action_result = self.privileged_action()
-        if not privileged_action_result:
-            privileged_action_result = []
-
-        # Change owner of pid file, it's required because pid file will be removed at exit.
-        uid, gid = -1, -1
-
-        if self.group:
-            try:
-                gid = grp.getgrnam(self.group).gr_gid
-            except KeyError:
-                self.logger.error("Group {0} not found".format(self.group))
-                sys.exit(1)
-
-        if self.user:
-            try:
-                uid = pwd.getpwnam(self.user).pw_uid
-            except KeyError:
-                self.logger.error("User {0} not found.".format(self.user))
-                sys.exit(1)
-
-        if uid != -1 or gid != -1:
-            os.chown(self.pid, uid, gid)
-
-        # Change gid
-        if self.group:
-            try:
-                os.setgid(gid)
-            except OSError:
-                self.logger.error("Unable to change gid.")
-                sys.exit(1)
-
-        # Change uid
-        if self.user:
-            try:
-                uid = pwd.getpwnam(self.user).pw_uid
-            except KeyError:
-                self.logger.error("User {0} not found.".format(self.user))
-                sys.exit(1)
-            try:
-                os.setuid(uid)
-            except OSError:
-                self.logger.error("Unable to change uid.")
-                sys.exit(1)
-
-        try:
-            lockfile.write("%s" % (os.getpid()))
-            lockfile.flush()
-        except IOError:
-            self.logger.error("Unable to write pid to the pidfile.")
-            print("Unable to write pid to the pidfile.")
-            sys.exit(1)
-
-        # Set custom action on SIGTERM.
-        signal.signal(signal.SIGTERM, self.sigterm)
-        atexit.register(self.exit)
-
-        self.logger.warning("Starting daemon.")
-
-        try:
-            self.action(*privileged_action_result)
-        except Exception:
-            for line in traceback.format_exc().split("\n"):
-                self.logger.error(line)
+    signal.signal(signal.SIGTERM, sigterm)
+
+    # Cleanup pid file at exit.
+    def exit():
+        logger.warning("Stopping daemon.")
+        os.remove(pid_file)
+        sys.exit(0)
+
+    atexit.register(exit)
+
+    logger.warning("Starting daemon.")

Edit: This doesn't really seem too useful since there's a conversion from a class to a function. The non-whitespace version didn't seem to help either.

Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK. I left a couple of queries about removed error handling.

Comment on lines +59 to +60
# Fork, creating a new process for the child.
process_id = os.fork()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why the error handling code here was removed from the original library?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically: I kinda felt that we should be allowing exceptions raised in this function to propagate, rather than suddenly calling sys.exit. It's not particularly clear though. I could be persuaded to go either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result should be the same either way. Showing the stack trace should give more info, I suppose. 👍

Comment on lines +68 to +74
# Stop listening for signals that the parent process receives.
# This is done by getting a new process id.
# setpgrp() is an alternative to setsid().
# setsid puts the process in a new parent group and detaches its controlling
# terminal.

os.setsid()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this differs from the original code by removing error handling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.setsid returns None, so that was dead code.

else:
run()
daemonize_process(pid_file, logger)
run()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've realised that the original code wrapped this with a try/except for a reason, which is that after stderr gets redirected to /dev/null, any uncaught exceptions will otherwise get thrown away.

I'm currently wondering how best to solve that.

This replaces the try/catch that Daemonize has around the application callback.
synapse/util/daemonize.py Outdated Show resolved Hide resolved
@richvdh richvdh merged commit 916cf2d into develop Aug 4, 2020
@richvdh richvdh deleted the rav/daemonize branch August 4, 2020 09:03
reivilibre added a commit that referenced this pull request Aug 13, 2020
Synapse 1.19.0rc1 (2020-08-13)
==============================

Removal warning
---------------

As outlined in the [previous release](https://github.com/matrix-org/synapse/releases/tag/v1.18.0), we are no longer publishing Docker images with the `-py3` tag suffix. On top of that, we have also removed the `latest-py3` tag. Please see [the announcement in the upgrade notes for 1.18.0](https://github.com/matrix-org/synapse/blob/develop/UPGRADE.rst#upgrading-to-v1180).

Features
--------

- Add option to allow server admins to join rooms which fail complexity checks. Contributed by @lugino-emeritus. ([\#7902](#7902))
- Add an option to purge room or not with delete room admin endpoint (`POST /_synapse/admin/v1/rooms/<room_id>/delete`). Contributed by @dklimpel. ([\#7964](#7964))
- Add rate limiting to users joining rooms. ([\#8008](#8008))
- Add a `/health` endpoint to every configured HTTP listener that can be used as a health check endpoint by load balancers. ([\#8048](#8048))
- Allow login to be blocked based on the values of SAML attributes. ([\#8052](#8052))
- Allow guest access to the `GET /_matrix/client/r0/rooms/{room_id}/members` endpoint, according to MSC2689. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#7314](#7314))

Bugfixes
--------

- Fix a bug introduced in Synapse v1.7.2 which caused inaccurate membership counts in the room directory. ([\#7977](#7977))
- Fix a long standing bug: 'Duplicate key value violates unique constraint "event_relations_id"' when message retention is configured. ([\#7978](#7978))
- Fix "no create event in auth events" when trying to reject invitation after inviter leaves. Bug introduced in Synapse v1.10.0. ([\#7980](#7980))
- Fix various comments and minor discrepencies in server notices code. ([\#7996](#7996))
- Fix a long standing bug where HTTP HEAD requests resulted in a 400 error. ([\#7999](#7999))
- Fix a long-standing bug which caused two copies of some log lines to be written when synctl was used along with a MemoryHandler logger. ([\#8011](#8011), [\#8012](#8012))

Updates to the Docker image
---------------------------

- We no longer publish Docker images with the `-py3` tag suffix, as [announced in the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/UPGRADE.rst#upgrading-to-v1180). ([\#8056](#8056))

Improved Documentation
----------------------

- Document how to set up a client .well-known file and fix several pieces of outdated documentation. ([\#7899](#7899))
- Improve workers docs. ([\#7990](#7990), [\#8000](#8000))
- Fix typo in `docs/workers.md`. ([\#7992](#7992))
- Add documentation for how to undo a room shutdown. ([\#7998](#7998), [\#8010](#8010))

Internal Changes
----------------

- Reduce the amount of whitespace in JSON stored and sent in responses. Contributed by David Vo. ([\#7372](#7372))
- Switch to the JSON implementation from the standard library and bump the minimum version of the canonicaljson library to 1.2.0. ([\#7936](#7936), [\#7979](#7979))
- Convert various parts of the codebase to async/await. ([\#7947](#7947), [\#7948](#7948), [\#7949](#7949), [\#7951](#7951), [\#7963](#7963), [\#7973](#7973), [\#7975](#7975), [\#7976](#7976), [\#7981](#7981), [\#7987](#7987), [\#7989](#7989), [\#8003](#8003), [\#8014](#8014), [\#8016](#8016), [\#8027](#8027), [\#8031](#8031), [\#8032](#8032), [\#8035](#8035), [\#8042](#8042), [\#8044](#8044), [\#8045](#8045), [\#8061](#8061), [\#8062](#8062), [\#8063](#8063), [\#8066](#8066), [\#8069](#8069), [\#8070](#8070))
- Move some database-related log lines from the default logger to the database/transaction loggers. ([\#7952](#7952))
- Add a script to detect source code files using non-unix line terminators. ([\#7965](#7965), [\#7970](#7970))
- Log the SAML session ID during creation. ([\#7971](#7971))
- Implement new experimental push rules for some users. ([\#7997](#7997))
- Remove redundant and unreliable signature check for v1 Identity Service lookup responses. ([\#8001](#8001))
- Improve the performance of the register endpoint. ([\#8009](#8009))
- Reduce less useful output in the newsfragment CI step. Add a link to the changelog section of the contributing guide on error. ([\#8024](#8024))
- Rename storage layer objects to be more sensible. ([\#8033](#8033))
- Change the default log config to reduce disk I/O and storage for new servers. ([\#8040](#8040))
- Add an assertion on `prev_events` in `create_new_client_event`. ([\#8041](#8041))
- Add a comment to `ServerContextFactory` about the use of `SSLv23_METHOD`. ([\#8043](#8043))
- Log `OPTIONS` requests at `DEBUG` rather than `INFO` level to reduce amount logged at `INFO`. ([\#8049](#8049))
- Reduce amount of outbound request logging at `INFO` level. ([\#8050](#8050))
- It is no longer necessary to explicitly define `filters` in the logging configuration. (Continuing to do so is redundant but harmless.) ([\#8051](#8051))
- Add and improve type hints. ([\#8058](#8058), [\#8064](#8064), [\#8060](#8060), [\#8067](#8067))
babolivier pushed a commit that referenced this pull request Sep 1, 2021
* commit '916cf2d43':
  re-implement daemonize (#8011)
DMRobertson pushed a commit that referenced this pull request May 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants