Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synapse exits with zero exit code on config error #4640

Closed
richvdh opened this issue Feb 14, 2019 · 6 comments · Fixed by #9059
Closed

Synapse exits with zero exit code on config error #4640

richvdh opened this issue Feb 14, 2019 · 6 comments · Fixed by #9059
Labels
z-bug (Deprecated Label) z-p2 (Deprecated Label)

Comments

@richvdh
Copy link
Member

richvdh commented Feb 14, 2019

If synapse cannot bind to a TCP socket, or there is an error in the listeners config, synapse exits with a zero exit code even if it is configured not to daemonize.

It looks like it doesn't actually exit when sys.exit(1) is called but rather does a bunch of cleanup and forgets the exit code in the meantime

@richvdh
Copy link
Member Author

richvdh commented Feb 14, 2019

Obviously, this is problematic for systemd or other process controllers, because they cannot tell that there is an error and will tend to restart synapse in a loop.

@neilisfragile neilisfragile added z-bug (Deprecated Label) z-p2 (Deprecated Label) labels Feb 21, 2019
@richvdh
Copy link
Member Author

richvdh commented May 9, 2019

(This is still the case)

@JorikSchellekens
Copy link
Contributor

JorikSchellekens commented Sep 6, 2019

I've tested blowing up synapse in the config and setup stages while running it as a python module and it's exit code's were correct. The problem is with sycntl, it swallows the error codes.

The simple solution is to have a non-zero error code if it fails to start something. However, syctl can simultaneously try to start up a number of workers of which only a subset might fail. It's unclear whether the command should be considered a complete failure in this case because there will be workers running after it is executed. I presume this was intentional because it allows synapse to limp along.

I think the cleanest solution would be to have two error codes. One if sycntl was a complete failure and the other if only a subset failed. I propose 1 and 4 respectively.

@richvdh
Copy link
Member Author

richvdh commented Sep 6, 2019

I definitely wasn't thinking of synctl when I wrote this. Perhaps it's been fixed since...

@richvdh
Copy link
Member Author

richvdh commented Sep 6, 2019

It does not appear to have been fixed since.

I have misconfigured my synapse so that it has:

listeners:
  - port: 8
    type: http
    tls: true
    resources:
      - names: [client, federation]

I get:

(env3)rav@fred:~/work/synapse (develop $%=)$ python -m synapse.app.homeserver -c homeserver.fred.yaml
...
2019-09-06 15:50:12,678 - twisted - 166 - ERROR -  - Traceback (most recent call last):
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/tcp.py", line 1361, in startListening
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  -     skt.bind(addr)
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  - PermissionError: [Errno 13] Permission denied
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  - 
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  - During handling of the above exception, another exception occurred:
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  - 
2019-09-06 15:50:12,679 - twisted - 166 - ERROR -  - Traceback (most recent call last):
2019-09-06 15:50:12,680 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/_base.py", line 269, in start
2019-09-06 15:50:12,680 - twisted - 166 - ERROR -  -     hs.start_listening(listeners)
2019-09-06 15:50:12,680 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/homeserver.py", line 266, in start_listening
2019-09-06 15:50:12,681 - twisted - 166 - ERROR -  -     self._listening_services.extend(self._listener_http(config, listener))
2019-09-06 15:50:12,681 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/homeserver.py", line 137, in _listener_http
2019-09-06 15:50:12,681 - twisted - 166 - ERROR -  -     reactor=self.get_reactor(),
2019-09-06 15:50:12,681 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/_base.py", line 196, in listen_ssl
2019-09-06 15:50:12,682 - twisted - 166 - ERROR -  -     check_bind_error(e, address, bind_addresses)
2019-09-06 15:50:12,682 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/__init__.py", line 49, in check_bind_error
2019-09-06 15:50:12,682 - twisted - 166 - ERROR -  -     raise e
2019-09-06 15:50:12,682 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/synapse/app/_base.py", line 193, in listen_ssl
2019-09-06 15:50:12,682 - twisted - 166 - ERROR -  -     reactor.listenSSL(port, factory, context_factory, backlog, address)
2019-09-06 15:50:12,683 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 522, in listenSSL
2019-09-06 15:50:12,683 - twisted - 166 - ERROR -  -     port = self.listenTCP(port, tlsFactory, backlog, interface)
2019-09-06 15:50:12,683 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 495, in listenTCP
2019-09-06 15:50:12,683 - twisted - 166 - ERROR -  -     p.startListening()
2019-09-06 15:50:12,684 - twisted - 166 - ERROR -  -   File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/tcp.py", line 1363, in startListening
2019-09-06 15:50:12,684 - twisted - 166 - ERROR -  -     raise CannotListenError(self.interface, self.port, le)
2019-09-06 15:50:12,684 - twisted - 166 - ERROR -  - twisted.internet.error.CannotListenError: Couldn't listen on :::8: [Errno 13] Permission denied.
2019-09-06 15:50:12,685 - synapse.handlers.deactivate_account - 159 - INFO - user_parter_loop-0 - User parter finished: stopping
2019-09-06 15:50:12,686 - synapse.app.homeserver - 579 - INFO -  - report_stats can use psutil
2019-09-06 15:50:12,687 - synapse.handlers.presence - 236 - INFO - presence.on_shutdown-0 - Performing _on_shutdown. Persisting 0 unpersisted changes
2019-09-06 15:50:12,687 - synapse.handlers.presence - 247 - INFO - presence.on_shutdown-0 - Finished _on_shutdown
2019-09-06 15:50:12,710 - twisted - 166 - CRITICAL -  - Unhandled error in Deferred:
2019-09-06 15:50:12,710 - twisted - 166 - CRITICAL -  - 
Traceback (most recent call last):
  File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/tcp.py", line 1361, in startListening
    skt.bind(addr)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rav/work/synapse/synapse/app/_base.py", line 269, in start
    hs.start_listening(listeners)
  File "/home/rav/work/synapse/synapse/app/homeserver.py", line 266, in start_listening
    self._listening_services.extend(self._listener_http(config, listener))
  File "/home/rav/work/synapse/synapse/app/homeserver.py", line 137, in _listener_http
    reactor=self.get_reactor(),
  File "/home/rav/work/synapse/synapse/app/_base.py", line 196, in listen_ssl
    check_bind_error(e, address, bind_addresses)
  File "/home/rav/work/synapse/synapse/app/__init__.py", line 49, in check_bind_error
    raise e
  File "/home/rav/work/synapse/synapse/app/_base.py", line 193, in listen_ssl
    reactor.listenSSL(port, factory, context_factory, backlog, address)
  File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 522, in listenSSL
    port = self.listenTCP(port, tlsFactory, backlog, interface)
  File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 495, in listenTCP
    p.startListening()
  File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/tcp.py", line 1363, in startListening
    raise CannotListenError(self.interface, self.port, le)
twisted.internet.error.CannotListenError: Couldn't listen on :::8: [Errno 13] Permission denied.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rav/work/synapse/env3/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/rav/work/synapse/synapse/app/homeserver.py", line 435, in start
    _base.start(hs, config.listeners)
  File "/home/rav/work/synapse/synapse/app/_base.py", line 279, in start
    sys.exit(1)
SystemExit: 1
2019-09-06 15:50:12,720 - twisted - 166 - INFO -  - Main loop terminated.
(env3)rav@fred:~/work/synapse (develop $%=)$ echo $?
0

@erikjohnston
Copy link
Member

Looks like this is due to us binding ports after having started the reactor, at which point we always return with 0. Errors in the config parsing do return with a non-zero error code because that happens before we start the reactor.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
z-bug (Deprecated Label) z-p2 (Deprecated Label)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants