-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gracefully exiting a Python process running Bjoern #91
Comments
Hm... the way Python implements signals makes it impossible to use them in this situation (C event-loop based server), because they are put into in an internal signal queue and the handlers are only called the next time the Python interpreter is executed, which is when a new request is being made to the server or an existing request is being processed. I'm not sure how other event-loop based servers solve this problem, maybe by periodically invoking the Python interpreter. Maybe you can do aside with signals altogether? |
This seems to be possible to work around with periodically calling into https://docs.python.org/2/c-api/exceptions.html#c.PyErr_CheckSignals |
Thanks for the quick reply and confirming our suspicions that it may be related to the C side of the library. We will look into an alternative solution. It would be great if you could build in the CheckSignals code you mentioned above if that would allow us to handle signals correctly. I can leave this open if you would like to look into it, otherwise I am happy to close the issue. |
Shouldn't SIGTERM be handled the same way SIGINT is ? For graceful shutdown ? |
That's correct, if it doesn't work like that then that's a separate issue. |
@jonashaag SIGTERM is not handled like SIGINT as A follow-up question: https://github.com/jonashaag/bjoern/blob/master/bjoern/server.c#L91 places a SIGINT on the signal queue, would
|
I don't know, maybe the other way round (first check signals, then set interrupt), but in either case we'd have to make sure we hold the GIL |
Are there any work arounds? I have an app that should not have any downtime. |
Maybe you can run this as a subprocess and kill it whenever required, I guess |
To gracefully shut down services, So I have will have to move back to |
Can you guys try the https://github.com/jonashaag/bjoern/tree/signal-watcher branch |
Sorry, but I really don't want to create a signal handler (it is asynchronous and there are plenty of spin-off issues, e.g. when using Now that I have seen that |
@bulletmark The patch in that branch should make this work. It simply checks every 100ms if there's a new signal in the Python signal queue and executes that Python signal handler. |
The https://github.com/jonashaag/bjoern/tree/signal-watcher branch works perfectly for me with SIGTERM. My use case is a |
I still think that simply returning from the event loop as I describe in my 2 posts above (and how meinheld does it) is a better way to go, and most compatible with systemd termination. @jonashaag, please reconsider that approach. |
I just released 3.0.0 with the approach taken in the signal-watcher branch. @bulletmark could you elaborate on what's better in the meinheld implementation for your use case? |
@bulletmark @jonashaag a complete solution could be to add to the bjoern package a kind of
I use something like that internally. With a little bit of cleaning, I could propose this as a PR if your are interested. |
Definitely! Could also paste it here uncleaned so we can discuss without having you spend time on it |
internal script hardly cleaned (but not completely), not even tested, just here to discuss i think we need to add :
#!/usr/bin/env python3
import argparse
import sys
import bjoern
import importlib
import signal
import datetime
import time
import os
import threading
from mflog import get_logger
from mfutil import kill_process_and_children
LOGGER = get_logger("bjoern_wrapper")
def send_sigint(pid):
LOGGER.info("sending SIGINT to %i", pid)
os.kill(pid, signal.SIGINT)
def on_signal(signum, frame):
if signum == 15:
LOGGER.info("received SIGTERM => smart closing...")
# [...]
# Real stop
send_sigint(os.getpid())
def get_wsgi_application(path):
if len(path.split(':')) != 2:
LOGGER.warning("main_arg must follow module.submodule:func_name")
sys.exit(1)
module_path, func_name = path.split(':')
mod = importlib.import_module(module_path)
try:
return getattr(mod, func_name)
except Exception:
LOGGER.warning("can't find: %s func_name in module: %s" % (
func_name, mod))
sys.exit(1)
class TimeoutWsgiMiddleware(object):
def __init__(self, app, timeout):
self.app = app
self.timeout = timeout
self.started = None
if self.timeout > 0:
x = threading.Thread(target=self.timeout_handler, daemon=True)
x.start()
def timeout_handler(self):
now = datetime.datetime.now
while True:
if self.started:
if (now() - self.started).total_seconds() > self.timeout:
LOGGER.warning("Request Timeout => exit")
# Self-Kill
kill_process_and_children(os.getpid())
time.sleep(1)
def __call__(self, environ, start_response):
if self.timeout > 0:
self.started = datetime.datetime.now()
try:
# see http://blog.dscpl.com.au/2012/10/
# obligations-for-calling-close-on.html
iterable = self.app(environ, start_response)
for data in iterable:
yield data
finally:
self.started = None
if hasattr(iterable, 'close'):
iterable.close()
def main():
parser = argparse.ArgumentParser(description="bjoern wrapper")
parser.add_argument("main_arg", help="wsgi application path")
parser.add_argument("unix_socket", help="unix socket to listen path")
parser.add_argument("--timeout", default=60, type=int,
help="one request execution timeout (in seconds)")
args = parser.parse_args()
signal.signal(signal.SIGTERM, on_signal)
wsgi_app = get_wsgi_application(args.main_arg)
try:
os.unlink(args.unix_socket)
except Exception:
pass
try:
app = TimeoutWsgiMiddleware(wsgi_app, args.timeout)
bjoern.run(app, 'unix:%s' % args.unix_socket)
except KeyboardInterrupt:
LOGGER.info("process shutdown")
if __name__ == '__main__':
main() |
@jonashaag, what I am talking about is so simple it is embarrassing that I apparently have not explained it clearly enough in my first 2 posts here! :) Normally code blocks essentially forever in Meinheld does it this way and it works well. |
@bulletmark I understand what you are saying ;-) But IMHO, today, So we have two different use cases:
I propose to add to existing With the wrapper, systemd will launch your app with:
and you will get a regular What do you think about this compromise ? |
I have a python program which just calls bjoern's |
My plan:
with "user provided" hook: def after_stop(wsgi_app, **kwargs)
# [...] clean what you want
return True I plan to implement other hooks:
|
That would work of course but nowhere near as simple or intuitive as just returning from |
Stupid question, if I send sigterm/sigint to a Python program, it will raise KeyboardInterrupt, right? So shouldn't that be what happens when you try to interrupt bjoern as well? |
With this code: import time
try:
time.sleep(60)
except KeyboardInterrupt:
print("catched KeyboardInterrupt")
except:
print("catched something else")
|
From #145 So, here's how Gunicorn works:
I agree that we should not throw away the backlog when doing a graceful shutdown. Also I think it's fine to just stop bjoern without any kind of graceful shutdown for SIGINT (but not for SIGTERM). Bjoern currently works as follows:
So, going forward, my suggestion is we change Bjoern behaviour as follows:
We can make the SIGTERM behaviour configurable, and allow for custom shutdown behaviours, by exporting some shutdown primitives from the C code. |
I'm still not sure how to meet both the expectations of people thinking of bjoern as a "library" that doesn't fiddle with signal handling and people thinking of bjoern as a server that takes care of graceful loop shutdown. Maybe we actually have to implement something along the lines of what @thefab suggested. |
I'm not sure it's a typo or not but you wrote: "SIGTERM waits for active connections to finish, and the backlog to be empty (while still accepting new connections), with a timeout, and exits the program." IMHO, after a SIGTERM, we don't accept new connections ! For my "library use case", it's perfectly ok to have another signal for custom shutdown. All I need is the python signal handler called from C like in 3.0. |
is it a correct way to gracefully exit? import bjoern
import sys
import signal
import os
def sigterm_handler(_signo, _stack_frame):
sys.stdout.write(' [*] SIGTERM: Stop \n')
os.kill(os.getpid(), signal.SIGINT)
if __name__ == '__main__':
signal.signal(signal.SIGTERM, sigterm_handler)
sys.stdout.write(' [*] Starting bjoern on port 5000...! \n')
try:
bjoern.run(app, '0.0.0.0', 5000)
except KeyboardInterrupt:
sys.stdout.write(' [*] SIGINT: Stop \n')
finally:
sys.stdout.write(' [*] Exiting \n') |
Yup. I would agree. Without the use of an external coordinator (to route new connections elsewhere), the server might potentially never have a chance to gracefully shut down, cos it'll keep accepting new connections.
This "library vs server" thing is interesting. Would the following work for a simple compromise? This supposes that you have the ability to modify (and why shouldn't you?) whatever mechanism you're using to send the SIGTERM to bjoern:
This setup would work for keeping the server up without any interruption in serving connections, and also allow for graceful connection draining. Does this make sense? |
Here are my two cents... Offtopic: I was confused a little bit. If you run bjoern with unix socket it works as wsgi server, otherwise, it works as http. Is it possible to manage via params? I would like to have the ability to run wsgi server which binds to tcp socket. |
It always works as a WSGI server (which is a special kind of HTTP server) |
This comment has been minimized.
This comment has been minimized.
As above, I left bjoern and went to meinheld given this issue but am now back to bjoern for other reasons. My problem here was that bjoern does not return from |
Hi folks, inspire by @bulletmark, I do something like this. (GracefulKiller from StackOverflow)
|
Greetings. |
Hi,
I have a simple Python script wrapping a Bjoern web server around a WSGI app. In essence, it looks like this:
The server works just fine. The problem is that when sending a SIGTERM signal (e.g. using
kill -15
), the signal handler doesn't get called and the process doesn't die. When sending SIGKILL, the process does die, without calling the signal handler (this is expected).How can I intercept SIGTERM?
The text was updated successfully, but these errors were encountered: