Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preview: point release v23.1.0 #3

Open
wants to merge 43 commits into
base: master
Choose a base branch
from
Open

Preview: point release v23.1.0 #3

wants to merge 43 commits into from

Conversation

pajod
Copy link
Owner

@pajod pajod commented Aug 14, 2024

This is where I put together various PRs.

The following PRs merge OK:

These merge conflicts are straightforward to resolve:

I recommend most of the above go into a 23.1.0 bugfix release.

These patches are included to help testing here, albeit not quite ready:

Please do not actually octopus merge though, sequential merges produce clearer output in common git tools. Created using this script:

#!/bin/sh
# assuming .git/config fetches pull requests from GitHub, e.g:
# [remote "benoitc"]
#         url = https://github.com/benoitc/gunicorn.git
#         fetch = +refs/heads/*:refs/remotes/benoitc/*
#         fetch = +refs/pull/*/head:refs/remotes/benoitc/pr/*
# [remote "pajod"]
#         url = https://github.com/pajod/gunicorn.git
#         fetch = +refs/heads/*:refs/remotes/pajod/*
set -e
echo "preparing branch"
test -z "$(git status --porcelain=v1 --untracked-files=no)" || exit 1
git fetch benoitc
git checkout -B integration-v23.1.0 benoitc/master
git reset --hard benoitc/master
git merge --verbose --no-edit --no-rerere-autoupdate benoitc/pr/2938
git apply -3 <<'EOF'
diff --git a/gunicorn/arbiter.py b/gunicorn/arbiter.py
index 646d684e..4d8a1b3b 100644
--- a/gunicorn/arbiter.py
+++ b/gunicorn/arbiter.py
@@ -7,8 +7,8 @@ import random
 import select
 import signal
+import socket
 import sys
 import time
 import traceback
-import socket

 from gunicorn.errors import HaltServer, AppImportError
EOF
git commit -m "help merge imports" -- gunicorn/arbiter.py
echo "octopus merge"
git merge --verbose --no-edit --no-rerere-autoupdate benoitc/pr/2407 \
       benoitc/pr/3157 benoitc/pr/3273 benoitc/pr/3271 \
       benoitc/pr/3275 benoitc/pr/3148 benoitc/pr/3250 \
       benoitc/pr/3264 benoitc/pr/3210 benoitc/pr/3272 \
       pajod/docs-misc
echo "docs"
git apply -3 <<'EOF'
diff --git a/tests/test_nginx.py b/tests/test_nginx.py
index 69742edb..07e07239 100644
--- a/tests/test_nginx.py
+++ b/tests/test_nginx.py
@@ -452,3 +452,3 @@ def test_nginx_proxy(*, ssl, worker_class, dummy_ssl_cert, read_size=1024):
                 expect={
-                    "Handling signal: term",
+                    "Handling signal: SIGTERM",
                     "Shutting down: Master",
EOF
git apply -3 <<'EOF'
diff --git a/tests/test_wrk.py b/tests/test_wrk.py
index f89dc013..fdce18a1 100644
--- a/tests/test_wrk.py
+++ b/tests/test_wrk.py
@@ -403,3 +403,3 @@ def test_wrk(*, ssl, worker_class, dummy_ssl_cert, read_size=1024):
                 expect={
-                    "[INFO] Handling signal: term",
+                    "[INFO] Handling signal: SIGTERM",
                 },
EOF
git commit -m "help merge test for modified logging" -- tests/test_nginx.py tests/test_wrk.py
(cd docs && sphinx-build -n -b html -d build/doctrees source build/html )
git commit --allow-empty -m "prepare docs 23.1.0" -- docs/source/settings.rst
echo "version commit"
sed -i -e 's/23, 0, 0/23, 1, 0/' gunicorn/__init__.py
git commit -m "bump version 23.1.0" -- gunicorn/__init__.py
echo "checking"
pylint -j0 --ignore-paths="^tests/requests/" --max-line-length=120 gunicorn/ tests/ || true
pycodestyle --exclude="tests/requests/*" gunicorn tests || true
python -B -m pytest -s -vvvv --ff --override-ini=addopts=--strict-markers --exitfirst || true
echo "building"
pip --quiet install -U build
python -m build
echo "searching for expected file in sdist"
tar -tf dist/gunicorn-23.1.0.tar.gz  | grep -m 1 gunicorn/__main__.py || true
echo "searching for unexpected file in sdist"
tar -tf dist/gunicorn-23.1.0.tar.gz  | grep -m 1 docs/build || echo 'none found (good!)'
echo "done"

Ankit Patel and others added 15 commits February 1, 2023 16:13
This resolves a `ResourceWarning` caused by the test suite.
This ensures that unclosed resources will be caught in CI
not even in error reponses!
If all the workers are busy or max connections is reached, new
connections will queue in the socket backlog, which defaults to 2048.
The `gunicorn.backlog` metric provide visibility into this queue, and
give an idea on concurrency, and worker saturation.

This also adds a distinction between the `timer` and `histogram` statsd
metric types, which although treated the same, can be difference, for
e.g. in this case histogram is not a timer: https://github.com/b/statsd_spec#timers
Fix failing lint tests
@pajod pajod force-pushed the integration-v23.1.0 branch 2 times, most recently from a7f742c to c8cd897 Compare August 14, 2024 20:50
This reverts commit 4023228.

We use sys.exit. On purpose.
We should therefore not be catching SystemExit.
This is probably wrong. But less wrong than handling *all*
BaseException. Reports of this happening may be the result of
 some *other* async Timeout in the wsgi app bubbling through to us.

Gevent docs promise:
"[..] if *exception* is the literal ``False``, the timeout is still raised,
   but the context manager suppresses it, so
   the code outside the with-block won't see it."
@pajod pajod force-pushed the integration-v23.1.0 branch 2 times, most recently from 21034a0 to fd21207 Compare August 15, 2024 00:28
raags and others added 8 commits August 15, 2024 19:36
... as opposed to in signal context. This is beneficial, since it
means that we can, in a signal safe way, print messages about why
e.g. a worker stopped its execution.

And since handle_sigchld() logs what it does anyway, don't bother
printing out that we're handling SIGCHLD. If workers are killed at
rapid pace, we won't get as many SIGCHLD as there are workers killed
anyway.
Since we can use something from queue.*, we can make it blocking as
well, removing the need for two different data structures.
This change is meant to handle the return value of waitpid() in a way
that is more in line with the man page of said syscall. The changes
can be summarized as follows:

* Use os.WIFEXITED and os.WIFSIGNALED to determine what caused
  waitpid() to return, and exactly how a worker may have exited.

* In case of normal termination, use os.WEXITSTATUS() to read the exit
  status (instead of using a hand rolled bit shift). A redundant log
  was removed in this code path.

* In case of termination by a signal, use os.WTERMSIG() to determine
  the signal which caused the worker to terminate. This was buggy
  before, since the WCOREFLAG (0x80) could cause e.g. a SIGSEGV (code
  11) to be reported as "code 139", meaning "code (0x80 | 11)".

* Since waitpid() isn't called with WSTOPPED nor WCONTINUED, there's
  no need to have any os.WIFSTOPPED or os.WIFCONTINUED handling.
According to the python signal documentation[1], SIGCHLD is handled
differently from other signals. Specifically, if the underlying
implementation resets the SIGCHLD signal handler, then python won't
reinstall it (as it does for other signals).

This behavior doesn't seem to exist for neither Linux nor Mac, but
perhaps one could argue that it's good practise anyway.

[1] https://docs.python.org/3/library/signal.html
* Look up handlers in __init__() to induce run-time error early on if
  something is wrong.

* Since we now know that all handlers exist, we can simplify the main
  loop in arbiter, in such a way that we don't need to call wakeup().

So after this commit, the pipe in arbiter is only used to deliver
which signal was sent.
It accepts an optional "due_to_signal" argument which can be used to
tell if the wakeup was made because a signal handler needs to be
executed or not.
@sylt
Copy link

sylt commented Aug 18, 2024

@pajod I've updated/synced my two PRs now, so hopefully they should (if they are accepted) merge cleanly!

@sylt
Copy link

sylt commented Aug 22, 2024

I tested your integration branch a little bit, and I found a bug in my code where I tested the scenario of sending SIGHUP to the arbiter while running a gthread type worker. The following hunk needs to be applied:

diff --git a/gunicorn/workers/gthread.py b/gunicorn/workers/gthread.py
index 196759b8..b47ddaef 100644
--- a/gunicorn/workers/gthread.py
+++ b/gunicorn/workers/gthread.py
@@ -118,8 +118,9 @@ class ThreadWorker(base.Worker):
         return futures.ThreadPoolExecutor(max_workers=self.cfg.threads)
 
     def handle_exit(self, sig, frame):
-        self.alive = False
-        self.method_queue.defer(lambda: None)  # To wake up poller.select()
+        if self.alive:
+            self.alive = False
+            self.method_queue.defer(lambda: None)  # To wake up poller.select()
 
     def handle_quit(self, sig, frame):
         self.thread_pool.shutdown(False)

With this in place, I can spam SIGHUP to the arbiter while running ab in parallel without losing any requests.

I will update my pull request later today!

The main purpose is to remove complexity from gthread by:

* Removing the lock for handling self._keep and self.poller. This is
  possible since we now do all such manipulation on the main thread
  instead. When a connection is done, it posts a callback through the
  PollableMethodCaller which gets executed on the main thread.

* Having a single event queue (self.poller), as opposed to also
  managing a set of futures. This fixes benoitc#3146 (although there are
  more minimal ways of doing it).

There are other more minor things as well:

* Renaming some variables, e.g. self._keep to self.keepalived_conns.
* Remove self-explanatory comments (what the code does, not why).
* Just decide that socket is blocking.
* Use time.monotonic() for timeouts in gthread.

Some complexity has been added to the shutdown sequence, but hopefully
for good reason: it's to make sure that all already accepted
connections are served within the grace period.
@sylt
Copy link

sylt commented Aug 22, 2024

Now it's fixed! Sorry for any trouble.

There is not need to excessively produce high-severity logs, while
there are ways to reach that point from entirely benign network errors.

This partially reverts commit 0b10cba.
workaround greenlet installation on OmniOS v11:
libev assumes inotify.h must be Linux. make autoconf stop offering it.
…'benoitc/pr/3273', 'benoitc/pr/3271', 'benoitc/pr/3275', 'benoitc/pr/3148', 'benoitc/pr/3250', 'benoitc/pr/3264', 'benoitc/pr/3210', 'benoitc/pr/3272' and 'pajod/docs-misc' into integration-v23.1.0
@hbellur0526
Copy link

Any ETA on 23.1.0 bugfix release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants