Preview: point release v23.1.0 #3

pajod · 2024-08-14T20:38:04Z

This is where I put together various PRs.

The following PRs merge OK:

These merge conflicts are straightforward to resolve:

CI: Pylint 3.x benoitc/gunicorn#3110

I recommend most of the above go into a 23.1.0 bugfix release.

These patches are included to help testing here, albeit not quite ready:

Add options to purge big workers benoitc/gunicorn#3252
https://github.com/pajod/gunicorn/tree/docs-misc (no PR, unsubmitted)

Please do not actually octopus merge though, sequential merges produce clearer output in common git tools. Created using this script:

#!/bin/sh
# assuming .git/config fetches pull requests from GitHub, e.g:
# [remote "benoitc"]
#         url = https://github.com/benoitc/gunicorn.git
#         fetch = +refs/heads/*:refs/remotes/benoitc/*
#         fetch = +refs/pull/*/head:refs/remotes/benoitc/pr/*
# [remote "pajod"]
#         url = https://github.com/pajod/gunicorn.git
#         fetch = +refs/heads/*:refs/remotes/pajod/*
set -e
echo "preparing branch"
test -z "$(git status --porcelain=v1 --untracked-files=no)" || exit 1
git fetch benoitc
git checkout -B integration-v23.1.0 benoitc/master
git reset --hard benoitc/master
git merge --verbose --no-edit --no-rerere-autoupdate benoitc/pr/2938
git apply -3 <<'EOF'
diff --git a/gunicorn/arbiter.py b/gunicorn/arbiter.py
index 646d684e..4d8a1b3b 100644
--- a/gunicorn/arbiter.py
+++ b/gunicorn/arbiter.py
@@ -7,8 +7,8 @@ import random
 import select
 import signal
+import socket
 import sys
 import time
 import traceback
-import socket

 from gunicorn.errors import HaltServer, AppImportError
EOF
git commit -m "help merge imports" -- gunicorn/arbiter.py
echo "octopus merge"
git merge --verbose --no-edit --no-rerere-autoupdate benoitc/pr/2407 \
       benoitc/pr/3157 benoitc/pr/3273 benoitc/pr/3271 \
       benoitc/pr/3275 benoitc/pr/3148 benoitc/pr/3250 \
       benoitc/pr/3264 benoitc/pr/3210 benoitc/pr/3272 \
       pajod/docs-misc
echo "docs"
git apply -3 <<'EOF'
diff --git a/tests/test_nginx.py b/tests/test_nginx.py
index 69742edb..07e07239 100644
--- a/tests/test_nginx.py
+++ b/tests/test_nginx.py
@@ -452,3 +452,3 @@ def test_nginx_proxy(*, ssl, worker_class, dummy_ssl_cert, read_size=1024):
                 expect={
-                    "Handling signal: term",
+                    "Handling signal: SIGTERM",
                     "Shutting down: Master",
EOF
git apply -3 <<'EOF'
diff --git a/tests/test_wrk.py b/tests/test_wrk.py
index f89dc013..fdce18a1 100644
--- a/tests/test_wrk.py
+++ b/tests/test_wrk.py
@@ -403,3 +403,3 @@ def test_wrk(*, ssl, worker_class, dummy_ssl_cert, read_size=1024):
                 expect={
-                    "[INFO] Handling signal: term",
+                    "[INFO] Handling signal: SIGTERM",
                 },
EOF
git commit -m "help merge test for modified logging" -- tests/test_nginx.py tests/test_wrk.py
(cd docs && sphinx-build -n -b html -d build/doctrees source build/html )
git commit --allow-empty -m "prepare docs 23.1.0" -- docs/source/settings.rst
echo "version commit"
sed -i -e 's/23, 0, 0/23, 1, 0/' gunicorn/__init__.py
git commit -m "bump version 23.1.0" -- gunicorn/__init__.py
echo "checking"
pylint -j0 --ignore-paths="^tests/requests/" --max-line-length=120 gunicorn/ tests/ || true
pycodestyle --exclude="tests/requests/*" gunicorn tests || true
python -B -m pytest -s -vvvv --ff --override-ini=addopts=--strict-markers --exitfirst || true
echo "building"
pip --quiet install -U build
python -m build
echo "searching for expected file in sdist"
tar -tf dist/gunicorn-23.1.0.tar.gz  | grep -m 1 gunicorn/__main__.py || true
echo "searching for unexpected file in sdist"
tar -tf dist/gunicorn-23.1.0.tar.gz  | grep -m 1 docs/build || echo 'none found (good!)'
echo "done"

This resolves a `ResourceWarning` caused by the test suite.

This ensures that unclosed resources will be caught in CI

not even in error reponses!

If all the workers are busy or max connections is reached, new connections will queue in the socket backlog, which defaults to 2048. The `gunicorn.backlog` metric provide visibility into this queue, and give an idea on concurrency, and worker saturation. This also adds a distinction between the `timer` and `histogram` statsd metric types, which although treated the same, can be difference, for e.g. in this case histogram is not a timer: https://github.com/b/statsd_spec#timers

Fix failing lint tests

This reverts commit 4023228. We use sys.exit. On purpose. We should therefore not be catching SystemExit.

This is probably wrong. But less wrong than handling *all* BaseException. Reports of this happening may be the result of some *other* async Timeout in the wsgi app bubbling through to us. Gevent docs promise: "[..] if *exception* is the literal ``False``, the timeout is still raised, but the context manager suppresses it, so the code outside the with-block won't see it."

... as opposed to in signal context. This is beneficial, since it means that we can, in a signal safe way, print messages about why e.g. a worker stopped its execution. And since handle_sigchld() logs what it does anyway, don't bother printing out that we're handling SIGCHLD. If workers are killed at rapid pace, we won't get as many SIGCHLD as there are workers killed anyway.

Since we can use something from queue.*, we can make it blocking as well, removing the need for two different data structures.

This change is meant to handle the return value of waitpid() in a way that is more in line with the man page of said syscall. The changes can be summarized as follows: * Use os.WIFEXITED and os.WIFSIGNALED to determine what caused waitpid() to return, and exactly how a worker may have exited. * In case of normal termination, use os.WEXITSTATUS() to read the exit status (instead of using a hand rolled bit shift). A redundant log was removed in this code path. * In case of termination by a signal, use os.WTERMSIG() to determine the signal which caused the worker to terminate. This was buggy before, since the WCOREFLAG (0x80) could cause e.g. a SIGSEGV (code 11) to be reported as "code 139", meaning "code (0x80 | 11)". * Since waitpid() isn't called with WSTOPPED nor WCONTINUED, there's no need to have any os.WIFSTOPPED or os.WIFCONTINUED handling.

According to the python signal documentation[1], SIGCHLD is handled differently from other signals. Specifically, if the underlying implementation resets the SIGCHLD signal handler, then python won't reinstall it (as it does for other signals). This behavior doesn't seem to exist for neither Linux nor Mac, but perhaps one could argue that it's good practise anyway. [1] https://docs.python.org/3/library/signal.html

* Look up handlers in __init__() to induce run-time error early on if something is wrong. * Since we now know that all handlers exist, we can simplify the main loop in arbiter, in such a way that we don't need to call wakeup(). So after this commit, the pipe in arbiter is only used to deliver which signal was sent.

It accepts an optional "due_to_signal" argument which can be used to tell if the wakeup was made because a signal handler needs to be executed or not.

sylt · 2024-08-18T11:09:41Z

@pajod I've updated/synced my two PRs now, so hopefully they should (if they are accepted) merge cleanly!

sylt · 2024-08-22T09:58:52Z

I tested your integration branch a little bit, and I found a bug in my code where I tested the scenario of sending SIGHUP to the arbiter while running a gthread type worker. The following hunk needs to be applied:

diff --git a/gunicorn/workers/gthread.py b/gunicorn/workers/gthread.py
index 196759b8..b47ddaef 100644
--- a/gunicorn/workers/gthread.py
+++ b/gunicorn/workers/gthread.py
@@ -118,8 +118,9 @@ class ThreadWorker(base.Worker):
         return futures.ThreadPoolExecutor(max_workers=self.cfg.threads)
 
     def handle_exit(self, sig, frame):
-        self.alive = False
-        self.method_queue.defer(lambda: None)  # To wake up poller.select()
+        if self.alive:
+            self.alive = False
+            self.method_queue.defer(lambda: None)  # To wake up poller.select()
 
     def handle_quit(self, sig, frame):
         self.thread_pool.shutdown(False)

With this in place, I can spam SIGHUP to the arbiter while running ab in parallel without losing any requests.

I will update my pull request later today!

The main purpose is to remove complexity from gthread by: * Removing the lock for handling self._keep and self.poller. This is possible since we now do all such manipulation on the main thread instead. When a connection is done, it posts a callback through the PollableMethodCaller which gets executed on the main thread. * Having a single event queue (self.poller), as opposed to also managing a set of futures. This fixes benoitc#3146 (although there are more minimal ways of doing it). There are other more minor things as well: * Renaming some variables, e.g. self._keep to self.keepalived_conns. * Remove self-explanatory comments (what the code does, not why). * Just decide that socket is blocking. * Use time.monotonic() for timeouts in gthread. Some complexity has been added to the shutdown sequence, but hopefully for good reason: it's to make sure that all already accepted connections are served within the grace period.

sylt · 2024-08-22T21:48:17Z

Now it's fixed! Sorry for any trouble.

There is not need to excessively produce high-severity logs, while there are ways to reach that point from entirely benign network errors. This partially reverts commit 0b10cba.

workaround greenlet installation on OmniOS v11: libev assumes inotify.h must be Linux. make autoconf stop offering it.

…'benoitc/pr/3273', 'benoitc/pr/3271', 'benoitc/pr/3275', 'benoitc/pr/3148', 'benoitc/pr/3250', 'benoitc/pr/3264', 'benoitc/pr/3210', 'benoitc/pr/3272' and 'pajod/docs-misc' into integration-v23.1.0

hbellur0526 · 2024-11-05T00:46:08Z

Any ETA on 23.1.0 bugfix release?

Ankit Patel and others added 15 commits February 1, 2023 16:13

only create sockets in forked processes not in master

0a0ef9d

detach socket in child process

b0115b9

Close a temporary file acting as a fake socket

49618be

This resolves a `ResourceWarning` caused by the test suite.

Close open sockets when an exception is encountered

62530ef

Close open files to prevent ResourceWarning in the test suite

dc197ff

Escalate warnings to errors during testing

7756175

This ensures that unclosed resources will be caught in CI

assert: never reflect invalid input

a7efba4

not even in error reponses!

config: reload-extra without reload

2096e42

style: argparse inheritance

f340432

style: unused code

353927e

style: re-verbosify HTTP commentary

56b3e42

Do not emit backlog metric if its unsupported

4d53263

Avoid calling get_backlog twice

aa73f3c

Fix failing lint tests

docs: faq: block arbiter to slow down benoitc#2719

13f54ed

pajod force-pushed the integration-v23.1.0 branch 2 times, most recently from a7f742c to c8cd897 Compare August 14, 2024 20:50

pajod added 3 commits August 14, 2024 23:01

Revert "let's exception not bubble"

1fcadcb

This reverts commit 4023228. We use sys.exit. On purpose. We should therefore not be catching SystemExit.

style: line break

ef94875

pajod force-pushed the integration-v23.1.0 branch 2 times, most recently from 21034a0 to fd21207 Compare August 15, 2024 00:28

raags and others added 8 commits August 15, 2024 19:36

Enable only on Linux platforms, and add config flag

d5aa52e

logs: blame WSL2 for unix socket bind fail

7475a75

arbiter: Remove PIPE and only use SIG_QUEUE instead

d653ebc

Since we can use something from queue.*, we can make it blocking as well, removing the need for two different data structures.

arbiter: Add Arbiter:wakeup() method

7ecea2d

It accepts an optional "due_to_signal" argument which can be used to tell if the wakeup was made because a signal handler needs to be executed or not.

pajod force-pushed the integration-v23.1.0 branch from fd21207 to 2ce2976 Compare August 18, 2024 21:34

pajod force-pushed the integration-v23.1.0 branch from 2ce2976 to 2429360 Compare August 23, 2024 14:18

pajod added 3 commits August 23, 2024 21:51

fcntl(fd, FD_CLOEXEC) => os.set_inheritable(fd, False)

e3dd042

fcntl(fd, O_NONBLOCK) => os.set_blocking(fd, False)

0e9b39e

test: setup nginx proxy

a362c2a

pajod force-pushed the integration-v23.1.0 branch 3 times, most recently from 3a0198d to eb55394 Compare August 24, 2024 01:07

pajod mentioned this pull request Aug 27, 2024

revert: 4023228 ("let's exception not bubble") benoitc/gunicorn#3275

Open

pajod added 13 commits September 5, 2024 02:15

Partially revert "unconditionally log request error"

27be287

There is not need to excessively produce high-severity logs, while there are ways to reach that point from entirely benign network errors. This partially reverts commit 0b10cba.

CI: workaround OpenBSD / Python 3.10 / ed25510 x509

601c8b5

Py<=3.7: sum() takes no keyword arguments

16fc020

CI: nginx 1.18 compat

1e78b91

CI: test OpenBSD + FreeBSD + illumos in Linux VM

0d03ef2

workaround greenlet installation on OmniOS v11: libev assumes inotify.h must be Linux. make autoconf stop offering it.

not all children die equally

0d941d8

docs: typo + 23.1.0 header

8adb6ae

Merge remote-tracking branch 'benoitc/pr/2938' into integration-v23.1.0

7268634

help merge imports

a21401d

Merge remote-tracking branches 'benoitc/pr/2407', 'benoitc/pr/3157', …

74f70bc

…'benoitc/pr/3273', 'benoitc/pr/3271', 'benoitc/pr/3275', 'benoitc/pr/3148', 'benoitc/pr/3250', 'benoitc/pr/3264', 'benoitc/pr/3210', 'benoitc/pr/3272' and 'pajod/docs-misc' into integration-v23.1.0

help merge test for modified logging

071a4c0

prepare docs 23.1.0

b7b118b

bump version 23.1.0

325c802

pajod force-pushed the integration-v23.1.0 branch from eb55394 to 325c802 Compare September 12, 2024 08:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preview: point release v23.1.0 #3

Preview: point release v23.1.0 #3

pajod commented Aug 14, 2024 •

edited

Loading

sylt commented Aug 18, 2024

sylt commented Aug 22, 2024

sylt commented Aug 22, 2024

hbellur0526 commented Nov 5, 2024

Preview: point release v23.1.0 #3

Are you sure you want to change the base?

Preview: point release v23.1.0 #3

Conversation

pajod commented Aug 14, 2024 • edited Loading

sylt commented Aug 18, 2024

sylt commented Aug 22, 2024

sylt commented Aug 22, 2024

hbellur0526 commented Nov 5, 2024

pajod commented Aug 14, 2024 •

edited

Loading