-
Notifications
You must be signed in to change notification settings - Fork 7.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIGCHLD is not always returned from proc_open #11498
Comments
I have a feeling multiple signals get collapsed into one and a loop is missing somewhere to go over all of them. |
Yep, that's definitely the issue. Here's a quick PoC patch that solves it for me (although testing is appreciated ^^). diff --git a/ext/pcntl/pcntl.c b/ext/pcntl/pcntl.c
index c2b7f390fc..82890eb407 100644
--- a/ext/pcntl/pcntl.c
+++ b/ext/pcntl/pcntl.c
@@ -1055,23 +1055,40 @@ static void pcntl_signal_handler(int signo)
/* oops, too many signals for us to track, so we'll forget about this one */
return;
}
- PCNTL_G(spares) = psig->next;
- psig->signo = signo;
- psig->next = NULL;
+ struct php_pcntl_pending_signal *psig_first = psig;
+
+ /* May be a merged signal. */
+ do {
+ // TODO: we should only do the merge check for *standard* signals, otherwise we shouldn't loop.
+ int status;
+ pid_t pid = waitpid(-1, &status, WNOHANG);
+ if (pid <= 0) {
+ break;
+ }
+
+ psig->signo = signo;
#ifdef HAVE_STRUCT_SIGINFO_T
- psig->siginfo = *siginfo;
+ psig->siginfo = *siginfo;
+ psig->siginfo.si_pid = pid; // TODO: depends on the signal
#endif
+ psig = psig->next;
+ } while (psig);
+
+ PCNTL_G(spares) = psig->next;
+ psig->next = NULL;
+
/* the head check is important, as the tick handler cannot atomically clear both
* the head and tail */
if (PCNTL_G(head) && PCNTL_G(tail)) {
- PCNTL_G(tail)->next = psig;
+ PCNTL_G(tail)->next = psig_first;
} else {
- PCNTL_G(head) = psig;
+ PCNTL_G(head) = psig_first;
}
PCNTL_G(tail) = psig;
+
PCNTL_G(pending_signals) = 1;
if (PCNTL_G(async_signals)) {
zend_atomic_bool_store_ex(&EG(vm_interrupt), true);
|
Hi, thanks for looking into the issue. I tried out the patch and it seems to be working. There is still ~30% chance that it ends prematurely and exits with code 255 though. # php test.php; echo $?
3863
3864
processes remaining:9
3868
3866
processes remaining:7
3867
processes remaining:6
3865
processes remaining:5
3873
3869
3872
3870
3871
255 |
I can't reproduce the exit status 255 problem. |
Hi, thanks for the detailed instructions. I couldn't get the error to show up even with the options so I rebuilt PHP with ASAN support.
|
I can reproduce the "error code 255" issue sometimes, it's very hard to trigger though, I need to repeatedly start the process in a tight loop. I'll try to debug that. |
Okay, partially found the reason it exits |
Bingo, a signal arrives during the write, which causes the return value to be -1. We handle EAGAIN already but we don't handle EINTR... |
Linux, and maybe other unixes, may merge multiple standard signals into a single one. This causes issues when keeping track of process IDs. Solve this by manually checking which children are dead using waitpid(). Test case is based on taka-oyama's test code.
When writing the output in the CLI is interrupted by a signal, the writing will fail in sapi_cli_single_write(), causing an exit later in sapi_cli_ub_write(). This was the other part of the issue in phpGH-11498. The solution is to restart the write if an EINTR has been observed.
When writing the output in the CLI is interrupted by a signal, the writing will fail in sapi_cli_single_write(), causing an exit later in sapi_cli_ub_write(). This was the other part of the issue in phpGH-11498. The solution is to restart the write if an EINTR has been observed.
Thank you for the fixes! |
Linux, and maybe other unixes, may merge multiple standard signals into a single one. This causes issues when keeping track of process IDs. Solve this by manually checking which children are dead using waitpid(). Test case is based on taka-oyama's test code.
* PHP-8.1: Fix GH-11498: SIGCHLD is not always returned from proc_open
* PHP-8.2: Fix GH-11498: SIGCHLD is not always returned from proc_open
People relied on manually waiting for children, but the fix for phpGH-11498 broke this. Fixing this in PHP is fundamentally incompatible with doing the wait loop in userland. This reverts to the old behaviour.
People relied on manually waiting for children, but the fix for phpGH-11498 broke this. Fixing this in PHP is fundamentally incompatible with doing the wait loop in userland. This reverts to the old behaviour.
* PHP-8.1: Revert the fix for GH-11498
* PHP-8.2: Revert the fix for GH-11498
@taka-oyama The fix for this issue caused BC breaks for other people. We have therefore decided to revert this. This aligns the behaviour of PCNTL to the exact behaviour of the OS. php-src/ext/pcntl/tests/waiting_on_sigchild_pcntl_wait.phpt Lines 17 to 20 in 633e745
Note however, that the "exit 255" problem you experienced is still fixed. |
Hi, thanks I will give that a try. |
People relied on manually waiting for children, but the fix for phpGH-11498 broke this. Fixing this in PHP is fundamentally incompatible with doing the wait loop in userland. This reverts to the old behaviour. Closes phpGH-11863.
Hi @nielsdos, I tried the code you provided and I confirmed that it works. Thank you for the workaround. However, I noticed that |
If you perform the pcntl_wait inside the signal handler you can reuse the relevant fields, most fields should be the same for all children. |
I'm not sure that you can reuse the If you run the code below, the exit codes don't match. <?php
pcntl_async_signals(true);
pcntl_signal(SIGCHLD, function($sig, $info) {
usleep(100);
while(($pid = pcntl_wait($status, WUNTRACED | WNOHANG)) > 0) {
echo 'SIGCHLD WUNTRACED: ' . $pid . ' exit code: ' . $info['status'] . PHP_EOL;
}
}, false);
foreach (range(0, 100) as $i) {
// exit proc with exit code: $i
proc_open('echo "$$ exiting with code: ' . $i . '" && exit ' . $i, [], $pipes);
} |
Sorry, I just noticed you can get the exit code by passing |
The kernel doesn't give us the siginfo_t, that information is lost because of the signal merging it performs. You can indeed get the status via the pcntl function. |
Ok. Understood. Thank you for replying 🙂. |
Description
The following code:
Resulted in this output:
But I expected this output instead:
There were some rare cases where the script would exit with code 255 with no errors.
PHP Version
PHP 8.2.7 (cli) (built: Jun 13 2023 23:05:53) (NTS)
Operating System
Debian GNU/Linux 12
The text was updated successfully, but these errors were encountered: