[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209

rmunn · 2022-10-20T14:12:50Z

Context:

Playwright Version: 1.20.0-alpha-feb-23-2022
Operating System: Linux
Node.js version: v16.10.0
Browser: Chromium
Extra: Using docker-compose to launch webserver

Config Snippets

Playwright config:

const config: PlaywrightTestConfig = {
  // ...
  webServer: {
    command: 'docker-compose -f docker/docker-compose.yml up app-for-e2e',
    port: 3238,
    timeout: 10 * 1000,
    reuseExistingServer: !process.env.CI,
  }
};

docker-compose.yml:

app-for-e2e:
  # ...
  container_name: app-for-e2e
  platform: linux/amd64
  ports:
    - 3238:80

Describe the bug

We use a Docker container to launch our dev webserver and run Playwright tests against it. The docker-compose up command responds to SIGINT and SIGTERM by gracefully terminating the container(s) that it launched, but if it receives SIGKILL then it cannot terminate the containers, and they stay running. At the moment, the WebServer class does not provide any way to gracefully close the launched process, instead using the killAndWait function from processLauncher.ts, which sends SIGKILL to the process without ever trying SIGTERM.

The result is that after I run npx playwright test, the app-for-e2e container is still running, because Playwirght never attempted to send SIGTERM to the process and just sent SIGKILL instead. I then have to stop the app-for-e2e container manually, since I don't want it running and taking up system resources when I'm not using it.

Proposed solution

To solve this bug, I would suggest changing killProcess in processLauncher.ts to take a boolean argument gracefully, defaulting to false for backwards compatibility. If gracefully is true, send SIGTERM rather than SIGKILL on Unix-like platforms (Linux and MacOS), or on Windows, run taskkill without the /F flag. Add the same parameter to killProcessAndCleanup and killAndWait. Then change the this._killProcess() method in the WebServer class to look something like this:

this._killProcess = async () => {
  const _hasExited = async () => launchedProcess.exitCode != null;
  const killTimeout = this.config.timeout || 60 * 1000;
  kill(true);
  const cancellationToken = { canceled: false };
  const { timedOut } = (await Promise.race([
    raceAgainstTimeout(() => waitFor(_hasExited, 100, cancellationToken), killTimeout),
    this._processExitedPromise,
  ]));
  cancellationToken.canceled = true;
  if (timedOut)
    kill(false);
};

Additional Info

This is #12299 reopened, since #12299 was closed without a good reason for closing it. (The reasons given were "It looks like this issue only has a handful of upvotes, has not been touched recently and/or we lack sufficient feedback to act on it", none of which are good reasons to close a bug without discussion in the issue first. If a bug is going to be closed for any of those reasons, a comment to that effect should be made in the bug itself first. Closing bugs with no discussion like that is bad issue-management practice.)

The text was updated successfully, but these errors were encountered:

thenbe · 2022-10-20T14:27:27Z

Possibly related: #18041

My current workaround is to manually detect if a process has been started by playwright, then use tree-kill to kill it in GlobalTeardown.

pavelfeldman · 2022-10-20T15:24:20Z

I'm reading docker docs and it seems like docker compose itself is sending SIGTERM to the containers to gracefully kill them: https://docs.docker.com/compose/faq/#:~:text=Compose%20stop%20attempts%20to%20stop,they%20receive%20the%20SIGTERM%20signal. It probably does the same thing when you send SIGTERM to docker compose, or it happens automatically if they are a part of the same process tree.

This is a known anti-pattern when reliably managing dependent process trees - it is prone to dangling and zombie processes. Any process can die at any moment, you can't rely on SIGTERM solely to process the cleanup code. I.e. we can support graceful termination, but they must ensure hard kill still works. It looks like a docker bug to me. Even with your proposed fix, clicking Ctrl+C twice would allow container processes to survive, there is no guarantee they will die.

rmunn · 2022-10-20T16:15:56Z

A quick note on Linux signal handling in case someone reading this is unfamiliar with it: processes are allowed to handle most signals however they want, and shutting down when you receive SIGTERM is only a (universally-followed) convention: you could handle SIGTERM and do nothing with it, though that would be considered a bug in nearly every case. But processes cannot handle SIGKILL. When a process is sent SIGKILL, the Linux kernel immediately kills the process, giving it no time to do any cleanup. Which is why the pattern on Linux is usually to send SIGTERM first (which by convention means "do any necessary cleanup, then shut yourself down"), then wait a configurable number of seconds before sending SIGKILL if the process hasn't terminated in a timely fashion.

So yes, docker-compose needs to have something in place to handle cases where the docker-compose process dies unexpectedly (or is sent a SIGKILL). But bugs in other software are no excuse for Playwright not doing the right thing, and sending SIGTERM first in order to allow webserver processes to do whatever cleanup they might find necessary. I experienced this issue with docker-compose, but there are many other scenarios where a webserver process might have some cleanup (deleting temporary files, etc.) to do, and sending SIGKILL first without ever attempting SIGTERM doesn't allow that cleanup to ever happen.

We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes microsoft#18209

We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes #18209

domvo · 2023-02-06T23:44:18Z

This should be reopened, since the initial fix was reverted.

rmunn · 2023-02-06T23:47:13Z

I'd reopen it, but I don't have permissions to do so on this repo. Any maintainers want to reopen this, or would you prefer I open a new issue linking to this one?

alexis-aquanty · 2023-11-10T20:24:09Z

The fact that the webserver is killed instead of terminating gracefully seems to break any tools that rely on Node V8 coverage of the webserver process, since coverage data is written when the process terminates (but not when it's killed). This breaks code coverage using c8, which is important for us. I don't know why the fix for this issue was correctly implemented and then reverted. SIGKILL should never be the default.

dangoodman · 2024-03-22T09:53:09Z

any update?

Note that this change is also a workaround for: microsoft/playwright#18209

rmunn · 2024-04-25T04:15:20Z

This is a quick comment to keep this issue open, since apparently the Playwright team closes issues without warning when they haven't had much recent activity.

Psst, Playwright team: it would be nice if you would drop a comment on stale issues first, say 14 days before closing them, saying "This issue has not had any recent activity and will be closed in 14 days". That way people wouldn't feel the need to add comments like this one, which is essentially spam but I don't know any other way to be sure this issue will be kept open. After all, it isn't fixed yet. If you're closing issues for not having recent activity, then "bump" comments like this are the only way, apparently, to make sure an issue that hasn't been fixed stays open.

swersk · 2024-08-16T10:52:23Z

Following this issue!

rmunn mentioned this issue Oct 20, 2022

[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #12299

Closed

dgozman assigned aslushnikov Oct 20, 2022

dgozman added the v1.28 label Oct 20, 2022

aslushnikov mentioned this issue Oct 20, 2022

feat: send SIGTERM to webserver before SIGKILL'ing it. #18220

Merged

aslushnikov added a commit to aslushnikov/playwright that referenced this issue Oct 20, 2022

feat: send SIGTERM to webserver before SIGKILL'ing it.

f21d439

We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes microsoft#18209

aslushnikov added a commit to aslushnikov/playwright that referenced this issue Oct 20, 2022

feat: send SIGTERM to webserver before SIGKILL'ing it.

0e401ca

We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes microsoft#18209

aslushnikov closed this as completed in #18220 Oct 21, 2022

aslushnikov added a commit that referenced this issue Oct 21, 2022

feat: send SIGTERM to webserver before SIGKILL'ing it. (#18220)

c63a0b5

We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes #18209

benmccann mentioned this issue Nov 17, 2022

[BUG] regression in killing web server process in 1.28.0 #18865

Closed

domvo mentioned this issue Feb 7, 2023

[Feature] Send SIGTERM instead of immediate SIGKILL to webServer child process #20705

Closed

mxschmitt reopened this Feb 7, 2023

mxschmitt unassigned aslushnikov Feb 7, 2023

mxschmitt added the feature-test-runner Playwright test specific issues label Feb 7, 2023

cenfun mentioned this issue Jan 6, 2024

[Question]: does it support server side code coverage? cenfun/monocart-coverage-reports#1

Closed

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(test): only start servers when needed

07cf987

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem mentioned this issue Mar 29, 2024

refactor(e2e): only start servers when needed AmadeusITGroup/AgnosUI#674

Merged

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

b3eed7d

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

b4a066e

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

7165275

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

0d4a94b

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

ef9163a

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

1bdb16f

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

c861750

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

444b047

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

52fc721

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

fc0d4e4

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

d4bc220

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

73fa1e3

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

30ddb1d

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Mar 29, 2024

refactor(e2e): only start servers when needed

b743ac5

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

1c578b3

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

14148df

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

5115adf

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

f751732

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

c2919dc

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

abdfeed

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to divdavem/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed

a75b80c

Note that this change is also a workaround for: microsoft/playwright#18209

divdavem added a commit to AmadeusITGroup/AgnosUI that referenced this issue Apr 2, 2024

refactor(e2e): only start servers when needed (#674)

e88f06b

Note that this change is also a workaround for: microsoft/playwright#18209

Martinsos mentioned this issue Apr 4, 2024

Playwright tests & CI wasp-lang/open-saas#69

Merged

Skn0tt mentioned this issue Nov 6, 2024

feat(webserver): customize shutdown with new kill option #33379

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209

[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209

rmunn commented Oct 20, 2022

thenbe commented Oct 20, 2022 •

edited

Loading

pavelfeldman commented Oct 20, 2022

rmunn commented Oct 20, 2022

domvo commented Feb 6, 2023

rmunn commented Feb 6, 2023

alexis-aquanty commented Nov 10, 2023

dangoodman commented Mar 22, 2024

rmunn commented Apr 25, 2024

swersk commented Aug 16, 2024

[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209

[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209

Comments

rmunn commented Oct 20, 2022

thenbe commented Oct 20, 2022 • edited Loading

pavelfeldman commented Oct 20, 2022

rmunn commented Oct 20, 2022

domvo commented Feb 6, 2023

rmunn commented Feb 6, 2023

alexis-aquanty commented Nov 10, 2023

dangoodman commented Mar 22, 2024

rmunn commented Apr 25, 2024

swersk commented Aug 16, 2024

thenbe commented Oct 20, 2022 •

edited

Loading