Usage of pthread in a Node.js app never lets it exit #12801
Comments
Is this supposed to work without |
@emaxx-google Yes. |
I'm not an Emscripten expert, but the same page describes in a bit more detail:
@emaxx-google That's not the issue here. I already know how pthreads work in Emscripten :) What that paragraph is describing is the reason why you need |
Try building with |
@juj Hmm it does, but that's not necessary in non-pthread builds. Why is adding pthread different?
[semi-answering my own question] IIRC the difference is that |
The behavior is likely caused by this line: Line 441 in 5568b81
-s EXIT_RUNTIME=1 . Does that also do a forcible process.exit()?
You can also try running |
Yeah, they're equivalent AFAIK: Line 197 in c2462cd
That should be the only difference with respect to process teardown in st and mt builds. If in that mode there is a forcible shutdown with process.exit() happening, then it should also be happening in non-multithreaded builds. Not sure where such a forcible shutdown is happening, though Emscripten at least does not emit a call to process.exit() API into the generated build. |
I actually now wonder if this is "expected behaviour"... if issue is caused by hanging Workers, then there are only two choices:
I do think that it's possible to get a middle ground by looking into those |
So yeah, I can reproduce by only loading the const {Worker} = require('worker_threads');
let w = new Worker('./temp.worker.js'); This also hangs forever, probably because Worker still waits for messages. |
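For contrast with the hanging snippet above, here is a minimal sketch of what `.unref()` from the Node.js `worker_threads` API does for an idle worker. The inline worker source and the `spawnIdleWorker` helper are made up for illustration and are not part of Emscripten; the worker only listens for messages, which is roughly what an idle pool worker does.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical idle worker: it only waits for messages and never exits on its own.
const idleWorkerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', () => {});
`;

// Hypothetical helper: spawn the idle worker and optionally unref it.
function spawnIdleWorker({ keepProcessAlive }) {
  const worker = new Worker(idleWorkerSource, { eval: true });
  if (!keepProcessAlive) {
    // An unref'd worker no longer prevents the parent from exiting;
    // Node.js stops it when the main thread has nothing else left to do.
    worker.unref();
  }
  return worker;
}

// With keepProcessAlive: true this script would hang like the snippet above.
const worker = spawnIdleWorker({ keepProcessAlive: false });
```

Calling `worker.ref()` later reverses the effect, so the same worker can be switched between the two states as needed.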
@addaleax I admit I don't fully understand the Basically, the question is - is there a way to make sure that Workers are torn down once the Right now it looks like a reference cycle keeping both main thread and the Worker alive. |
No, that never happens. Worker objects cannot be GC’ed before they are terminated because they are GC roots because they receive events from the event loop.
If you call
That sounds very likely, yes, but I wouldn’t know what we could do about this on the Node.js side. |
Huh. But that's not how they work in browsers? Can't Workers be tied to the |
@RReverser What makes you think that browsers behave differently? I think the behavior you want would require cross-thread heap reference tracking, which doesn’t exist.
I think I saw it somewhere, but I'll need to dig to find the reference.
Right, but if the parent thread has exited (which it can e.g. in my last example without Or, more generally, when Worker object is collected, it can send a notification to the actual worker thread to stop listening? |
@RReverser If the parent thread exits, that also forcibly stops all child threads, there’s no unreferencing possible or necessary.
Just to reiterate, Worker objects do not get collected until they are actually stopped because they can generate events at any point, just like network sockets and other event loop objects. |
Hmm, then why is Node still hanging in the example above? |
@RReverser Because the parent doesn’t exit, because the Worker is still there and can emit events. |
But parent is not subscribed to Worker events in this case? To clarify, I'm referring to
Ahh, I see. I guess that’s a fair point then … in Node.js, what currently happens when there’s an uncaught exception inside a Worker is that an Would it be acceptable to explicitly use |
I don't know... I mean, it's certainly possible in some simple cases like above to try and find places where explicit In cases like Emscripten's it's a lot harder, because there is an actual reference cycle (main thread and a Worker waiting for each other) which traditionally can only be fixed with a GC (or hard termination). |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
I'm pretty new to Emscripten and JavaScript altogether, but I also ran into this problem. I am compiling a library with no main function, and I have a JavaScript file with some Jest tests in it. Once all my tests have run and I am certain all my threads have finished executing, I call PThread.terminateAllThreads, which allows Node to exit. The code looks like this: I'm not sure about the safety of doing this, but it has worked for me so far.
Does adding |
Working on yet another project where this bites me and also had to apply Mocha workaround like @wheresthecode. @sbc100 Unfortunately, in library case like @wheresthecode's EXIT_RUNTIME doesn't help because rather than using However, I no longer think Emscripten can fix this on its side, since it has no way of knowing whether the library will be called into again, and, thus, whether it will need to use threads from the pthread pool again :( |
wasm-vips also had this issue on non-browser environments. It currently exposes a shutdown function that calls |
In that case perhaps the library could require
I think I might have been convinced otherwise and this is actually something that can be fixed on Emscripten side; not 100% sure yet, but going to experiment in the next couple of days. |
While working on PR #18201, I noticed that using Wasm Workers (the Somehow (perhaps due to the use of emscripten/test/core/test_stdio_locking.c Lines 80 to 86 in 7307f41
So, an alternative route is to implement the pthread API on top of Wasm Workers as discussed in #12833 (comment). |
Right, it doesn't deadlock because the API requirements are different from pthreads. Pthreads depend on being able to spawn threads synchronously and block on shared memory, so the Emscripten implementation has to make that possible too, but Wasm Workers don't have that compat restriction, as it's a brand-new API and can require it to be async. #9910 would have a similar effect if implemented. Anyway, I've been working on this and am going to send a PR today that fixes this issue for the most common use cases, with some others either being fixed later or, for now, left to end users.
Great, thanks for doing this! It would indeed be great if issue #9910 is also addressed in the future. There's now a |
@brendandahl, another use case of |
To be precise, #9910 is, this issue itself is not relevant to Asyncify. |
Fixes #12801 for the majority of cases.

This is a relatively simple change, but it took embarrassingly many attempts to get it in the right places for all the obscure tests to pass, to figure out which tests can make use of it instead of doing a manual exit, and, as a bonus, to debug some apparent differences in Node Worker GC behaviour between Windows and Linux.

I tried two approaches in parallel: a conservative one in this PR, and one that brings Emscripten behaviour closer to native in a separate branch.

In the ideal scenario, I wanted to make Node.js apps behave just like native, where background threads themselves don't keep an app open, and instead the app lives as long as it explicitly blocks on `pthread_join` or other blocking APIs. However, that is a more disruptive change that still requires more work and testing, as some Emscripten use cases implicitly depend on the app running despite not having any more foreground work to do. One notable example is `PROXY_TO_PTHREAD`, which spawns a detached thread but obviously wants the app to continue running. All those cases are fixable but, as said above, require more work, so I'm keeping that approach aside for now.

Instead, in this PR I'm adding a .ref/.unref "dance" (h/t @d3lm for the original idea) that keeps the app alive as long as _any_ pthreads are running, whether joinable or detached, and whether you explicitly block on them or not.

It works as follows:

- Upon creation, all pool workers are strongly referenced, as we need to wait for them to be properly loaded.
- Once a worker is loaded, it's marked as weakly referenced, as we don't want idle workers to prevent the app from exiting.
- Once a worker is associated with a pthread and starts running it, it's marked as strongly referenced so that the app stays alive as long as it's doing some work.
- Once a worker is done and returned to the idle worker pool, it's weakly referenced again.

This ensures maximum compatibility while fixing the majority of common cases.
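The four ref/unref steps described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the `PoolSketch` class, its method names, and the `makeFakeWorker` stand-in are all hypothetical, and the only real API assumed is that Node.js workers expose `ref()` and `unref()`.

```javascript
// Hypothetical sketch of the ref/unref transitions; not Emscripten's actual code.
class PoolSketch {
  constructor() {
    this.idleWorkers = [];
    this.runningWorkers = [];
  }

  // Step 1: a freshly created worker stays strongly referenced while it loads.
  // (Node.js workers are referenced by default; ref() just makes that explicit.)
  create(worker) {
    worker.ref();
    return worker;
  }

  // Step 2: once loaded, an idle worker must not keep the process alive.
  markLoaded(worker) {
    worker.unref();
    this.idleWorkers.push(worker);
  }

  // Step 3: a worker that starts running a pthread keeps the process alive.
  startThread() {
    const worker = this.idleWorkers.pop();
    if (!worker) return null; // no idle worker available in this sketch
    worker.ref();
    this.runningWorkers.push(worker);
    return worker;
  }

  // Step 4: a finished worker returns to the idle pool, weakly referenced again.
  finishThread(worker) {
    const index = this.runningWorkers.indexOf(worker);
    if (index !== -1) this.runningWorkers.splice(index, 1);
    worker.unref();
    this.idleWorkers.push(worker);
  }
}

// Stand-in for a worker_threads Worker that only records its ref state.
function makeFakeWorker() {
  return {
    referenced: true,
    ref() { this.referenced = true; },
    unref() { this.referenced = false; },
  };
}

const pool = new PoolSketch();
const fake = pool.create(makeFakeWorker());
pool.markLoaded(fake);
const busy = pool.startThread();
const referencedWhileRunning = busy.referenced;
pool.finishThread(busy);
const referencedWhenIdle = busy.referenced;
```

The fake worker keeps the sketch runnable without spawning threads; with real Node.js workers the same calls decide whether the event loop stays alive.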
One use case it doesn't fix is when a C/C++ app itself has an internal singleton thread pool (like the one created by glib). In this case there's no way for Emscripten to know that those "running" threads are actually semantically idle. This would be fixed by the more rigorous alternative implementation mentioned above, but for now such use cases can be worked around relatively easily with a bit of custom `--pre-js` that goes over all `PThread.runningWorkers` and marks them as `.unref`d. That's what I did in an app I'm currently working on, and it works pretty well. To avoid reaching into JS internals, we might consider adding an `emscripten_`-prefixed API to allow referencing/unreferencing a Worker via a `pthread_t` instance from the C code, but for now I'm leaving that out of scope of this PR.
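A rough sketch of what such a `--pre-js` workaround could look like is below. The `unrefRunningWorkers` helper and the `fakePThread` object are hypothetical; in a real build one would presumably pass Emscripten's `PThread` object, and when to call the helper (for example, after the app's internal thread pool has started) is an assumption that depends on the app.

```javascript
// Hypothetical helper: mark every currently running worker as unref'd so that
// semantically idle pool threads do not keep the Node.js process alive.
function unrefRunningWorkers(pthreadState) {
  let unreffed = 0;
  for (const worker of pthreadState.runningWorkers) {
    // unref() exists on Node.js workers; browser Workers do not have it.
    if (typeof worker.unref === 'function') {
      worker.unref();
      unreffed++;
    }
  }
  return unreffed;
}

// Usage with a stand-in object shaped like the `runningWorkers` list.
const fakePThread = {
  runningWorkers: [
    { referenced: true, unref() { this.referenced = false; } },
    {}, // e.g. a browser-style worker without unref()
  ],
};
const count = unrefRunningWorkers(fakePThread);
```

Taking the state object as a parameter keeps the helper independent of how the build exposes its internals.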
There is something wrong with the way Emscripten uses threads on Node.js. Looks like any usage prevents Node app from ever exiting.
Just tried the simplest example:
Compiled with:
This results in:
Looking at https://nodejs.org/api/worker_threads.html, I suspect Emscripten needs to call `.unref()` to indicate that it's okay to exit as soon as the reference is unreachable, but probably doesn't do that yet?