console.log failure while working with worker threads. #30491
Comments
As a temporary solution, send logs to the main and log them there. |
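A minimal sketch of that workaround, assuming the worker posts formatted lines over `parentPort` and the main thread prints them. The `log` helper, the `{ type: 'log', line }` message shape, and `runWorkerWithForwardedLogs` are hypothetical names for illustration, not Node.js APIs. As later comments in this thread point out, this only helps while the main thread's event loop is free to process messages.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  const { format } = require('node:util');
  // Hypothetical helper: post a formatted line instead of calling console.log.
  const log = (...args) =>
    parentPort.postMessage({ type: 'log', line: format(...args) });
  log('worker started, %d items', 3);
  log('worker finished');
`;

// Starts the worker, prints every forwarded line on the main thread, and
// resolves with the collected lines once the worker exits.
function runWorkerWithForwardedLogs(source) {
  return new Promise((resolve, reject) => {
    const lines = [];
    const worker = new Worker(source, { eval: true });
    worker.on('message', (msg) => {
      if (msg && msg.type === 'log') {
        lines.push(msg.line);
        console.log('[worker] ' + msg.line);
      }
    });
    worker.on('error', reject);
    worker.on('exit', () => resolve(lines));
  });
}

runWorkerWithForwardedLogs(workerSource)
  .then((lines) => console.log('%d lines forwarded', lines.length));
```

Tagging messages with a `type` field keeps log traffic separate from whatever else the worker sends over the same port.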
Thanks for answering. |
Fwiw, this is how The issue happens because when the main thread is busy, then it also won’t process such data coming from the Workers. |
I can confirm that. Sending the logs to the main thread does nothing while the main thread is busy. |
That sounds like a good idea. I usually try not to block the main thread at all in those scenarios, to make sure it remains responsive. In theory, we could allow threads to do I/O directly, but it will be tricky and might lead to unwanted side effects. |
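One way to keep the main thread responsive, sketched under the assumption that the heavy work can be split into chunks: yield to the event loop between chunks so that messages from workers (including their forwarded stdio data) get a chance to be processed. `sumInChunks` and its `chunkSize` parameter are hypothetical and stand in for whatever long-running loop the main thread executes.

```javascript
const { setImmediate: yieldToEventLoop } = require('node:timers/promises');

// Sums the integers 0..n-1 in chunks, yielding to the event loop after each
// chunk so pending I/O and worker messages can be handled in between.
async function sumInChunks(n, chunkSize = 1_000_000) {
  let total = 0;
  for (let start = 0; start < n; start += chunkSize) {
    const end = Math.min(start + chunkSize, n);
    for (let i = start; i < end; i++) {
      total += i;
    }
    await yieldToEventLoop();
  }
  return total;
}

sumInChunks(5_000_000).then((total) => console.log('total:', total));
```

Smaller chunks mean more frequent opportunities to flush worker output, at the cost of some scheduling overhead.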
how about |
@devsnek What would that do and how would it work? |
@addaleax instead of sending internal messages to the main thread, it would just directly write to stdout/err. messages might be garbled on occasion, but sometimes that isn't a problem. |
@devsnek Libuv won’t let you create multiple handles for the same fd, plus writing synchronously would be problematic for the same reasons for which it is on the main thread (and |
hmm, aren't they separate event loops? |
They are separate event loops – see #30507 for example crashes when sharing |
but it isn't inherently impossible... you just have to make sure you aren't renumbering, closing, etc. |
I have a question; I don't know a lot when it comes to the inner workings of Node.js. But couldn't a console.log from a worker thread be pushed into the event loop of the main thread? That way it should be processed at some point since it's in the queue, even if it's late or at the end, still better than not getting them at all. |
If we really want to do that, it might make more sense to have some mutex/locking mechanism to prevent concurrent writes. That would ensure that output from
I think that is the current behavior, isn't it? |
mutex is an interesting idea. I guess we'd have to specialise it to our stdio handles? |
I don't know, that's why I'm asking. But if it is the current behavior, then there is an issue, no? If it gets pushed into the event loop, then it has to be processed at some point. It can't be freed from the event loop without execution, else this would mean the event loop is totally unreliable. |
Yes, but that is assuming that we would prefer allowing worker threads to use I/O directly instead of the current behavior. I hope that @addaleax can shine some light on this.
It depends on what the main thread is doing, if it is busy running synchronous JavaScript code, then it will never have a chance to process messages. If you have some code for us to reproduce the problem, that would be helpful. |
@tniessen I’m not entirely sure how the mutex approach would work… it seems like that’s something that might still require moving stdio to a separate thread if we really want it to always be available, and then post messages to that thread (which can be just a C++ thread, not a |
If it's just about writing to On the other hand, I think that the current behavior is reasonable, and it is a fair restriction to only write from the main thread. I guess worker threads shouldn't really use |
FWIW, I don't believe this is actually the behavior being used, at least not as of 12.9.0 (on macOS). I just ran a test app that ran an interval on the main thread to output a console.log, and then put console.log lines on the onmessage handler inside the worker thread from the parentPort, and on the onmessage handler in the main thread from the worker thread, and then some console.logs inside the worker thread as it does some work, and reports progress out via the port along the way. The interval regularly reported (expected -- basically nothing's going on on the main thread here). If it was just a matter of keeping the main thread unblocked, I'd totally understand that limitation in behavior, but that doesn't appear to be the case at all here -- you need both the worker thread message pump to be free before it can send the console messages back to the main thread, which then ALSO needs its message pump to be free before it can process them. So, I'm going to end up having to basically trap console.log on my worker thread and pipe it back over the message port to the parent thread if I want real-time console reporting of progress (while debugging, etc.) |
@deregtd I can’t reproduce that behaviour based on your description – would you mind sharing the code you used? |
Sorry it took so long to get around to this repro. Life’s been busy…
consoleissue.js:
const path = require('path');
const wt = require('worker_threads');
const worker = new wt.Worker(path.resolve(path.join(__dirname, 'consoleissue-worker.js')));
worker.on('message', msg => {
  console.log('worker response: ' + JSON.stringify(msg));
});
console.log('posting');
worker.postMessage({ dostuff: true });
console.log('posted');
consoleissue-worker.js:
const wt = require('worker_threads');
wt.parentPort.on('message', (msg) => {
  console.log('starting');
  let total = 0;
  for (let i = 0; i < 50000000; i++) {
    total += Math.sin(i);
    if ((i % 5000000) === 0) {
      console.log(i);
      wt.parentPort.postMessage({ msg: i.toString() });
    }
  }
  console.log('finished - ' + total);
  wt.parentPort.postMessage({ done: true });
});
Run
If the console log worked as theoretically stated in here, the worker responses should be right alongside the normal console.logs of the numbers, but instead the console.logs from the worker thread all show up at the same time when the worker thread returns control to the event loop. |
Any calls that send to |
And a copious amount of yak shaving:
* I had to reimplement console logging because it's apparently a known bug that it does not work that well on worker threads (nodejs/node#30491)
* I had to add a GYP config file because reimplementing logging is apparently impossible without adding native code
* I had to change arena.js debug logging to not rely on the percent interpolation functionality in console.log which is apparently not available anywhere else
...and the rwlock may not even work. I'll still need to add a stress test.
Any updates? |
@addaleax In #44710 @JakobJingleheimer and I have been working on getting the ESM loaders processing to happen in a worker thread, and we’ve run into this issue, that import { writeFile } from 'node:fs/promises'
await writeFile('/dev/fd/1', 'some message I want printed to stdout') And this works, even while the main thread is frozen. So I think @devsnek’s idea should be possible. As you might’ve guessed, I’m on a Mac, so I assume this will probably have difficulty in Windows. Is there an equivalent to More generally, could we provide something like a |
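A related sketch, offered as an assumption rather than anything proposed in this thread: instead of opening `/dev/fd/1` by path, a thread can write synchronously to file descriptor 1 with `fs.writeSync`, which does not depend on the `/dev/fd` filesystem. `logSync` is a hypothetical helper, not a Node.js API. Synchronous writes can fail (for example with `EAGAIN` when stdout is a non-blocking pipe), and lines written from several threads at once may interleave.

```javascript
const fs = require('node:fs');
const { format } = require('node:util');

// Hypothetical helper, not a Node.js API: format the arguments the way
// console.log would, then write synchronously to file descriptor 1
// (stdout by convention), bypassing the stream forwarded via the main thread.
function logSync(...args) {
  const line = format(...args) + '\n';
  // fs.writeSync returns the number of bytes written.
  return fs.writeSync(1, line);
}

logSync('progress: %d%%', 50);
```

This is best kept as a debugging aid, since it blocks the calling thread until the write completes.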
@GeoffreyBooth They do print, just asynchronously (which can also happen with the main thread, to be clear). Is the main thread blocked in this scenario? If so, it might be worth thinking a bit about why that would be necessary.
Yeah, I think that’s expected. For debugging this should be totally fine.
IIUC what @devsnek was suggesting was to make stdio handles in the main thread and worker threads work the same way and refer to the same fds. That’s what doesn’t work. I would be very reluctant to go the way of using
You could pass
If we had this, it should probably just be
I do think it’s a good thing that From a very practical point of view, I think something that could address the issue that @deregtd brought up above would be to allow for a certain amount of data to be written without the other side acknowledging it, similar to how stdio works in other contexts. That doesn’t address the case in which the main thread event loop is blocked, but it partially addresses the one in which the worker thread event loop is blocked. |
my suggestion was that console methods should coordinate natively using a mutex or smth, instead of queuing up on the main thread to be processed. i don't think we really need to worry that much about fairness or anything. |
Agreed, I'm also not worried about fairness between threads. |
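To make the mutex idea concrete, here is a userland sketch, assuming a lock built from `SharedArrayBuffer` and `Atomics` and direct writes to file descriptor 1. The suggestion above is about native coordination inside Node.js; this is not how Node.js implements console output. `lock`, `unlock`, `writeLocked`, and `lockCell` are hypothetical names. There is no fairness guarantee, which matches the comments above.

```javascript
const fs = require('node:fs');

// One shared 32-bit cell: 0 = unlocked, 1 = locked. In a real program the
// SharedArrayBuffer would be passed to each Worker (for example via
// workerData) so that all threads operate on the same cell.
const lockCell = new Int32Array(new SharedArrayBuffer(4));

function lock(cell) {
  // Try to flip 0 -> 1; if another thread holds the lock, sleep until notified.
  while (Atomics.compareExchange(cell, 0, 0, 1) !== 0) {
    Atomics.wait(cell, 0, 1);
  }
}

function unlock(cell) {
  Atomics.store(cell, 0, 0);
  // Wake one waiting thread, if any.
  Atomics.notify(cell, 0, 1);
}

// Hypothetical helper: write several chunks as one uninterrupted unit.
function writeLocked(cell, chunks) {
  lock(cell);
  try {
    for (const chunk of chunks) {
      fs.writeSync(1, chunk);
    }
  } finally {
    unlock(cell);
  }
}

writeLocked(lockCell, ['worker 1: ', 'step done', '\n']);
```

The lock only matters when one logical message spans several writes; it keeps chunks from different threads from being interleaved.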
Hello everyone.
Thank you for your hard work on NodeJS.
I developed a small service that takes 4 CSV files, parses them, maps them together and imports them into Elasticsearch.
Each file is parsed on a different thread.
The parsed content of one of the files is sent via an event to a different file, which spawns a new thread for each set of data to import that set into ES.
In parallel, on the main thread, I send the content of one of the files in chunks, via an event and together with the contents of the remaining 2 files, to yet another script,
which spawns a new thread for each chunk of data. That thread maps the given data to the chunk provided, if they match, and sends the mapped data back to the main thread, which in turn spawns a new thread to import the mapped data into ES.
The issue I have is that once everything is working at the same time, the only console.logs I get are the ones from the main thread. Everything that is logged on a worker thread gets lost somewhere while the main thread is under load.
Note: the actual code is processed as it should be; only the console.log output goes missing.
This makes debugging on worker threads really difficult. Maybe I am missing something.