Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setImmediate / setTimeout with delay 0 seems blocking the main thread #3333

Closed
andrasat opened this issue Jun 16, 2023 · 12 comments · Fixed by #6674
Closed

setImmediate / setTimeout with delay 0 seems blocking the main thread #3333

andrasat opened this issue Jun 16, 2023 · 12 comments · Fixed by #6674
Labels
bug Something isn't working

Comments

@andrasat
Copy link

What version of Bun is running?

0.6.9

What platform is your computer?

Linux 6.1.31-2-MANJARO x86_64 unknown

What steps can reproduce the bug?

So ive been playing with setImmediate and setTimeout in this repo, and try make a simple server with endpoints:

  • /ping
  • /blocking
  • /not-blocking

/blocking and /not-blocking has the same cpu intensive task, with difference that /blocking is sync, and /not-blocking is async.

Steps:

  • run server
bun start:bun // bun

bun start:node // node
npm run start:node // node
  • hit /ping every second with this command
while true; do date; curl 127.0.0.1:3030/ping; sleep 1; echo; done;
  • in another terminal, hit /not-blocking
curl 127.0.0.1:3030/not-blocking

In node, they are working as expected, when /not-blocking is still running, i can still receive response from /ping endpoint. But in bun, /not-blocking is blocking and /ping response is returned after the task is finished.

What is the expected behavior?

  • hit /not-blocking first, then /ping
  • /ping should return { "data": "PONG" } immediately without waiting for /not-blocking to finish

What do you see instead?

  • hit /not-blocking first, then /ping
  • /ping is waiting for /not-blocking finish before returning { "data": "PONG" }

Additional information

No response

@andrasat andrasat added the bug Something isn't working label Jun 16, 2023
@andrasat andrasat changed the title setImmediate / setTimeout seems not running async setImmediate / setTimeout seems blocking the main thread Jun 21, 2023
@simylein
Copy link
Contributor

For me this actually segfaults at the moment with

curl: (52) Empty reply from server

and

BUN:GET:/ping:2023-07-15T18:21:44.126Z
BUN:GET:/not-blocking:2023-07-15T18:21:44.298Z
Segmentation fault: 11

I am running bun 0.6.14 on Darwin 22.5.0 arm64 arm

@andrasat
Copy link
Author

andrasat commented Jul 16, 2023

UPDATED:
@simylein I got segfaulted to in my m1 pro lol, Darwin 22.5.0 arm64 arm, and bun version 0.6.14, even in /blocking endpoint, it seems there is some problem with crypto module?

-- Screenshot --
Screenshot 2023-07-16 at 18 17 05

Hmm that's interesting @simylein , I haven't had any issues with segfaults tho, with 0.6.14 and running bun-debug in 0.6.15-debug (in my WSL 2 desktop)

Anyway I tried again today with the latest commit in main (0.6.15-debug), it might be related to issue #3226, but when I tried using other than 0 delay time in setTimeout, /not-blocking actually not blocking other endpoints, although it finished slower, i adjusted the iteration to be lower so it can finished quickly.

In bun-debug, there are some differences in logs when running setTimeout 1 and setTImeout 0 or setImmediate

  • setTimeout 0 or setImmediate
BUN:GET:/not-blocking:2023-07-16T09:47:00.906Z
[SYS] write(1, 47) = 47
[FileSink] Wrote 47 bytes (fd: 1, head: 0, 0/913)
[RequestContext] onResolve
[RequestContext] render
[RequestContext] render
[RequestContext] finalize (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@58b50550000)
[RequestContext] finalizeWithoutDeinit (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@58b50550000)
[RequestContext] finalizeWithoutDeinit: response_jsvalue != .zero
[RequestContext] finalizeWithoutDeinit: request_body != null
[RequestContext] finalizeWithoutDeinit: this.pathname.len  0 null
[Server] deinitIfWeCan
[RequestContext] deinit (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@58b50550000)
[RequestContext] create (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@58b50550000)
[bun] poll(1) writable: true (4)

-- Screenshot --
setTimeout0

  • setTimeout 1
BUN:GET:/not-blocking:2023-07-16T09:51:03.473Z
[SYS] write(1, 47) = 47
[FileSink] Wrote 47 bytes (fd: 1, head: 0, 0/78)
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
.....
.....
.....
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
[RequestContext] onResolve
[RequestContext] render
[RequestContext] render
[RequestContext] finalize (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@45d2e550000)
[RequestContext] finalizeWithoutDeinit (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@45d2e550000)
[RequestContext] finalizeWithoutDeinit: response_jsvalue != .zero
[RequestContext] finalizeWithoutDeinit: request_body != null
[RequestContext] finalizeWithoutDeinit: this.pathname.len  0 null
[Server] deinitIfWeCan
[RequestContext] deinit (src.bun.js.api.server.NewRequestContext(false,true,src.bun.js.api.server.NewServer(false,true))@45d2e550000)

-- Screenshot --
setTimeout1

@andrasat andrasat changed the title setImmediate / setTimeout seems blocking the main thread setImmediate / setTimeout with delay 0 seems blocking the main thread Jul 16, 2023
@fubhy
Copy link

fubhy commented Sep 11, 2023

effect-ts is also affected by this (Effect-TS/io#662)

@ThePrimeagen
Copy link

Here is a simple code example to see it happen


function test() {
    console.log('11111');
    setTimeout(test, 0);
}

function test2() {
    console.log('22222');
    setTimeout(test2, 1);
}

test();
test2();

run bun test.js or node test.js to see the difference

@orangeswim
Copy link

orangeswim commented Sep 12, 2023

Troubleshooting the issue.

I modified the test code to

function repeat(log, delay, count) {
  if (count <= 0) return;
  console.log(log);
  setTimeout(() => repeat(log, delay, count - 1), delay);
}

const timesToRun = 5;
repeat("1", 0, timesToRun);
repeat("2", 1, timesToRun);

I added debug statements in fifo.zig and the events are getting queued in an unexpected order, instead of the expected alternating messages. There seems to be some race condition since adding the extra commands above is changing the order of the queue. In the original test code by @ThePrimeagen , one function always beat the other.

Output

/home/orange/bun/bun/packages/debug-bun-linux-x64/bun-debug test.js
[SYS] read(3, 4096) = 4096 (0.052ms)
[SYS] close(3)
[fs] openat(10, /home/orange/bun/bun/package.json) = 11
[SYS] close(11)
[fs] openat(10, /home/orange/bun/bun/tsconfig.json) = 11
[SYS] close(11)
[fs] openat(0, /home/orange/bun/bun/tsconfig.base.json) = 11
[SYS] close(11)
[SYS] close(6)
[SYS] close(7)
[SYS] close(8)
[SYS] close(9)
[SYS] close(10)
[fs] openat(0, /home/orange/bun/bun/test.js) = 12
[SYS] close(12)
11111
setTimeout
set
writeItem
  Count: 0
22222
setTimeout
set
[Loop] ref
11111
setTimeout
set
writeItem
  Count: 0
11111
setTimeout
set
writeItem
  Count: 0
11111
setTimeout
set
writeItem
  Count: 0
11111
setTimeout
set
writeItem
  Count: 0
[Loop] unref
[uws] Timer.deinit()
writeItem
  Count: 0
22222
setTimeout
set
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
writeItem
  Count: 0
22222
setTimeout
set
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
writeItem
  Count: 0
22222
setTimeout
set
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
writeItem
  Count: 0
22222
setTimeout
set
[Loop] ref
[Loop] unref
[uws] Timer.deinit()
writeItem
  Count: 0

In bun.zig Timer.set there is a condition specifically for 0 interval setTimeout. Removing this logic doesn't fix things as the extern call to us_timer_set in Timer.set doesn't work with 0 ms. The timer never calls the callback.

My understanding of the events:
setTimeout( x, 0) places the cb task on the FIFO queue
setTImeout(x,y) goes on a uws.Timer. The uws.Timer callback places the cb task on the FIFO queue.
The uws.Timer happens outside of the event_loop.zig tick function.

@orangeswim
Copy link

Ok after sleeping on it. I have the correct order of events.

  • event_loop.zig EventLoop.tick()
    • EventLoop.tickWithCount() will run an loop to run all tasks in this.tasks.
    • When setTimeout runs with 0 interval, it will add to this.tasks.
    • with the exception on when func1 and func2 are first executed. All repetitions of func1 will happen regardless of how large the task length is.

So this introduces an infinite loop when a setTimeout with 0 interval adds another setTimeout with 0 interval.
I haven't dug into too much of the nodejs code, but it appears they have a global queue, or at least an intermediate queue for everything that should happen on the nextTick.

Any one have any thoughts on the appropriate flow for setTimeout 0 tasks? I'm thinking it should add the CallbackJob / task on one of the other queues so it doesn't get stuck in that loop.

Updated test code

function repeat(log, delay, count) {
 if (count <= 0) return;

 var i = 0;
 while (i < 10000) {
   i++;
 }
 console.log(log, i);
 setTimeout(() => repeat(log, delay, count - 1), delay);
}

const timesToRun = 5;
repeat("11111", 0, timesToRun);
repeat("22222", 1, timesToRun);

Bun Output

/home/orange/bun/bun/packages/debug-bun-linux-x64/bun-debug test.js
[SYS] read(3, 4096) = 4096 (0.085ms)
[SYS] close(3)
[fs] openat(10, /home/orange/bun/bun/package.json) = 11
[SYS] close(11)
[fs] openat(10, /home/orange/bun/bun/tsconfig.json) = 11
[SYS] close(11)
[fs] openat(0, /home/orange/bun/bun/tsconfig.base.json) = 11
[SYS] close(11)
[SYS] close(6)
[SYS] close(7)
[SYS] close(8)
[SYS] close(9)
[SYS] close(10)
EventLoop.tick
tickWithCount
[fs] openat(0, /home/orange/bun/bun/test.js) = 12
[SYS] close(12)
11111 10000
setTimeout
set 0 id:1
22222 10000
setTimeout
set 1 id:2
[Loop] ref
tickWithCount
tickWithCount.while tasks.readItem
11111 10000
setTimeout
set 0 id:3
tickWithCount.while tasks.readItem
11111 10000
setTimeout
set 0 id:4
tickWithCount.while tasks.readItem
11111 10000
setTimeout
set 0 id:5
tickWithCount.while tasks.readItem
11111 10000
setTimeout
set 0 id:6
tickWithCount.while tasks.readItem
tickWithCount
tickWithCount
Loop.tick
[Loop] unref
[uws] Timer.deinit()
EventLoop.tick
tickWithCount
tickWithCount.while tasks.readItem
22222 10000
setTimeout
set 1 id:7
[Loop] ref
tickWithCount
tickWithCount
EventLoop.tick
tickWithCount
tickWithCount
Loop.tick
[Loop] unref
[uws] Timer.deinit()
EventLoop.tick
tickWithCount
tickWithCount.while tasks.readItem
22222 10000
setTimeout
set 1 id:8
[Loop] ref
tickWithCount
tickWithCount
Loop.tick
[Loop] unref
[uws] Timer.deinit()
EventLoop.tick
tickWithCount
tickWithCount.while tasks.readItem
22222 10000
setTimeout
set 1 id:9
[Loop] ref
tickWithCount
tickWithCount
Loop.tick
[Loop] unref
[uws] Timer.deinit()
EventLoop.tick
tickWithCount
tickWithCount.while tasks.readItem
22222 10000
setTimeout
set 1 id:10
[Loop] ref
tickWithCount
tickWithCount
Loop.tick
[Loop] unref
[uws] Timer.deinit()
EventLoop.tick
tickWithCount
tickWithCount.while tasks.readItem
tickWithCount
tickWithCount

Node Output (exepected)

11111 10000
22222 10000
11111 10000
22222 10000
11111 10000
22222 10000
11111 10000
22222 10000
11111 10000
22222 10000

@ericklt
Copy link

ericklt commented Sep 17, 2023

Expanding on @orangeswim 's comment.

The core of the event_loop's tick() function is:

while (this.tickWithCount() > 0) : (this.global.handleRejectedPromises()) {
    this.tickConcurrent();
}

where there's a loop that execute the tasks:

// event_loop.zig > EventLoop > tickWithCount (L:581)
while (this.tasks.readItem()) |task| {...}

Due to setTimeout(_, 0) being a direct call to the Loop function enqueTask:

// bun.zig > Timer > set (L:3644)
vm.enqueueTask(JSC.Task.init(&job.task));

// event_loop.zig > EventLoop > enqueTask (L:902)
pub fn enqueueTask(this: *EventLoop, task: Task) void {
    this.tasks.writeItem(task) catch unreachable;
}

when a js functon is executed and calls setTimeout(_, 0), it adds to the this.tasks queue before the tickWithCount loop resolves. If the function is recursive or calls multiple setTimeout(_, 0), it keeps the bun event_loop from ticking, possibly indefinetly.

This is the first problem, and can be solved by using an aux. queue as @orangeswim said, or some variation of this, for example:

var executable_tasks: Queue = EventLoop.Queue.init(this.virtual_machine.allocator);
while (this.tasks.readItem()) |task| {
    executable_tasks.writeItem(task) catch unreachable;
}
while (executable_tasks.readItem()) |task| {...}

But the problem goes deeper, doing this unlocks the ticking of the event loop, but the loop still doesn't listen for uws events triggered by setTimeouts with delay greater than 0, since it gets stuck in this loop:

// event_loop.zig > EventLoop > tick (L:841)
while (this.tickWithCount() > 0) : (this.global.handleRejectedPromises()) {
    this.tickConcurrent();
}

By doing something like this:

while (this.tickWithCount() > 0) : (this.global.handleRejectedPromises()) {
    this.tickConcurrent();
    this.autoTick(); // <-- added
}

the loop is able to insert tasks inbetween ticks, but this is probably not the right solution since in some tests that I did, it made the setTimeout(_, 0) calls seem slow (apparently taking 1s between each execution, which is odd).

The test I'm using:

const timesToRun = 10;

var runCount = 0
const obj = {}

function repeat(log, delay) {
  if (runCount >= timesToRun) {
    console.log(`${log}: ${obj[log]}`)
    return
  }

  runCount += 1
  obj[log] = (obj[log] || 0) + 1
  // console.log(`${log}: ${obj[log]}`)
  
  setTimeout(() => repeat(log, delay), delay);
}


repeat("1", 0)
repeat("2", 1)

@ericklt
Copy link

ericklt commented Sep 17, 2023

One interesting thing to keep in mind, running

repeat("1", 0)
repeat("2", 1)

in nodejs gives the exact same output as

repeat("1", 1)
repeat("2", 1)

regardless of the amount of repetitions, so it apparently treats setTimeout(_, 0) the same as setTimeout(_, 1).

When running

repeat("1", 1)
repeat("2", 2)

it gives the expected 2/1 ratio.

@Jarred-Sumner
Copy link
Collaborator

I love the investigative work here

I think making setTimeout have a minimum of 1 is a good option and leaving setImmediate to be the one with special behavior that appends to the task queue. Maybe in the future we will implement requestIdleCallback.

@ericklt
Copy link

ericklt commented Sep 17, 2023

That's probably a good temporary solution.

Something that I just tried, was running setTimeout(_, 0) as any other setTimeout by removing the condition that puts it directly on the loop, and adding this to the epoll_kqueue.c us_timer_set function:

if (ms == 0) {
    timer_spec.it_value.tv_sec = 0;
    timer_spec.it_value.tv_nsec = 1L;
}

this'll only work on linux since it uses the epoll_kqueue event loop, but it seemed to work flawlessly


test 1 (100_000 runs):

repeat("1", 0) // (key, timeout[ms])
repeat("2", 1)

result:

1: 98554
2: 1446

test 2 (100_000 runs):

repeat("1", 0)
repeat("2", 2)

result:

1: 99245
2: 755

I don't know if it can be implemented like this for other OSs, nor the exact impact on compared performance when using timeout 0, but it seems like a interesting temporary solution as well.

@ewrogers
Copy link

ewrogers commented Sep 18, 2023

On macOS I believe you could use Grand Central Dispatch (GCD) via dispatch_after:

dispatch_time_t delay = dispatch_time(DISPATCH_TIME_NOW, (int64_t)(yourDelayInSeconds * NSEC_PER_SEC));

dispatch_after(delay, dispatch_get_main_queue(), ^{
    // Your code to be executed after the specified delay
});

As for Windows you can use CreateTimerQueue, CreateTimerQueueTimer, and DeleteTimerQueue APIs:

void CALLBACK TimerCallback(PVOID lpParameter, BOOLEAN TimerOrWaitFired) {
    // Your code to be executed after the specified delay
    printf("Timeout complete!\n");
}

void setTimeout(void (*callback)(), DWORD delayMs) {
    HANDLE hTimerQueue = CreateTimerQueue();
    if (hTimerQueue == NULL) {
        // Handle error
        return;
    }

    if (!CreateTimerQueueTimer(&hTimer, hTimerQueue, TimerCallback, NULL, delayMs, 0, 0)) {
        // Handle error
    }

    // Clean up the timer queue when done (optional)
    // DeleteTimerQueue(hTimerQueue);
}

I suppose more platform-specific native code could be added to packages/bun-usockets/src/eventing/ for this.

@Jarred-Sumner
Copy link
Collaborator

This will be fixed in the next version of Bun (once #6674 is merged)

The script from @ThePrimeagen below:

function test() {
    console.log('11111');
    setTimeout(test, 0);
}

function test2() {
    console.log('22222');
    setTimeout(test2, 1);
}

test();
test2();

Now logs the following infinitely repeating in bun and node:

11111
22222
11111
22222
11111
22222
11111
22222
11111
22222

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants