Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raspberry pi 300% cpu setTimeout() #49937

Closed
brubbel opened this issue Sep 28, 2023 · 10 comments
Closed

raspberry pi 300% cpu setTimeout() #49937

brubbel opened this issue Sep 28, 2023 · 10 comments

Comments

@brubbel
Copy link

brubbel commented Sep 28, 2023

Version

v20.7.0

Platform

5.15.61-v7l+ #1579 SMP armv7l GNU/Linux

Subsystem

What steps will reproduce the bug?

The following 3-line script causes high cpu on a raspberry pi4 /CM4 (actively tested on 2 devices, seen on more).
After restarting nodejs a couple of times the issue disappears (the device must be rebooted to consistently get the 300% cpu usage again). Both setInterval and setTimeout show the same behaviour.
After 4-5 minutes, cpu usage drops to normal levels again, until reboot that is.

setInterval(() => {
  console.log('test');
}, 20000);

Interval works correctly: 'test' is printed every x seconds

The issue is also present in the following nodejs versions, so it must have been introduced around 18.18.0.
20.7.0 bug
18.18.0 bug
18.17.1 OK!
18.17.0 OK!
18.16.0 OK!
18.14.0: OK!
18.0.0: OK!
16.20.2: OK!

The issue was detected both in a docker container and running nodejs without docker.
Tested both with .deb install and the unoffical builds.
Dockerfile is below.

Dockerfile:

FROM debian:bullseye-slim

WORKDIR /tmp/

RUN set -ex
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
RUN apt-get -y update

RUN apt-get -y --no-install-recommends install ca-certificates
RUN apt-get -y --no-install-recommends install curl
RUN apt-get -y --no-install-recommends install gnupg
RUN apt-get -y --no-install-recommends install libatomic1

RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
RUN echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_20.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list
RUN apt-get -y update
RUN apt-get -y --no-install-recommends install nodejs

CMD ["/bin/bash"]

How often does it reproduce? Is there a required condition?

Always after reboot. Starting from versions v18.18.0 up to v20.7.0
Never seen on my desktop, always on RPI devices (rpi4 / CM4)

What is the expected behavior? Why is that the expected behavior?

0% cpu usage, just as in version 16.x

What do you see instead?

300% cpu usage (top -c -H to show threads)

 1083 root      20   0  127644  32736  29248 R  99.9   0.4   2:26.74 node bootservertcprelay.js                                            
 1085 root      20   0  127644  32736  29248 R  99.9   0.4   2:26.61 node bootservertcprelay.js                                            
 1090 root      20   0  127644  32736  29248 R  99.9   0.4   2:26.62 node bootservertcprelay.js 

Additional information

Seen also, but very rarely on nodejs version 16.x, so can't confirm it is the same issue.
Please take note that rpi is running a 32-bit kernel...

@aduh95
Copy link
Contributor

aduh95 commented Sep 28, 2023

It would be extremely useful to know which version of 20.x introduced the regression.

@brubbel
Copy link
Author

brubbel commented Sep 28, 2023

I have tested that the issue appears between version 18.17.1 and 18.18.0.
Did not do any tests between 18.18.0 and 20.7.0, but I am prepared to do so if that would help?

@aduh95
Copy link
Contributor

aduh95 commented Sep 28, 2023

If the bug doesn't exist in 18.17.1 but exists in 18.18.0, it's very likely that it was introduced in one of the recent 20.x releases. It would help a lot because 20.x releases will have fewer commits than 18.18.0, so it would be very helpful to help identify which one was it.

@brubbel
Copy link
Author

brubbel commented Sep 28, 2023

node-v20.2.0-linux-armv6l -> OK
node-v20.3.0-linux-armv6l -> bug

https://nodejs.org/en/blog/release/v20.3.0
I suspect this one: bfcb3d1d9a

@bnoordhuis
Copy link
Member

What do you see when you let perf top run for a while in another terminal? I'm interested in the 10 or 20 hottest functions.

@brubbel
Copy link
Author

brubbel commented Sep 28, 2023

(This was run without docker, using node-v20.3.0-linux-armv6l)
https://unofficial-builds.nodejs.org/download/release/v20.3.0/node-v20.3.0-linux-armv6l.tar.gz

  70.81%  [kernel]                                            [k] io_sq_thread
  14.52%  [kernel]                                            [k] __cond_resched
  11.89%  [kernel]                                            [k] rcu_all_qs
   0.21%  [kernel]                                            [k] _raw_spin_unlock_irqrestore
   0.05%  libc-2.31.so                                        [.] strcmp
   0.05%  perf_5.10                                           [.] 0x000c6464
   0.05%  [kernel]                                            [k] _raw_read_unlock_irqrestore
   0.05%  [kernel]                                            [k] finish_task_switch
   0.04%  [kernel]                                            [k] do_page_fault
   0.04%  libc-2.31.so                                        [.] _int_malloc
   0.04%  libarmmem-v7l.so                                    [.] memset
   0.04%  libc-2.31.so                                        [.] _int_free
   0.04%  perf_5.10                                           [.] 0x000bbbd8
   0.03%  containerd                                          [.] _start
   0.03%  [kernel]                                            [k] kunmap_local_indexed
   0.03%  libpthread-2.31.so                                  [.] __pthread_mutex_unlock_usercnt
   0.03%  [kernel]                                            [k] __slab_free
   0.03%  [vectors]                                           [.] 0x00000fa0
   0.02%  [kernel]                                            [k] unmap_page_range
   0.02%  ld-2.31.so                                          [.] do_lookup_x
   0.02%  [kernel]                                            [k] copy_page
   0.02%  [kernel]                                            [k] __kmap_local_pfn_prot

possibly related: #48654 (comment)

@bnoordhuis
Copy link
Member

Right, that's a known kernel bug. Upgrade your kernel or set UV_USE_IO_URING=0 in the environment.

@brubbel
Copy link
Author

brubbel commented Sep 28, 2023

~/node-v20.7.0-linux-armv6l/bin $ UV_USE_IO_URING=0 ./node test.js
Awesome, it does the trick!

related: #48444

I suppose this issue can be closed now.

@bnoordhuis
Copy link
Member

If you can pinpoint the range of affected kernels, I can blacklist them in libuv.

@brubbel
Copy link
Author

brubbel commented Sep 29, 2023

5.15.61-v7l+ (raspberrypi-kernel 1.20220830-1) -> bug
6.1.21-v7l+ (raspberrypi-kernel 1.20230405-1) -> bug

The 6.1.21 kernel was tested using CM4

  • node-v20.7.0-linux-armv6l
  • the official image raspios-bullseye-armhf-lite.img
  • updating the kernel with apt install raspberrypi-kernel.
  • bug is consistent after reboot and some restarts of nodejs.

6.1.54-v7l+ (manual kernel update) -> bug

however: in 64bit mode (6.1.54-v8+) -> OK!

So, it is safe to conclude that all 32-bit kernel versions, up to 6.1.54-v7l+ are affected.
I have only tested 64-bit for 6.1.54-v8+ during this complete thread.

Available kernels: https://github.com/raspberrypi/firmware/commits/master
Would the 64-bit not cause other issues with VPN etc, I would be happy to switch :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants