kworker processes consume CPU using kernel 5.10.x and wifi #4780
Hi @Roland74,
Hi Bou6, here is the output. I am not sure how many times I should call it. I installed perf 5.10. Here are the top lines of sudo perf report. Should I call it more often, or can I do something else? Kind regards,
Using echo l > /proc/sysrq-trigger, you are requesting a backtrace from the kernel, so you need to trigger it when there is load. In the backtrace you can see what the PC is executing and figure out the cause of the load. In the attached files the PC is always executing arch_cpu_idle, which is why I cannot understand the reason for the problem.
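As a concrete sketch of that procedure (an assumption about workflow, not text from the thread: the write needs root, and on many systems, containers in particular, it will simply be refused):

```shell
# Request a backtrace of all active CPUs (sysrq 'l'). Run this WHILE the
# kworker load is visible in top, otherwise the trace will only show
# arch_cpu_idle. The backtraces land in the kernel log (dmesg).
if echo l 2>/dev/null > /proc/sysrq-trigger; then
    msg="sysrq triggered; read the backtraces with: dmesg | tail -n 40"
else
    msg="sysrq not available here (needs root on a real kernel)"
fi
echo "$msg"
```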
Hello Bou6, so my question is: does the load drop to zero on your Raspberry Pi 3B+ when you use wlan instead of lan? One of my problems is that I am not used to reading the dmesg backtrace information and the information from perf. It seems that the output of
Hope that helps a little bit more.
Which perf command did you run? I found the following gives sensible results:
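The command itself did not survive the page extraction; what follows is only a hypothetical reconstruction of a typical system-wide profiling run of this kind, not the original text. `perf record -a` samples all CPUs and `-g` captures call graphs, which is what produces the per-function percentages quoted later in the thread.

```shell
# Hypothetical reconstruction -- the original command was lost in
# extraction. Sample the whole system briefly, then summarise. Needs
# root (or relaxed perf_event_paranoid) for system-wide sampling.
if command -v perf >/dev/null 2>&1 \
   && perf record -a -g -o /tmp/kworker-perf.data -- sleep 5 2>/dev/null; then
    perf report -i /tmp/kworker-perf.data --stdio 2>/dev/null | head -n 20
    status="perf run completed"
else
    status="perf unavailable or insufficient privileges"
fi
echo "$status"
```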
Hello pelwell,
Then I copied the lines from the screen. Today I've planned to try to reproduce it on the Raspberry Pi of a friend, together with him. He usually uses lan, so we just need to turn on wlan and use it. Then maybe we can come closer to the problem. Let's see.
Hello pelwell,
and below:
Here is a short output from top on my Raspberry Pi 3B:
That's all for tonight.
That's useful output - it clearly points the finger at a large number of calls to phy_check_link_status. There are a number of code paths that can trigger this, but they aren't visible because of the asynchronous nature of the kworkers. I'd like to add some extra kernel debug output to find out what is triggering the polling. Are you comfortable building a kernel yourself? If not, can you confirm the version string (
And could you also state which WLAN dongle you are using - the output of
Hello Pelwell, I currently would prefer not building the kernel myself, and hope that one of the kernel team tries to reproduce the problem, which is really quite easy - easier and less time-consuming than building a kernel. Currently I mainly use my Raspberry Pi 3B (hostname wega). So I don't use a WLAN dongle (WLAN is built in there). There is only an external hard drive connected via USB. Here is the output of lsusb:
Kind regards,
Polling at 100Hz for 1.5s consumes quite a bit of kworker time with no obvious benefit. Reduce that polling rate to ~6Hz.

To further save CPU and power, defer the next poll if no energy is detected.

See: #4780

Signed-off-by: Phil Elwell <[email protected]>
I've traced the increased CPU usage to 094e36e, which was a fix for #4393. Increasing the polling interval within the wait loop to 150us reduces the CPU used by that kworker to 0.3% (when shown in top - sometimes it doesn't appear) with no observable detriment. Increasing the interval to 500us doesn't reduce the overhead any further. The link detection is hampered by a bug that requires energy efficiency to be turned off while polling, and it has to remain in this state for at least a second, otherwise the link-up event can be missed (#4393). A more effective CPU (and power) reduction would be to reduce the rate at which the PHY state machine runs when there is no link, but we can get nearly the same effect by inserting an extra 2s wait in the PHY status check if no energy is detected. With b0272c6, running perf found significantly less time is spent running phy_check_link_status - around 0.7% in total.
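To put the commit message's figures in perspective, a quick back-of-the-envelope calculation (the 100Hz, ~6Hz and 1.5s numbers are taken from the commit text; everything else is plain arithmetic):

```shell
# Polls performed during one 1.5-second link check, before and after
# the rate reduction described in the commit message.
old_rate_hz=100
new_rate_hz=6
window_ms=1500
old_polls=$(( old_rate_hz * window_ms / 1000 ))   # 100 Hz * 1.5 s
new_polls=$(( new_rate_hz * window_ms / 1000 ))   # ~6 Hz * 1.5 s
echo "per link check: $old_polls polls before the fix, $new_polls after"
```

Roughly a 16x reduction in wake-ups per check, before even counting the extra 2s deferral when no energy is detected.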
Hi Pelwell,
Yes, I was able to reproduce it. I've only tested the fix on a 3B, but it should work on all RPis that suffered the problem.
Thank you very much Pelwell. That's great news. :)
Yes it will. The last rpi-update kernel release was yesterday so it might take a few days, but all future kernel builds should show the reduced overhead.
Super 👍
Hello Phil, Here is an example output from top:
And here is the perf output:
I don't know if this information is helpful but hopefully it's interesting.
There is a build available including the fix for this issue. A minor complication is that we have just switched our main kernel releases to the 5.15 kernel, but you can install the last 5.10 build with your fix by running:
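The command itself was lost in the page extraction. `rpi-update` does accept a firmware commit hash to pin a specific build, so its general shape would have been as below; the hash here is a placeholder, not the real value from the comment.

```shell
# Pin the firmware/kernel to a specific commit. The placeholder below
# must be replaced with the real hash, which did not survive extraction.
FW_COMMIT="<commit-hash>"
if [ "$FW_COMMIT" = "<commit-hash>" ]; then
    echo "substitute the real hash, then run: sudo rpi-update \$FW_COMMIT"
else
    sudo rpi-update "$FW_COMMIT"
fi
```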
The fix will come to 5.15 soon - it will be in the next build - but it missed the cut this time. |
Hello Phil, as for the new 50% increase in CPU consumption of the kworker processes, it could be another problem. Kind regards,
BugLink: https://bugs.launchpad.net/bugs/1960323

Polling at 100Hz for 1.5s consumes quite a bit of kworker time with no obvious benefit. Reduce that polling rate to ~6Hz. To further save CPU and power, defer the next poll if no energy is detected.

See: raspberrypi/linux#4780

Signed-off-by: Phil Elwell <[email protected]>
(cherry picked from commit d789cd3288a3ebc70e13719382558700e2730a68 rpi-5.15.y)
Signed-off-by: Juerg Haefliger <[email protected]>
I had this issue on a Pi 1B and Pi 3B running Buster at 5.10.103, so I applied the rpi-update 9c... and it updated my kernel to 5.15.80, leaving me running Buster over 5.15.80. It seemed to be functioning well. But I couldn't find kernel headers for 5.15.80, and so couldn't compile the driver for my TP-Link T2U+ WiFi dongle. So I reverted the kernel to 5.10.103. The most recent kernel headers on the firmware site (https://archive.raspberrypi.org/debian/pool/main/r/raspberrypi-firmware/ I think) are for 5.15.76. As I understand it, that version wouldn't have the patch included in 9c... Is there an rpi-update that would install a kernel >= .80 for which I could find the corresponding headers? And if so, where might I look?
Hello, has this issue already been fixed? On my Raspberry Pi 1B with a WLAN dongle, the load average is about 0.7 higher while using wlan0 for the network with eth0 up.
clicube, it's not clear that the problem I had and resolved is the one you're pointing to, but here's a description of mine in case it's helpful. I used the rpi-update command from pelwell's comment of 4 Feb 2022 (above) to install that kernel (5.15.80, I think it was) over a 5.10.103 Raspbian. That kernel did, indeed, resolve the load increase problem on my Pis (3B and 1). That is, with the 5.15.80 kernel installed I could run with no Ethernet connection but with WiFi connected, and the load was near zero; with the 5.10 kernel, the load was about 0.7 to 1.0 higher when using WiFi with no Ethernet connection. I don't want to do a full OS upgrade to 5.15 because I run motion on those Pis and I need raspistill and friends to be functioning; 5.15 replaces them and breaks motion. I would be surprised if the patched kernel identified by pelwell above didn't carry into subsequent kernel versions. You might try installing the 5.15.80 kernel to see if that resolves the problem for you. Unfortunately, I could never find a kernel+headers of a version > 5.15.76 to install. I needed the headers to compile the driver code for my WiFi dongle, and I didn't want to get into the business of downloading and creating my own kernel headers. So I've chosen, instead, to create an Ethernet loop-back plug and hope that it resolves the phantom load problem when the Pi is operating with just WiFi connected. I haven't tried that yet.
My downstream patch mentioned previously is no longer in the kernel (it was dropped in 5.19), but neither is the commit it modified that allowed more time for link-up. I believe that neither are necessary following @l1k's commit (3131a20), but please report if you believe you are still seeing excessive kworker loads.
Is this the right place for my bug report?
Presumably, because kworker is kernel-related.
Describe the bug
In short: the kworker processes consume too much CPU with the newer 5.10.x kernels when using wifi.
After updating my Raspberry Pi 1B and 3B from kernel 5.4.83 to the newer kernel 5.10.x, the kworker processes continuously consume more than 1.5% CPU (on the Raspberry Pi 3B) and more than 3% CPU on the Raspberry Pi 1B, and the load no longer goes to 0 although the system has nothing to do (it should be idle).
I found out that the problem occurs only while using wlan (wifi). When using lan with eth0, the load goes down to 0.00 again.
I found a workaround for using the wlan while not being connected to the lan. When I take eth0 down with
/sbin/ifconfig eth0 down
then the kworker process calms down and the load goes to zero again.
To easily reproduce the problem I can bring eth0 up with
/sbin/ifconfig eth0 up
Then the kworker starts consuming CPU again. My Raspberry Pi 1B shows e.g.
Then, after taking eth0 down, the kworker process does not consume CPU:
/sbin/ifconfig eth0 down
=> afterwards top shows e.g.:
To reproduce, you can switch the kworker CPU consumption on and off with
/sbin/ifconfig eth0 up
and
/sbin/ifconfig eth0 down
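When toggling eth0 like this, a scriptable way to watch the effect is to query ps rather than eyeball top (a sketch using procps syntax; inside a container the kernel threads may be hidden by the PID namespace, in which case nothing is listed):

```shell
# Show the three busiest kworker threads by CPU usage. Kernel threads
# are invisible in some containers, so fall back to a message.
busiest=$(ps -eo pcpu,comm --sort=-pcpu | awk '$2 ~ /^kworker/' | head -n 3)
summary="${busiest:-no kworker threads visible}"
echo "$summary"
```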
Actual behaviour
After booting the Raspberry Pis using wifi, the kworker processes always consume CPU.
Expected behaviour
It would be great if the kworker processes stayed calm without the need to call "/sbin/ifconfig eth0 down".
System
My systems are a Raspberry Pi 1B and a Raspberry Pi 3B (32- and 64-bit Raspbian) with 5.10.x kernels. The older 5.4 kernel did not have this problem.
Here is an example from the Raspberry Pi 1B:
cat /etc/rpi-issue
Raspberry Pi reference 2019-09-26
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 80d486687ea77d31fc3fc13cf3a2f8b464e129be, stage2
uname -a
Linux antares 5.10.63+ #1488 Thu Nov 18 16:14:04 GMT 2021 armv6l GNU/Linux
Firmware version (vcgencmd version)
Dec 1 2021 15:07:23
Copyright (c) 2012 Broadcom
version 71bd3109023a0c8575585ba87cbb374d2eeb038f (clean) (release) (start_cd)
CPU:
/proc/cpuinfo
=>
processor : 0
model name : ARMv6-compatible processor rev 7 (v6l)
BogoMIPS : 697.95
Features : half thumb fastmult vfp edsp java tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb76
CPU revision : 7
Hardware : BCM2835
Revision : 100000e
Serial : 00000000894b0b9f
Model : Raspberry Pi Model B Rev 2
Now the Raspberry Pi 3B, e.g. in 64-bit mode:
cat /etc/rpi-issue
Raspberry Pi reference 2021-10-30
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, c12b1df4ed6416fb0df33ba1731c5b13c1bdbdf8, stage2
kernel:
uname -a
Linux wega 5.10.63-v8+ #1488 SMP PREEMPT Thu Nov 18 16:16:16 GMT 2021 aarch64 GNU/Linux
firmware:
Nov 18 2021 16:18:09
Copyright (c) 2012 Broadcom
version d9b293558b4cef6aabedcc53c178e7604de90788 (clean) (release) (start)
CPU:
$ lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Vendor ID: ARM
Model: 4
Model name: Cortex-A53
Stepping: r0p4
CPU max MHz: 1200.0000
CPU min MHz: 600.0000
BogoMIPS: 38.40
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm crc32 cpuid
$ cat /proc/cpuinfo
processor : 0
BogoMIPS : 38.40
Features : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 1
BogoMIPS : 38.40
Features : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 2
BogoMIPS : 38.40
Features : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 3
BogoMIPS : 38.40
Features : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
Hardware : BCM2835
Revision : a02082
Serial : 00000000c79dc44c
Model : Raspberry Pi 3 Model B Rev 1.2
I can provide more details from the Raspberry Pi 3B too (e.g. in 32-bit mode), but I assume that all versions from the Raspberry Pi 1 to at least the 3B are affected. At least, I observe this problem with 32-bit and 64-bit kernels, with all kernel versions from 5.10 to the latest version 5.10.63.
I already started a thread at https://forums.raspberrypi.com/viewtopic.php?p=1902788#p1902788
It would be great if you can help and fix this problem.
Kind regards,
Roland