An interesting network issue #181

Open
cemunal opened this issue Mar 19, 2024 · 16 comments

@cemunal

cemunal commented Mar 19, 2024

Sometimes the network stops responding, but NetworkManager shows no disconnection notification. I am using the built-in kernel drivers. Here is some info:

sudo lspci -v:

02:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821CE 802.11ac PCIe Wireless Network Adapter
Subsystem: AzureWave Device 3040
Flags: bus master, fast devsel, latency 0, IRQ 128
I/O ports at e000 [size=256]
Memory at ef000000 (64-bit, non-prefetchable) [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-c8-21-01
Capabilities: [158] Latency Tolerance Reporting
Capabilities: [160] L1 PM Substates
Capabilities: [170] Precision Time Measurement
Capabilities: [17c] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
Kernel driver in use: rtw_8821ce
Kernel modules: rtw88_8821ce


uname -r:

6.7.9-200.fc39.x86_64


/etc/modprobe.d/rtw88_core.conf >>> options rtw88_core disable_lps_deep=y
/etc/modprobe.d/rtw88_pci.conf >>> options rtw88_pci disable_aspm=y
/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf >>> wifi.powersave = 2
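
One way to confirm that module options like these actually took effect is to read the values back from sysfs. A minimal sketch, assuming the in-kernel module names rtw88_core and rtw88_pci (as in the file names above) and that the parameters are exported under /sys/module; the helper name show_param and its optional third argument are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# show_param MODULE PARAM [SYSFS_ROOT]
# Prints "MODULE.PARAM=VALUE" if the parameter file is readable, or a
# "not available" note otherwise. SYSFS_ROOT defaults to /sys/module and is
# overridable only so the helper can be tried against a fake directory tree.
show_param() {
    mod=$1
    param=$2
    root=${3:-/sys/module}
    file="$root/$mod/parameters/$param"
    if [ -r "$file" ]; then
        printf '%s.%s=%s\n' "$mod" "$param" "$(cat "$file")"
    else
        printf '%s.%s: not available (module not loaded?)\n' "$mod" "$param"
    fi
}

# Example (run on the affected machine):
#   show_param rtw88_core disable_lps_deep
#   show_param rtw88_pci disable_aspm
```

With the driver from this repository the module names in the thread are rtw_core and rtw_pci, so the first argument would change accordingly.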


sudo dmesg | grep rtw:

[ 7.875217] rtw_8821ce 0000:02:00.0: enabling device (0000 -> 0003)
[ 7.890966] rtw_8821ce 0000:02:00.0: Firmware version 24.11.0, H2C version 12
[ 8.134191] rtw_8821ce 0000:02:00.0 wlp2s0: renamed from wlan0
[ 83.529370] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 83.529377] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 83.529386] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 83.801262] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 83.801266] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 83.801270] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 145.166953] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 145.166965] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 145.166977] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 200.974247] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 200.974257] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 200.974269] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)


When the network is not responding:

cem@fedora:~$ ping -c4 goo.gl
ping: goo.gl: Name or service not known
cem@fedora:~$ ping -c4 goo.gl
ping: goo.gl: Temporary failure in name resolution
cem@fedora:~$ ping -c4 goo.gl
ping: goo.gl: Temporary failure in name resolution
cem@fedora:~$ ping -c4 goo.gl
ping: goo.gl: Name or service not known
cem@fedora:~$ ping -c4 goo.gl
ping: goo.gl: Temporary failure in name resolution
cem@fedora:~$

@dubhater
Collaborator

Excellent! I have been waiting for someone with this problem to test the potential fix which is going into kernel 6.9. You can do it by blacklisting rtw88_8821ce, rtw88_8821c, rtw88_pci, and rtw88_core, and then installing the driver from this repository.
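
The blacklisting step could be sketched as follows. This is only an illustration under stated assumptions: the helper name make_blacklist and the target file name blacklist-rtw88.conf are hypothetical, and the module list is the one given in the comment above.

```shell
#!/bin/sh
# make_blacklist MODULE...
# Prints one "blacklist <module>" line per argument, in modprobe.d syntax.
make_blacklist() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

# Example (the file name is an assumption; modprobe reads *.conf files
# in /etc/modprobe.d/):
#   make_blacklist rtw88_8821ce rtw88_8821c rtw88_pci rtw88_core |
#       sudo tee /etc/modprobe.d/blacklist-rtw88.conf
```

A reboot is the simplest way to make sure the in-kernel modules are no longer loaded before the repository driver is tested.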

@cemunal
Author

cemunal commented Mar 20, 2024

I installed the driver and tested it for about 6 hours. The problem described in the original post did not occur. All other settings are the same as before:

options rtw_core disable_lps_deep=y
options rtw_pci disable_aspm=y
wifi.powersave = 2


However, I am still getting the same dmesg errors. (I am not sure, but I think these errors have been slightly less frequent on AC power since I installed the repo drivers.)


Some extra questions:

When will we no longer need the extra module options and disabled Wi-Fi power saving?
When will this fix be added to the built-in kernel drivers?


Thank you.

@dubhater
Collaborator

This fix will be in kernel 6.9. Eventually the stable and longterm kernels will have it as well, maybe when 6.9-rc0 appears.
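
To check whether a running kernel is at least 6.9, a version comparison along these lines may help. It is a rough sketch: the helper name kernel_at_least is hypothetical, it relies on the version sort (-V) of GNU sort, and a version number alone cannot show whether a stable or longterm kernel has received a backport of the fix.

```shell
#!/bin/sh
# kernel_at_least WANTED [CURRENT]
# Exits with status 0 if CURRENT (default: output of `uname -r`) sorts at or
# after WANTED in version order, and with a non-zero status otherwise.
kernel_at_least() {
    wanted=$1
    current=${2:-$(uname -r)}
    # The smaller of the two versions ends up on the first line.
    lowest=$(printf '%s\n%s\n' "$wanted" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$wanted" ]
}

# Example:
#   if kernel_at_least 6.9; then echo "6.9 or newer"; else echo "older than 6.9"; fi
```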

Are you sure you still need those options? What are they fixing?

@lwfinger
Owner

@cemunal - If you need those fixes, it is because the BIOS in your device cannot properly handle your PCIe devices. On my 10-year-old Samsung laptop, there is never a problem. It is the new Lenovo and HP laptops that show this "feature".

@cemunal
Author

cemunal commented Mar 21, 2024

@dubhater, I removed options rtw_core disable_lps_deep=y, options rtw_pci disable_aspm=y and wifi.powersave = 2, and there are no problems :) I will wait for the 6.9.x kernels.

@lwfinger, my laptop was manufactured in 2021 and its BIOS release date is 2019 (ASUS X540UAR).

Lastly, I am getting these logs:

sudo dmesg | grep rtw:


[ 8.629772] rtw_core: loading out-of-tree module taints kernel.
[ 8.629779] rtw_core: module verification failed: signature and/or required key missing - tainting kernel
[ 8.832251] rtw_8821ce 0000:02:00.0: enabling device (0000 -> 0003)
[ 8.839749] rtw_8821ce 0000:02:00.0: Firmware version 24.8.0, H2C version 12
[ 8.948701] rtw_8821ce 0000:02:00.0 wlp2s0: renamed from wlan1
[ 165.081916] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 165.081920] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 165.081923] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 195.110291] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 195.110300] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 195.110312] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 206.916185] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 206.916196] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 206.916210] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 456.017608] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 456.017611] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 456.017615] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 456.237650] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 456.237657] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 456.237664] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 531.806429] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 531.806434] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 531.806438] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 1265.197840] rtw_8821ce 0000:02:00.0: firmware failed to leave lps state
[ 1401.770298] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 1401.770315] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 1401.770333] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)


Can these logs or errors be ignored, or do they need fixing?

Thanks.

@cemunal
Author

cemunal commented Mar 21, 2024

New logs, but there are no connection problems:

failed to send h2c command
firmware failed to leave lps state

PS: the system froze once :(

@pyt0xic

pyt0xic commented Mar 23, 2024

I have had a similar issue with rtw88_8821cu for ages.
The connection would drop after some time and it was always worse when not connected to AC.

A month ago I switched to this driver and VOILA, no more connection dropping 😄

The firmware download error messages are also gone, I saw there was a fix for that added to 6.9, didn't realize it came from here.

Thank you for your hard work!

@lwfinger
Owner

You have the code generation sequence wrong. The new code goes into the wireless-next repo, which is 2 steps ahead of the stable releases. For example, stable is at 6.8, the mainline development is working on 6.9, and wireless-next has 6.10 code. What I do is backport the wireless-next code and make it compile on older kernels. That way the rtw88 repo is ahead of your distro's kernels.

@cemunal
Author

cemunal commented Mar 23, 2024

Here again to understand all :)

I have no "failed to send h2c command" & "firmware failed to leave lps state" logs and no system freeze with "options rtw_core disable_lps_deep=1" & "options rtw_pci disable_aspm=1" and "wifi.powersave = 2" (using repo drivers). I am only getting these logs:


[ 8.632507] rtw_core: loading out-of-tree module taints kernel.
[ 8.632515] rtw_core: module verification failed: signature and/or required key missing - tainting kernel
[ 8.860964] rtw_8821ce 0000:02:00.0: enabling device (0000 -> 0003)
[ 8.868039] rtw_8821ce 0000:02:00.0: Firmware version 24.8.0, H2C version 12
[ 8.958539] rtw_8821ce 0000:02:00.0 wlp2s0: renamed from wlan0
[ 78.212672] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 78.212677] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 78.212682] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 123.395771] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 123.395774] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 123.395777] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 158.829239] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 158.829251] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 158.829266] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 343.123940] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 343.123953] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 343.123967] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)
[ 435.885108] rtw_8821ce 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 435.885122] rtw_8821ce 0000:02:00.0: device [10ec:c821] error status/mask=00000001/0000e000
[ 435.885196] rtw_8821ce 0000:02:00.0: [ 0] RxErr (First)


Q1: Can these logs be ignored?
Q2: Can the code be improved so that it works without additional settings like "options rtw_core disable_lps_deep=1", ..., or are these BIOS-related problems? (PS: I have no error logs with Ubuntu's DKMS driver.)

And @lwfinger, I am a bit confused about your last post. My main problem is that "sometimes network is not responding". Is this fix in the 6.9.x or the 6.10.x kernels?

Thanks.

@pyt0xic

pyt0xic commented Mar 23, 2024


If my understanding is correct, wireless-next is ahead of mainline, and contains all the patches that have been added to this repo.
Some of said patches should be making it into 6.9 (and/or have been added to mainline, I think...)

I don't know much about the 8821ce driver but maybe try the following setting for NetworkManager

[device]
wifi.scan-rand-mac-address=no

Just a shot in the dark xD

@lwfinger
Owner

@cemunal - Those options are needed because of BIOS problems. All of the rtw88 drivers work on my system with no options needed.

I would worry about those PCIe Bus errors. Yes, they are corrected, but I have not seen them on any other system.

Any patches in 6.9 are already in this repo, as are those that will be in 6.10.
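
To keep an eye on how often the corrected PCIe bus errors mentioned above occur, the kernel log can be counted with a small filter. A minimal sketch; the helper name count_corrected_aer is hypothetical, and the match string is taken from the dmesg excerpts in this thread:

```shell
#!/bin/sh
# count_corrected_aer
# Reads kernel log text on stdin and prints the number of
# "PCIe Bus Error: severity=Corrected" lines it contains.
count_corrected_aer() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the helper's exit status at 0 in that case.
    grep -c 'PCIe Bus Error: severity=Corrected' || true
}

# Example (run on the affected machine):
#   sudo dmesg | count_corrected_aer
```

Comparing the count across boots with and without the module options, or on battery versus AC power, is one way to narrow down what triggers the errors.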

@cemunal
Author

cemunal commented Mar 25, 2024

@lwfinger, you mean my laptop has some BIOS problems, but if that is the main problem, shouldn't I face similar problems with Ubuntu's DKMS driver or on Windows? Maybe I also have a different card than you have (Subsystem: AzureWave, for example).

I am writing this because I want to contribute to this repo and to the built-in kernel drivers, and I can run tests if needed.

Thanks.

@lwfinger
Owner

The reason I wrote that is because only Lenovo and HP laptops have problems such as you see. Windows uses a completely different driver than Linux - I have no idea what they do as I have never seen their source. Again, I have no idea what Ubuntu is doing. If you could figure out what triggers the problem on your system, that would be a big help.

@cemunal
Author

cemunal commented Mar 26, 2024

@lwfinger, firstly thank you so much for all your detailed answers.

I wanted to analyze the code side of Ubuntu's DKMS driver and the "alt_rtl8821ce" driver (which also shows no problems) to figure out what triggers the problem and to find the differences from the code in this repo, but this is beyond me :)

So I think I will just ignore all the PCIe Bus Errors, add rtw_core disable_lps_deep=y, rtw_pci disable_aspm=y, and wifi.powersave = 2, and wait for the 6.9.x kernels to fix the "sometimes network is not responding" problem so that I can use the built-in kernel drivers.

If there is nothing more to say, you can close the issue.

@cemunal
Author

cemunal commented Apr 22, 2024

I updated to 6.8.6-200.fc39.x86_64 to test the built-in kernel drivers. Here are the new logs. Thanks.
logs.txt

@lwfinger
Owner

Firstly, check to make sure that you do not have modules from the kernel and this repo both loaded. 'lsmod | grep rtw' will show what is loaded.
Secondly, if you only have kernel modules loaded, then this problem needs to be reported at [email protected].
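
The first check can be automated roughly as follows. This sketch assumes, based on the dmesg output earlier in the thread, that the in-kernel modules are named rtw88_* while the modules from this repo are named rtw_*; the helper name classify_rtw_modules is hypothetical.

```shell
#!/bin/sh
# classify_rtw_modules
# Reads `lsmod` output on stdin and prints one of:
#   kernel - only rtw88_* modules (in-kernel driver) are loaded
#   repo   - only rtw_* modules (driver from this repo) are loaded
#   mixed  - modules from both families are loaded at the same time
#   none   - no matching modules were found
classify_rtw_modules() {
    awk '
        $1 ~ /^rtw88_/ { kernel = 1 }
        $1 ~ /^rtw_/   { repo = 1 }
        END {
            if (kernel && repo) print "mixed"
            else if (kernel)    print "kernel"
            else if (repo)      print "repo"
            else                print "none"
        }'
}

# Example:
#   lsmod | classify_rtw_modules
```

A result of "mixed" would correspond to the situation the first point warns about.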
