[WIP][RFT] generic: platform/mikrotik: add wlan lz77 decompress #5

Merged 1 commit on Oct 24, 2024

john-tho (Owner)

usual disclaimer: expect your device to halt and catch fire…

Very rough attempt to decode LZ77: excessively verbose and entirely unoptimised.

A number of new (or newly factory flashed) Mikrotik devices are using LZ77 magic for wlan tag hard_config data.
New devices include the Chateau LTE12 [1], and ax devices [2].
Newly factory flashed devices may include the hap ac3 [3].

This can be seen in decoded OEM supout [4] dmesg:
"radio data lz77 decompressed from"…

Investigating an arm RouterOS flash.ko module, and supplied example hard_config dumps, the format was determined via [5].

*/
#define LZ77_MK_MAX_COUNT_BIT_LEN 27

#ifdef CONFIG_CPU_LITTLE_ENDIAN

john-tho (Owner, Author) commented Jun 24, 2023

Suggested change
-#ifdef CONFIG_CPU_LITTLE_ENDIAN
+#if defined(CONFIG_CPU_LITTLE_ENDIAN) || defined(CONFIG_ARCH_IPQ40XX)

ipq40xx does not set CONFIG_CPU_LITTLE_ENDIAN

I think it does in the final kernel config

@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 4 times, most recently from d9ac436 to 4ad1fd7 Compare June 24, 2023 13:00
dchard commented Jun 24, 2023

The latest push was tested on a Mikrotik Chateau-12LTE (ipq4019). Both radios are working, the driver loads the firmware with BDF/CAL properly, and the WLAN MACs are correct. The test was conducted on Routerboot_v7 with the required additional patches. Dmesg is completely fine: no errors or weird messages.

firmware ver 10.4b-ct-4019-fW-13-5ae337bb1 api 5

If you want you can use my "tested by" tag:
Csaba Sipos <metro4_at_freemail.hu>

@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 4 times, most recently from c0fc59e to 8a8e313 Compare July 1, 2023 18:12
john-tho (Owner, Author) commented Jul 1, 2023

Hopefully no functional changes with this push: it only tidies messages and fixes some checkpatch and sparse warnings.

dchard commented Jul 7, 2023

> Hopefully no functional changes with this push: it only tidies messages and fixes some checkpatch and sparse warnings.

I can recheck it early next week.

dchard commented Jul 14, 2023

I tested it on ipq4019 (chateau 12) and it still works fine.

dchard commented Jul 19, 2023

@john-tho I found one device, out of many, that fails at LZ77 decompress:

[    0.960517] [rb_hardconfig][lz77] input overrun
[    0.960565] [rb_hardconfig] LZ77: LZ77 decompress fail
[    0.964427] [rb_hardconfig][lz77] input overrun
[    0.969139] [rb_hardconfig] LZ77: LZ77 decompress fail
[    0.974058] [rb_hardconfig][lz77] input overrun
[    0.978735] [rb_hardconfig] LZ77: LZ77 decompress fail
[    0.983165] MikroTik RouterBOARD hardware configuration sysfs driver v0.08
[    0.992248] MikroTik RouterBOARD software configuration sysfs driver v0.05

And this is the MTD2 (hard config) dump of the same device:

[deleted]

john-tho (Owner, Author) commented

> @john-tho I found one device, out of many, that fails at LZ77 decompress:
>
> [    0.960517] [rb_hardconfig][lz77] input overrun
> [    0.960565] [rb_hardconfig] LZ77: LZ77 decompress fail
> [    0.964427] [rb_hardconfig][lz77] input overrun
> [    0.969139] [rb_hardconfig] LZ77: LZ77 decompress fail
> [    0.974058] [rb_hardconfig][lz77] input overrun
> [    0.978735] [rb_hardconfig] LZ77: LZ77 decompress fail
> [    0.983165] MikroTik RouterBOARD hardware configuration sysfs driver v0.08
> [    0.992248] MikroTik RouterBOARD software configuration sysfs driver v0.05
>
> And this is the MTD2 (hard config) dump of the same device:
>
> LZ77_decomp_failed_mtd2_hard_config.zip

Okay, thanks, I pulled a copy.

In my standalone decompressor, it looks like the final (non-match) group bits have some set bits.

inbit:0x55d1->outbyte:0x2e8a bits opcode:0xc match, offset: 0x1, len: 0x6 (5 partial memcpy)
inbit:0x55dd->outbyte:0x2e90 bits opcode:0xc non-match, len: 0xc
 correct len
(input_bit 0x55e9 + opcode->length 0xc*8) = 0x5649 > in_len 0xac0 (bits 0x5600)
 correct end
input bit 0x55e9, input_len * 8: 0x5600
input byte -1 0x0, 0xe
input bit 0x55dd
opcode bits
 1 1 0 0 0 0 0 0 0 0 0 0
non-match data bits
 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
input overrun
finished, rc=-1
output first bytes:
44 52 45 00 01 82 28 19

You could try removing the && (!lz77_mikrotik_wlan_get_bit(in, input_bit)) from the test end marker section.
It might just work, which would mean my assumption (based on all prior examples) that null bit(s) follow the last group opcode was wrong.

We should look to see how the Mikrotik binary handles this last group & payload.
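
The overrun reported in the trace above is plain bit arithmetic and can be checked by hand. A minimal sketch in C, assuming (as the trace suggests) that the decoder compares the current input bit position plus the literal run length against the total input size in bits. The helper name `lz77_run_overruns` and its parameters are hypothetical and do not come from the patch:

```c
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_BYTE 8

/*
 * Hypothetical helper: true when a non-match (literal) run of len_bytes,
 * starting at bit offset input_bit, would read past in_len bytes of input.
 */
static bool lz77_run_overruns(size_t input_bit, size_t len_bytes, size_t in_len)
{
	return input_bit + len_bytes * BITS_PER_BYTE > in_len * BITS_PER_BYTE;
}
```

With the values from the trace, `lz77_run_overruns(0x55e9, 0xc, 0xac0)` is true: 0x55e9 + 0xc * 8 = 0x5649, which exceeds 0xac0 * 8 = 0x5600.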

john-tho (Owner, Author) commented

> In my standalone decompressor, it looks like the final (non-match) group bits have some set bits.
>
> inbit:0x55d1->outbyte:0x2e8a bits opcode:0xc match, offset: 0x1, len: 0x6 (5 partial memcpy)
> inbit:0x55dd->outbyte:0x2e90 bits opcode:0xc non-match, len: 0xc
>  correct len
> (input_bit 0x55e9 + opcode->length 0xc*8) = 0x5649 > in_len 0xac0 (bits 0x5600)
>  correct end
> input bit 0x55e9, input_len * 8: 0x5600
> input byte -1 0x0, 0xe
> input bit 0x55dd
> opcode bits
>  1 1 0 0 0 0 0 0 0 0 0 0
> non-match data bits
>  1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> input overrun
> finished, rc=-1
> output first bytes:
> 44 52 45 00 01 82 28 19
>
> You could try removing the && (!lz77_mikrotik_wlan_get_bit(in, input_bit)) from the test end marker section. It might just work, which would mean my assumption (based on all prior examples) that null bit(s) follow the last group opcode was wrong.
>
> We should look to see how the Mikrotik binary handles this last group & payload.

The emulated binary's decompressed payload was 0x2e90 bytes long, which matches the output offset my decoder had reached before failing, so I should be able to cut the extra test for bits following the last group opcode: && (!lz77_mikrotik_wlan_get_bit(in, input_bit))
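
A hedged sketch of the change being described, not the patch's actual code. Only the dropped clause, `!lz77_mikrotik_wlan_get_bit(in, input_bit)`, is taken from the thread. The function names `end_marker_strict` and `end_marker_relaxed`, the LSB-first bit order, and the use of the overrun condition as the rest of the end test are assumptions made for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit order: bit 0 is the least significant bit of in[0]. */
static bool get_bit(const uint8_t *in, size_t bit)
{
	return (in[bit / 8] >> (bit % 8)) & 1;
}

/* Earlier assumption: the final group must also be followed by a zero bit. */
static bool end_marker_strict(const uint8_t *in, size_t in_len,
			      size_t input_bit, size_t len_bytes)
{
	return input_bit < in_len * 8 &&
	       input_bit + len_bytes * 8 > in_len * 8 &&
	       !get_bit(in, input_bit);
}

/* Relaxed test: ignore whatever bits trail the final group opcode. */
static bool end_marker_relaxed(size_t in_len, size_t input_bit,
			       size_t len_bytes)
{
	return input_bit + len_bytes * 8 > in_len * 8;
}
```

When the bit after the last opcode is set, as in the failing dump, the strict form rejects the stream while the relaxed form accepts it.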

dchard commented Jul 20, 2023

@john-tho just to be sure: this does not look like potential flash/data corruption to you? I used your patch on about 80 devices from multiple batches of Chateau LTE12, and so far only a single one produced this fault.

john-tho (Owner, Author) commented

> @john-tho just to be sure: this does not look like potential flash/data corruption to you?

Does not look like corrupted flash. If it was, this lz77 decompress would fail very differently. Just extra set bits after the last (non) match group.

dchard commented Jul 20, 2023

> > @john-tho just to be sure: this does not look like potential flash/data corruption to you?
>
> Does not look like corrupted flash. If it was, this lz77 decompress would fail very differently. Just extra set bits after the last (non) match group.

Tested your latest modifications on the affected device, and now it works without any errors.

aanon4 commented Sep 1, 2023

How little endian is this decompression code? I have two ath79-based devices (921GS-5HPacD-19s, one of their newer sectors); one boots fine while the other doesn't load the firmware. I was thinking this compression scheme might be the problem there also. However, this device is big endian, so I wondered how portable the code is.
Thanks

[Edit]
Looking through the hard_config data, it is definitely LZ77 compressed.

[Edit 2]
I tried it "as is" and it works great! Thanks so much. You might want to consider widening the scope of where this is included.

john-tho (Owner, Author) commented Sep 1, 2023

> How little endian is this decompression code? I have two ath79-based devices (921GS-5HPacD-19s, one of their newer sectors); one boots fine while the other doesn't load the firmware. I was thinking this compression scheme might be the problem there also. However, this device is big endian, so I wondered how portable the code is. Thanks
>
> [Edit] Looking through the hard_config data, it is definitely LZ77 compressed.
>
> [Edit 2] I tried it "as is" and it works great! Thanks so much. You might want to consider widening the scope of where this is included.

Ha! Colour me surprised. I was not expecting to see this used on MIPS. Many thanks for the new data and testing, Tim. I guess, because the compressed data is read one bit at a time, that should have been the only place where endian-ness mattered.

If you could, I would appreciate you detailing the backup RouterBOOT version so we might be able to piece together some idea of when this started:

cat /sys/firmware/mikrotik/hard_config/booter_version

Okay, I will adjust the LE|IPQ40XX build conditional next time I touch this.

Cheers.
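
The point about reading one bit at a time can be illustrated with a small sketch. Because the input is indexed byte by byte and the shifting happens on values rather than on memory, the result does not depend on host byte order. The LSB-first bit order and the helper names `get_bit` and `get_bits` are assumptions for illustration, not taken from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Read one bit; bit 0 is assumed to be the least significant bit of in[0]. */
static unsigned int get_bit(const uint8_t *in, size_t bit_offset)
{
	return (in[bit_offset / 8] >> (bit_offset % 8)) & 1u;
}

/*
 * Collect count bits (count <= 32) starting at bit_offset.
 * The first bit read becomes the lowest bit of the result.
 */
static uint32_t get_bits(const uint8_t *in, size_t bit_offset,
			 unsigned int count)
{
	uint32_t value = 0;
	unsigned int i;

	for (i = 0; i < count; i++)
		value |= (uint32_t)get_bit(in, bit_offset + i) << i;

	return value;
}
```

A decoder that instead loaded multi-byte words straight from the buffer would need an explicit byte-order conversion (for example `le32_to_cpu()` in the kernel) on big-endian targets such as ath79.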

aanon4 commented Sep 1, 2023

The output from that cat command:
6.48.6

dchard commented Sep 26, 2023

@john-tho just checked the LZ77 and routeros-v7 patches on the new 6.1.55 kernel for IPQ4019 and it seems to work correctly.

MichaelUray commented

The patch works for me with a MikroTik hAP ac3.
What is actually required to get that merged into the master?

MichaelUray pushed a commit to MichaelUray/openwrt that referenced this pull request Oct 14, 2023
ipq40xx/mikrotik: adds wlan/WiFi lz77 decompress

Fixes the no wireless issue for the MikroTik RouterBOARD hAP ac3 with lz77 compression

Applies the patch from john-tho, which works fine on my hAP ac3 (RBD53iG-5HacD2HnD).
john-tho#5

https://forum.openwrt.org/t/no-wireless-mikrotik-rbd53ig-5hacd2hnd/157763
https://forum.openwrt.org/t/lz77-in-mikrotik/91696

modified:   target/linux/generic/files/drivers/platform/mikrotik/Makefile
modified:   target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig.c
new file:   target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig_lz77.c
new file:   target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig_lz77.h
modified:   target/linux/generic/files/drivers/platform/mikrotik/routerboot.h
@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 12 times, most recently from 80a8302 to 27d8dd2 Compare June 27, 2024 21:09
@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 2 times, most recently from bdb2027 to 32f0702 Compare July 19, 2024 22:33
@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 4 times, most recently from 04849db to ef6d0e9 Compare September 8, 2024 08:20
john-tho (Owner, Author) commented Sep 9, 2024

Hi @aanon4,

> I have two ath79-based devices (921GS-5HPacD-19s, one of their newer sectors); one boots fine while the other doesn't load the firmware.
> [Edit] Looking through the hard_config data, it is definitely LZ77 compressed.

I am gradually getting this together for upstream OpenWrt (openwrt#15774), and have made a few small changes to the code. I would be interested to see if you could try out this updated code, and/or email me a copy of hard_config and soft_config mtd partitions for that lz77 ath79 device?

> The output from that cat command:
> 6.48.6

There are a number of Mikrotik ath79 devices with ath10k PCIe cards, and I have not noticed any other OpenWrt forum or issue reports of no 5GHz Wi-Fi on them (but for some time they have had more modern replacements). My guess now is that this may occur (for ath79) if calibration data is updated (I am guessing by a RouterOS update) for US devices to enable U-NII 2 frequency ranges, or on recent factory flashes.

@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 2 times, most recently from b69f8ed to 6ef2e0a Compare September 10, 2024 01:11
aanon4 commented Sep 10, 2024

> Hi @aanon4,
>
> > I have two ath79-based devices (921GS-5HPacD-19s, one of their newer sectors); one boots fine while the other doesn't load the firmware.
> > [Edit] Looking through the hard_config data, it is definitely LZ77 compressed.
>
> I am gradually getting this together for upstream OpenWrt (openwrt#15774), and have made a few small changes to the code. I would be interested to see if you could try out this updated code, and/or email me a copy of hard_config and soft_config mtd partitions for that lz77 ath79 device?
>
> > The output from that cat command:
> > 6.48.6
>
> There are a number of Mikrotik ath79 devices with ath10k PCIe cards, and I have not noticed any other OpenWrt forum or issue reports of no 5GHz Wi-Fi on them (but for some time they have had more modern replacements). My guess now is that this may occur (for ath79) if calibration data is updated (I am guessing by a RouterOS update) for US devices to enable U-NII 2 frequency ranges, or on recent factory flashes.

@john-tho Unfortunately this is tricky. I work on the AREDN project, which is built on top of OpenWrt. The radio in question is currently at the top of a tower at the top of a mountain. I can't really run test code on it because ... well ... getting to it if it fails is a massive pain.

john-tho (Owner, Author) commented

> @john-tho Unfortunately this is tricky. I work on the AREDN project, which is built on top of OpenWrt. The radio in question is currently at the top of a tower at the top of a mountain. I can't really run test code on it because ... well ... getting to it if it fails is a massive pain.

Haha, yes, no worries, kind of expected this. I too have not-so-easily-accessible devices. Lab test devices are boring, field tests (or deploys) are much more interesting.
If someone could possibly remote into the device and pull a dump of the mtd hard_config partition, that would at least allow me to run some sanity checks on my code. soft_config could also tell us whether the primary RouterBOOT had been updated.
Also, please let me know if you hear of any other ath79 devices that need this (that is, use LZ77 calibration data).
Cheers.

@john-tho john-tho force-pushed the mikrotik_wlan_lz77 branch 4 times, most recently from 79bc65f to 9132270 Compare October 8, 2024 08:40
A number of new (or with recently updated caldata)
Mikrotik devices are using LZ77 magic for wlan tag hard_config data.
New devices include the Chateau LTE12 [1], and ax devices [2]
Newly factory flashed devices may include the hap ac3 [3]

This can be seen in decoded OEM supout [4] dmesg:
"radio data lz77 decompressed from"…

Investigating an arm RouterOS flash.ko module, and supplied example
hard_config dumps, the format was guessed via decompilation and live
debugging [5]. This decoder was then built from the guessed format
specification.

debug prints can be enabled in a DYNAMIC_DEBUG kernel build via the
kernel cmdline:

        chosen {
-               bootargs = "console=ttyS0,115200";
+               bootargs = "console=ttyS0,115200 dyndbg=\"file drivers/platform/mikrotik/* +p\"";
        };

[1]: https://forum.openwrt.org/t/no-wireless-mikrotik-rbd53ig-5hacd2hnd/157763/4
[2]: https://forum.openwrt.org/t/mikrotik-routeros-v7-x-and-openwrt-sysupgrade/148072/17
[3]: https://forum.openwrt.org/t/adding-support-for-mikrotik-hap-ax2/133715/47
[4]: https://github.com/farseeker/go-mikrotik-rif
[5]: https://github.com/john-tho/routeros-wlan-lz77-decode

Signed-off-by: John Thomson <[email protected]>
Link: openwrt#15774
Signed-off-by: Robert Marko <[email protected]>
@john-tho john-tho merged commit 7d33aed into master Oct 24, 2024