[BUG] kasan read access error in umm_initialize #12855

Closed
1 task done
Rrooach opened this issue Aug 6, 2024 · 9 comments
Labels
Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Area: Kernel Kernel issues Area: Memory Management Memory Management issues OS: Linux Issues related to Linux (building system, etc) Type: Bug Something isn't working

Comments

@Rrooach

Rrooach commented Aug 6, 2024

Description / Steps to reproduce the issue

I'm encountering an illegal memory read error when running a NuttX kernel built with ASAN enabled for full-image instrumentation. The kernel fails to run because of this error.

Steps to Reproduce:

  1. Build NuttX with ASAN enabled.
  2. Start the kernel using QEMU with the following command:
    qemu-system-riscv64 -semihosting -M virt,aclint=on -cpu rv64 -smp 8 -bios none -kernel /path/to/nuttx/build_t/nuttx -nographic
  3. The console shows the following output:
    kasan_report: kasan detected a read access error, address at 0x81f81580, size is 8, return address: 0x80006ec2

GDB Debugging Session:

  1. Connect to QEMU using GDB:
    gdb /path/to/nuttx/build_t/nuttx
    (gdb) target remote :1234
  2. Set breakpoints and continue execution:
    (gdb) b *0x81f81580
    Breakpoint 2 at 0x81f81580
    (gdb) b *0x80006ec2
    Breakpoint 3 at 0x80006ec2: file /path/to/nuttx/mm/kasan/kasan.c, line 110.
    (gdb) c
    Continuing.
  3. Backtrace upon hitting the breakpoint:
    Thread 1 hit Breakpoint 3, 0x0000000080006ec2 in kasan_mem_to_shadow (ptr=ptr@entry=0x81f81587,
        bit=bit@entry=0x80055afc <waiter_state>, size=1)
        at /path/to/nuttx/mm/kasan/kasan.c:110
    (gdb) bt
    #0  0x0000000080006ec2 in kasan_mem_to_shadow (ptr=ptr@entry=0x81f81587,
        bit=bit@entry=0x80055afc <waiter_state>, size=1)
        at /path/to/nuttx/mm/kasan/kasan.c:110
    #1  0x0000000080007078 in kasan_is_poisoned (size=8, addr=0x81f81580)
        at /path/to/nuttx/mm/kasan/kasan.c:162
    ...
    #49 0x0000000080015700 in mm_addregion (heap=heap@entry=0x80056340,
        heapstart=heapstart@entry=0x800565e8, heapsize=<optimized out>, heapsize@entry=33200664)
        at /path/to/nuttx/mm/mm_heap/mm_initialize.c:140
    #50 0x000000008001584e in mm_initialize (name=name@entry=0x8003d990 "Umem", heapstart=0x800565e8,
        heapstart@entry=0x80056340, heapsize=33200664, heapsize@entry=33201344)
        at /path/to/nuttx/mm/mm_heap/mm_initialize.c:279
    #51 0x0000000080015638 in umm_initialize (heap_start=0x80056340, heap_size=33201344)
        at /path/to/nuttx/mm/umm_heap/umm_initialize.c:89
    #52 0x0000000080008d58 in nx_start () at /path/to/nuttx/sched/init/nx_start.c:584
    #53 0x00000000800005ee in qemu_rv_start (mhartid=<optimized out>, dtb=0x87e00000 "\320\r\376\355")
        at /path/to/nuttx/arch/risc-v/src/qemu-rv/qemu_rv_start.c:171
    #54 0x000000008000004c in _stext ()
        at /path/to/nuttx/arch/risc-v/src/qemu-rv/qemu_rv_head.S:74

KASAN reports an illegal memory read in the call chain that starts in nx_start() and goes through umm_initialize(), and the kernel fails as a result. Debugging with GDB points to the kasan_mem_to_shadow function in kasan.c.
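As a quick sanity check on the reported address, the numbers in the backtrace can be compared directly. The sketch below only does arithmetic on values copied from the report (heap_start, heap_size, and the faulting address); the helper names are made up for illustration and are not NuttX functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the kasan_report line and the backtrace above. */
#define HEAP_START  UINT64_C(0x80056340)  /* umm_initialize heap_start */
#define HEAP_SIZE   UINT64_C(33201344)    /* umm_initialize heap_size  */
#define FAULT_ADDR  UINT64_C(0x81f81580)  /* address in kasan_report   */

/* One-past-the-end address of the region handed to umm_initialize(). */
static uint64_t heap_end(void)
{
  return HEAP_START + HEAP_SIZE;
}

/* True if addr lies inside [HEAP_START, heap_end()). */
static bool in_heap(uint64_t addr)
{
  return addr >= HEAP_START && addr < heap_end();
}

/* Distance in bytes from addr up to the end of the heap region. */
static uint64_t bytes_below_end(uint64_t addr)
{
  return heap_end() - addr;
}
```

With these values the region ends at 0x82000000, and the reported address lies inside it, 518784 bytes below the end. If the shadow map used one bit per 8-byte word (an assumption, not confirmed in this thread), a bitmap covering the 33200664-byte region passed to mm_addregion would need roughly 518760 bytes, which is close to that distance. That may be worth checking against the actual shadow layout in kasan.c.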

Do you have any idea what might cause this error?

On which OS does this issue occur?

[Linux]

What is the version of your OS?

Ubuntu

NuttX Version

master / git version 4197b5a

Issue Architecture

[risc-v]

Issue Area

[Kernel], [Memory Management]

Verification

  • I have verified before submitting the report.
@Rrooach Rrooach added the Type: Bug Something isn't working label Aug 6, 2024
@github-actions github-actions bot added Area: Kernel Kernel issues OS: Linux Issues related to Linux (building system, etc) Area: Memory Management Memory Management issues Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture labels Aug 6, 2024
@xiaoxiang781216
Contributor

@Gary-Hobson could you take a look?

@anchao
Contributor

anchao commented Aug 12, 2024

@anjiahao1

@anjiahao1
Contributor

It seems that KASAN is recursively checking its own accesses while it is being initialized. @Gary-Hobson
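To illustrate what such recursion means, here is a minimal sketch of a check routine that refuses to run before the shadow map is ready or while another check is already in progress. All names here are hypothetical and are not taken from NuttX's kasan.c. The flags are plain globals, so this is not SMP-safe as written (the QEMU command above uses -smp 8).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state, not taken from NuttX's kasan.c. */
static bool g_shadow_ready;    /* set once the shadow map is usable    */
static bool g_in_check;        /* set while a check is in progress     */
static unsigned g_checks_run;  /* counts checks that actually executed */

/* Stand-in for a real shadow lookup; always reports "not poisoned". */
static bool shadow_lookup(const void *addr, size_t size)
{
  (void)addr;
  (void)size;
  g_checks_run++;
  return false;
}

/* Returns true if the access should be reported as invalid. */
static bool checked_access(const void *addr, size_t size)
{
  bool bad;

  /* Skip the check if it is too early or we are already inside one. */
  if (!g_shadow_ready || g_in_check)
    {
      return false;
    }

  g_in_check = true;
  bad = shadow_lookup(addr, size);
  g_in_check = false;
  return bad;
}
```

A guard like this only hides the recursion. Another common approach is to build the sanitizer runtime itself without instrumentation, for example with GCC's `__attribute__((no_sanitize_address))` on its functions, so that its own loads and stores are never checked.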

@Gary-Hobson
Contributor

@Rrooach After updating to the latest mainline code, I tested with boards/risc-v/qemu-rv/rv-virt/configs/nsh64 and saw nothing unusual.

Can you provide a way to reproduce this on the mainline code?

[screenshot]

defconfig:

diff --git a/boards/risc-v/qemu-rv/rv-virt/configs/nsh64/defconfig b/boards/risc-v/qemu-rv/rv-virt/configs/nsh64/defconfig
index 68120193b9..485e8c96d7 100644
--- a/boards/risc-v/qemu-rv/rv-virt/configs/nsh64/defconfig
+++ b/boards/risc-v/qemu-rv/rv-virt/configs/nsh64/defconfig
@@ -47,6 +47,7 @@ CONFIG_LIBC_EXECFUNCS=y
 CONFIG_LIBC_PERROR_STDOUT=y
 CONFIG_LIBC_STRERROR=y
 CONFIG_LIBM=y
+CONFIG_MM_KASAN=y
 CONFIG_NFILE_DESCRIPTORS_PER_BLOCK=6
 CONFIG_NSH_ARCHINIT=y
 CONFIG_NSH_BUILTIN_APPS=y

Startup Command

riscv-none-elf-gcc -v
Using built-in specs.
COLLECT_GCC=riscv-none-elf-gcc
COLLECT_LTO_WRAPPER=/home/gary/.tools/risc-v/bin/../libexec/gcc/riscv-none-elf/12.3.0/lto-wrapper
Target: riscv-none-elf
Configured with: /__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/sources/gcc-12.3.0/configure --prefix=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/application --with-sysroot=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/application/riscv-none-elf --with-native-system-header-dir=/include --infodir=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/x86_64-pc-linux-gnu/install/share/info --mandir=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/x86_64-pc-linux-gnu/install/share/man --htmldir=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/x86_64-pc-linux-gnu/install/share/html --pdfdir=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/x86_64-pc-linux-gnu/install/share/pdf --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=riscv-none-elf --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libsanitizer --disable-libssp --disable-nls --disable-shared --disable-threads --disable-tls --enable-checking=release --enable-languages=c,c++,fortran --with-gmp=/__w/riscv-none-elf-gcc-xpack/riscv-none-elf-gcc-xpack/build/linux-x64/x86_64-pc-linux-gnu/install --with-newlib --with-pkgversion='xPack GNU RISC-V Embedded GCC x86_64' --with-gnu-as --with-gnu-ld --with-system-zlib --with-abi=ilp32 --with-arch=rv32imac --enable-multilib
Thread model: single
Supported LTO compression algorithms: zlib zstd
gcc version 12.3.0 (xPack GNU RISC-V Embedded GCC x86_64) 

./tools/configure.sh rv-virt:nsh64 
make menuconfig
make -j
qemu-system-riscv64 -semihosting -M virt -cpu rv64 -smp 8 \
      -chardev stdio,id=con,mux=on \
      -serial chardev:con \
      -device virtio-gpu-device,xres=640,yres=480,bus=virtio-mmio-bus.0 \
      -mon chardev=con,mode=readline \
      -bios none -kernel nuttx/nuttx -s -S

@Rrooach
Author

Rrooach commented Aug 12, 2024

The git version I used is 4197b5a.
Hi, this is a screenshot of the menuconfig settings I used when compiling NuttX:
[screenshot]

And when I try the latest mainline, I get:

serial/uart_16550.c:230:21: error: 'CONFIG_16550_UART0_RX_TRIGGER' undeclared here (not in a function); did you mean 'CONFIG_16550_UART0_RXBUFSIZE'?
  230 |   .rxtrigger      = CONFIG_16550_UART0_RX_TRIGGER,
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     CONFIG_16550_UART0_RXBUFSIZE
[97/1243] Building C objec...fs.dir/fs_initialize.c.obj

@Rrooach Rrooach closed this as not planned Aug 12, 2024
@Rrooach
Author

Rrooach commented Aug 12, 2024

Sorry, I closed this by mistake.

@anchao
Contributor

anchao commented Aug 12, 2024

serial/uart_16550.c:230:21: error: 'CONFIG_16550_UART0_RX_TRIGGER' undeclared here (not in a function); did you mean 'CONFIG_16550_UART0_RXBUFSIZE'?
  230 |   .rxtrigger      = CONFIG_16550_UART0_RX_TRIGGER,
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     CONFIG_16550_UART0_RXBUFSIZE
[97/1243] Building C objec...fs.dir/fs_initialize.c.obj

@Rrooach This configuration option was added recently in #12830; you may need to clear the out directory and compile again.

@Rrooach
Author

Rrooach commented Aug 12, 2024

@Gary-Hobson Hi, I don't quite understand what is going on here, but after pulling the latest mainline (2ff2b82), I compiled with:
[screenshot]

and then started QEMU:

qemu-system-riscv64 -semihosting -M virt,aclint=on -cpu rv64 -smp 8 -bios none -kernel ./build_tt/nuttx -nographic  -S -s

I still get a memory error:

Thread 1 hit Breakpoint 2, exception_common ()
    at /path/to/nuttx/arch/risc-v/src/common/riscv_exception_common.S:112
112	  addi       sp, sp, -XCPTCONTEXT_SIZE
(gdb) bt
#0  exception_common ()
    at /path/to/nuttx/arch/risc-v/src/common/riscv_exception_common.S:112
#1  0x00000000800075dc in kasan_set_poison (
    addr=addr@entry=0x80056dd0, size=<optimized out>,
    poisoned=poisoned@entry=false)
    at /path/to/nuttx/mm/kasan/kasan.c:188
#2  0x00000000800076ac in kasan_unpoison (addr=addr@entry=0x80056dd0,
    size=<optimized out>)
    at /path/to/nuttx/mm/kasan/kasan.c:241
#3  0x0000000080007d52 in mm_malloc (heap=heap@entry=0x80056b00,
    size=<optimized out>, size@entry=128)
    at /path/to/nuttx/mm/mm_heap/mm_malloc.c:325
#4  0x0000000080007d84 in mm_zalloc (heap=0x80056b00,
    size=size@entry=128)
    at /path/to/nuttx/mm/mm_heap/mm_zalloc.c:45
#5  0x00000000800078a8 in zalloc (size=128)
    at /path/to/nuttx/mm/umm_heap/umm_zalloc.c:70
#6  0x0000000080008ec8 in nx_start ()
    at /path/to/nuttx/sched/init/nx_start.c:613
#7  0x00000000800005ee in qemu_rv_start (mhartid=<optimized out>,
    dtb=0x87e00000 "\320\r\376\355")
    at /path/to/nuttx/arch/risc-v/src/qemu-rv/qemu_rv_start.c:220
#8  0x0000000080000048 in _stext ()
    at /path/to/nuttx/arch/risc-v/src/qemu-rv/qemu_rv_head.S:76
Backtrace stopped: frame did not save the PC

However, NuttX runs successfully if I disable "enable asan for the entire image".

@Gary-Hobson
Contributor

Thread 1 hit Breakpoint 2, exception_common ()
    at /path/to/nuttx/arch/risc-v/src/common/riscv_exception_common.S:112
112	  addi       sp, sp, -XCPTCONTEXT_SIZE
(gdb) bt
#0  exception_common ()
    at /path/to/nuttx/arch/risc-v/src/common/riscv_exception_common.S:112
#1  0x00000000800075dc in kasan_set_poison (
    addr=addr@entry=0x80056dd0, size=<optimized out>,
    poisoned=poisoned@entry=false)
    at /path/to/nuttx/mm/kasan/kasan.c:188

In the error above, it seems that a data abort or unaligned access occurred. This is not KASAN actively panicking after detecting an error.

The following error is different: when executing the code at address 0x80006ec2, KASAN detects an access to unallocated memory (address 0x81f81580, size 8):

kasan_report: kasan detected a read access error, address at 0x81f81580, size is 8, return address: 0x80006ec2
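One way to tell a hardware trap apart from a KASAN report is to look at the trap cause when execution stops in exception_common. The helper below is a small sketch that maps an RV64 mcause value to the exception names defined in the RISC-V privileged specification; the function name is hypothetical, and how the mcause value is obtained (for example from the debugger) is left open.

```c
#include <stdint.h>

/* Map an RV64 mcause value to a short description.
 * Bit 63 set means the trap was an interrupt, not an exception.
 */
static const char *riscv_cause_name(uint64_t mcause)
{
  if (mcause >> 63)
    {
      return "interrupt";
    }

  switch (mcause)
    {
      case 0:  return "instruction address misaligned";
      case 1:  return "instruction access fault";
      case 2:  return "illegal instruction";
      case 3:  return "breakpoint";
      case 4:  return "load address misaligned";
      case 5:  return "load access fault";
      case 6:  return "store/AMO address misaligned";
      case 7:  return "store/AMO access fault";
      case 12: return "instruction page fault";
      case 13: return "load page fault";
      case 15: return "store/AMO page fault";
      default: return "other";
    }
}
```

A cause of 4 or 6 would point to an unaligned access and 5 or 7 to an access fault, which would match the first backtrace, where the trap is entered from kasan_set_poison rather than from a kasan_report call.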

@Rrooach Can you provide more information for analysis?
The defconfig, ELF file, and log would all help in analyzing the problem.
If you can provide a way to reproduce the problem on the mainline code, it can be solved quickly.

5 participants