riscv_fork.c: Fix vfork() for kernel mode + SMP #13633

Merged
merged 2 commits into from
Sep 27, 2024


pussuw
Contributor

@pussuw pussuw commented Sep 26, 2024

Summary

There was an error in the fork() routine when system calls are in use: the child context was saved on the child's user stack. This is incorrect; the context must be saved on the kernel stack instead.

The result is a full system crash if (when) the child executes on a different CPU which does not have the same MMU mappings active.

Impact

Only the RISC-V platform with CONFIG_LIB_SYSCALL=y is affected.

Testing

rv-virt:knsh64
rv-virt:ksmp64

@github-actions github-actions bot added Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Size: S The size of the change in this PR is small labels Sep 26, 2024
@pussuw
Contributor Author

pussuw commented Sep 26, 2024

@hujun260 @lupyuen the fix for fork() is here. @lupyuen, could you please share your test code/script so I can run the same test as you did and verify that the fix works?

Alternatively you can of course run the test yourself, whichever you prefer.

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

Btw, in order to get ostest running at all I had to revert:
c5ecc49
e4a0470
c9bdb59

I guess this is already a known issue and is being fixed.

@lupyuen
Member

lupyuen commented Sep 26, 2024

@pussuw Thanks! I'll run the Stress Test now for knsh64.

FYI I'm using this Stress Test Script, changing https://github.com/apache/nuttx/tree/master to https://github.com/tiiuae/nuttx/tree/riscv_fork_kernel_fix

@lupyuen
Member

lupyuen commented Sep 26, 2024

@pussuw Sorry knsh64 OSTest is failing intermittently at vfork(): https://gist.github.com/lupyuen/6a74c80dbe35496761c24606056ec2c2

nsh> uname -a
NuttX 10.4.0 c6a3b44f75 Sep 26 2024 17:08:39 risc-v rv-virt
nsh> ostest
user_main: vfork() test
[   71.332000] _assert: Current Version: NuttX  10.4.0 c6a3b44f75 Sep 26 2024 17:08:38 risc-v
[   71.332000] _assert: Assertion failed : at file: common/riscv_fork.c:133 task: ostest process: ostest 0xc000001a
[   71.332000] up_dump_register: EPC: 0000000080213532
[   71.332000] up_dump_register: A0: 0000000080407030 A1: 0000000000000085 A2: 0000000200042020 A3: 00000000804082e0
[   71.332000] up_dump_register: A4: 000000000000000a A5: 0000000000000004 A6: 0000000000000000 A7: 0000000000000000
[   71.332000] up_dump_register: T0: 00000000802068d2 T1: 0000000000000007 T2: 0000000000000000 T3: 00000000c0206070
[   71.332000] up_dump_register: T4: 00000000c0206068 T5: 00000000c000b8f9 T6: 000000000000003d
[   71.332000] up_dump_register: S0: 0000000000000000 S1: 0000000080408f00 S2: 0000000080408f00 S3: 0000000000000000
[   71.332000] up_dump_register: S4: 0000000200042020 S5: 0000000000000000 S6: 0000000080219410 S7: 0000000000000000
[   71.332000] up_dump_register: S8: 0000000000000006 S9: 0000000080407250 S10: 0000000000000085 S11: 00000000804072d0
[   71.332000] up_dump_register: SP: 000000008040c860 FP: 0000000000000000 TP: 0000000080408f00 RA: 0000000080213532
[   71.332000] dump_stack: Kernel Stack:
[   71.332000] dump_stack:   base: 0x8040c030
[   71.332000] dump_stack:   size: 00003072
[   71.332000] dump_stack:     sp: 0x8040c860
[   71.332000] stack_dump: 0x8040c820: 0000000000000009 0000000080409058 000000008021bf00 0000000000000000 00000000c0202070 0000000080408f00 0000000000000000 00000000802137b4
[   71.332000] stack_dump: 0x8040c860: 00000000c000001a fffffffffffffffc 0000000000006010 00000000c0200000 0000000080408f00 0000000080407030 0000000000000000 0000000080219410
[   71.332000] stack_dump: 0x8040c8a0: 0000000000000085 ffff00587474754e 0000000000000c10 0000000080407f00 0000000080407f00 0000000080206b2a 00000000c0206000 2e30310080407f00
[   71.332000] stack_dump: 0x8040c8e0: 0000000000302e34 0000000080208704 3462336136630000 7065532035376634 3432303220363220 333a38303a373120 0000000080400038 0000000080203386
[   71.332000] stack_dump: 0x8040c920: 7369720000000000 0000000000762d63 0000000000000000 000000008040cc40 000000008040cf40 0000000000000000 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040c960: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000005 0000000200042022 0000000080408f00 00000000c000919a
[   71.332000] stack_dump: 0x8040c9a0: ffffffffffffffda 000000008020694c ffffffffffffffda 000000008020a362 0000000000000000 0000000000000020 0000000000000008 00000000c000919a
[   71.332000] stack_dump: 0x8040c9e0: ffffffffffffffda 00000000802068da 0000000000000008 0000000080200940 0000000000000000 0000000080200972 0000000200042020 00000000802001a2
[   71.332000] stack_dump: 0x8040ca20: 00000000c000919a 00000000c0008af8 00000000c0203ef0 0000000080408f00 0000000000000000 0000000000000000 00000000c01017b0 0000000000000000
[   71.332000] stack_dump: 0x8040ca60: 00000000c0202040 0000000000000000 0000000000000022 00000000c02003d8 00000000c0200448 0000000000000018 000000000000000a 00000000c0101f4e
[   71.332000] stack_dump: 0x8040caa0: 0000000000000009 0000000000000000 0000000000000005 00000000c0010a48 0000000000000005 0000000000000000 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040cae0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00000000c0008146 00000000c000b6f8 00000000c000b8f9 000000000000003d
[   71.332000] stack_dump: 0x8040cb20: 0000000200042020 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040cb60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040cba0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040cbe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00000000802034e6 0000000000000000 0000000000000000
[   71.332000] stack_dump: 0x8040cc20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ostest_main: Exiting with status 256

@lupyuen
Member

lupyuen commented Sep 26, 2024

@pussuw It's also failing at the Signals Test: https://gist.github.com/lupyuen/83c7e33837018a534eb5df09574efe99

user_main: nested signal handler test
signest_test: Starting signal waiter task at priority 101
waiter_main: Waiter started
waiter_main: Setting signal mask
waiter_main: Registering signal handler
waiter_main: Waiting on semaphore
signest_test: Started waiter_main pid=51
signest_test: Starting interfering task at priority 102
interfere_main: Waiting on semaphore
signest_test: Started interfere_main pid=52
signest_test: Simple case:
  Total signalled 1240  Odd=620 Even=620
  Total handled   1240  Odd=620 Even=620
  Total nested    0    Odd=0   Even=0  
signest_test: With task locking
  Total signalled 2480  Odd=1240 Even=1240
  Total handled   2480  Odd=1240 Even=1240
  Total nested    0    Odd=0   Even=0  
[   55.123000] riscv_exception: EXCEPTION: Instruction page fault. MCAUSE: 000000000000000c, EPC: 0000000000000000, MTVAL: 0000000000000000
[   55.123000] riscv_exception: Segmentation fault in PID 10: ostest
ostest_main: Exiting with status 2816

Here's how I built NuttX for knsh64: https://gist.github.com/lupyuen/6721a8e9bd8dd2156fab62c9b765837f

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

@pussuw It's also failing at the Signals Test: https://gist.github.com/lupyuen/83c7e33837018a534eb5df09574efe99


Yes this happens if you don't revert the 3 commits I mentioned.

@lupyuen
Member

lupyuen commented Sep 26, 2024

Yes this happens if you don't revert the 3 commits I mentioned.

@pussuw Sounds like things are getting complicated :-) I wonder if we should revert all of these for now and re-fix them one by one?

We are also blocking #13579. Milk-V Duo S has been unstable for a week, I'm getting nervous 😬

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

@pussuw Sorry knsh64 OSTest is failing intermittently at vfork(): https://gist.github.com/lupyuen/6a74c80dbe35496761c24606056ec2c2


@hujun260 already found the other issue with fork(): parent->xcp.regs can change due to a context switch. He tried disabling interrupts in e4a0470#diff-247bf40a98451576669424ddccd9ded0ea35b233d2fc4903a625a201d347adc8R117-R120, but this is not enough.

Interrupts are not the issue here. The issue is a potential context switch, which can of course happen in an ISR but is more likely to happen at some other synchronization point within the fork() call itself. The switch changes the value of parent->xcp.regs, so the integer register save area no longer points where we want it to point: at the user context that was saved when the user process first entered the kernel via ecall.

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

Yes this happens if you don't revert the 3 commits I mentioned.

@pussuw Sounds like things are getting complicated :-) I wonder if we should revert all of these for now and re-fix them one by one?

* [riscv: g_current_regs is only used to determine if we are in irq #13561](https://github.com/apache/nuttx/pull/13561)

* [riscv: add a return value to riscv_swint #13564](https://github.com/apache/nuttx/pull/13564)

* [arm: g_current_regs is only used to determine if we are in irq, #13444](https://github.com/apache/nuttx/pull/13444)

We are also blocking #13579. Milk-V Duo S has been unstable for a week, I'm getting nervous 😬

Yes there are several overlapping issues now, that have emerged due to insufficient testing. The fork() issue is not related to these 3 though, the issue is entirely my doing, the initial fork() implementation for kernel mode is simply a buggy piece of trash. I'll fix it.

@lupyuen
Member

lupyuen commented Sep 26, 2024

@pussuw Alternatively shall we push ahead and merge #13585, since we Stress-Tested knsh64 200 times, and all 200 times succeeded? We could patch any outstanding issues after the merge?

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

@pussuw Alternatively shall we push ahead and merge #13585, since we Stress-Tested knsh64 200 times, and all 200 times succeeded? We could patch any outstanding issues after the merge?

PR #13585 is now overloaded with 2 things:

In addition it does these 2 things in 1 single patch. The PR must be split into 2 parts (2 separate PRs) and 2 patches as the fixes are not related to each other.

@lupyuen
Member

lupyuen commented Sep 26, 2024

The PR must be split into 2 parts (2 separate PRs) and 2 patches as the fixes are not related to each other.

Hi @hujun260 could you help please? I'm sorry it's blocking a couple of fixes. Thanks!

We need to record the parent's integer register context upon exception
entry to a separate non-volatile area. Why?

Because xcp.regs can move due to a context switch within the fork() system
call, be it either via interrupt or a synchronization point.

Fix this by adding a "sregs" area where the saved user context is placed.
The critical section within fork() is also unnecessary.
@pussuw
Contributor Author

pussuw commented Sep 26, 2024

There is a fix for the race condition here as well.

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

@pussuw Thanks! I'll run the Stress Test now for knsh64.

FYI I'm using this Stress Test Script, changing https://github.com/apache/nuttx/tree/master to https://github.com/tiiuae/nuttx/tree/riscv_fork_kernel_fix

Looks like a nice and simple testbench. I'll shamelessly start using it locally myself :-P

@lupyuen
Member

lupyuen commented Sep 26, 2024

There is a fix for the race condition here as well.

Thanks @pussuw: This version seems to be stuck consistently at the Signals Test, I'm not able to test vfork() though: https://gist.github.com/lupyuen/a73fe449b1a916207fcb90ab0a6ebc44

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

There is a fix for the race condition here as well.

Thanks @pussuw: This version seems to be stuck consistently at the Signals Test, I'm not able to test vfork() though: https://gist.github.com/lupyuen/a73fe449b1a916207fcb90ab0a6ebc44

Yes the signal handler test is broken, I'm using this locally https://github.com/tiiuae/nuttx/tree/for_lups_testing
It reverts the 3 changes that break the signal handler test

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

There is a fix for the race condition here as well.

Thanks @pussuw: This version seems to be stuck consistently at the Signals Test, I'm not able to test vfork() though: https://gist.github.com/lupyuen/a73fe449b1a916207fcb90ab0a6ebc44

Yes the signal handler test is broken, I'm using this locally https://github.com/tiiuae/nuttx/tree/for_lups_testing It reverts the 3 changes that break the signal handler test

At least for me this ran 200 iterations of vfork_test() no problem.

@lupyuen
Member

lupyuen commented Sep 26, 2024

At least for me this ran 200 iterations of vfork_test() no problem.

Excellent @pussuw! I'm running 200 iterations of https://github.com/tiiuae/nuttx/tree/for_lups_testing and it seems OK for 30 iterations so far. How shall we proceed?

@pussuw
Contributor Author

pussuw commented Sep 26, 2024

At least for me this ran 200 iterations of vfork_test() no problem.

Excellent @pussuw! I'm running 200 iterations of https://github.com/tiiuae/nuttx/tree/for_lups_testing and it seems OK for 30 iterations so far. How shall we proceed?

This PR fixes the vfork issue so I think it can be safely merged without any other side effect.

As per the signal test regression, #13585 fixes it, but the PR must be cleaned up first before merging.
After both are merged I think we are in the clear, until some new issue arises.

@lupyuen
Member

lupyuen commented Sep 26, 2024

As per the signal test regression, #13585 fixes it, but the PR must be cleaned up first before merging.

Hi @hujun260 could you help please? Thank you so much! 🙏

Member

@lupyuen lupyuen left a comment


My Stress Test of rv-virt:knsh64 (reboot qemu + restart ostest) succeeded for all 200 iterations on https://github.com/tiiuae/nuttx/tree/for_lups_testing Thanks!

@hujun260
Contributor

As per the signal test regression, #13585 fixes it, but the PR must be cleaned up first before merging.

Hi @hujun260 could you help please? Thank you so much! 🙏

I updated my patch and removed the vfork-related modifications.

@xiaoxiang781216 xiaoxiang781216 merged commit 9ef76e3 into apache:master Sep 27, 2024
29 checks passed
@pussuw pussuw deleted the riscv_fork_kernel_fix branch September 27, 2024 05:19