Ubuntu 20.04 suddenly exits on v62 (never seen before) #22

LeCmnGend · 2022-01-19T13:58:19Z

After starting it again, it shows:
An existing connection was forcibly closed by the remote host.
Press any key to continue...
My .wslconfig file (Windows 10 21H2, 16 GB RAM, AMD 4800H, 8 cores / 16 threads):
[wsl2]
kernel = C:\Users\bzImage
processors=13
memory=9.5GB
swap=0.5GB
swapFile=%USERPROFILE%\AppData\Local\Temp\swap.vhdx
localhostForwarding=true

nathanchance · 2022-01-19T15:05:16Z

Is that kernel path right? I would have expected it to be somewhere in your user's folder (i.e., C:\Users\you), rather than the Users folder itself.

Otherwise, I do not recall ever seeing that error. Does it happen all the time?

LeCmnGend · 2022-01-19T16:41:13Z

The path is right (I have two user profiles, so I put the kernel in C:\Users :)) )
It happens sometimes.
The most recent occurrence, today:
First, I opened Ubuntu and ran repo sync (crDroid source).
Second, I opened another Ubuntu window and ran git clone (my device tree).
I think that caused the problem.
I tried to restore the host to its defaults using this solution, but it did not work:
microsoft/WSL#4105

LeCmnGend · 2022-01-19T16:43:09Z

How can I grab the Ubuntu app logs, sir?

nathanchance · 2022-01-19T16:48:12Z

I have no idea unfortunately. If you remove the kernel line, does it still occur?

LeCmnGend · 2022-01-20T07:21:02Z

> I have no idea unfortunately. If you remove the kernel line, does it still occur?

Something conflicts in v62, sir; v61 works fine.
With the default kernel there is no sudden exit (but it is very laggy and slow).

LeCmnGend · 2022-01-20T16:16:01Z

After a day on v61, it works fine, sir.

nathanchance · 2022-01-20T16:28:44Z

WSL is pretty much a black box for debugging so since my builds work perfectly fine, I cannot really do much here. I'll probably build a v63 here in the next couple of days that could resolve your issue by chance.

LeCmnGend · 2022-01-24T12:58:15Z

I tried v62 again today. When compiling the ROM, it ate all my RAM (as usual) and then force-stopped again. No luck for me.

nathanchance · 2022-01-26T00:37:44Z

Could you try this kernel and see if there is any improvement? There was an out of bounds array access that showed up with the latest version of the dxgkrnl driver that I have now fixed, which could potentially be problematic. Additionally, this has a few more -next versions in it, which could fix a transient bug.

LeCmnGend · 2022-01-26T06:43:39Z

> Could you try this kernel and see if there is any improvement? There was an out of bounds array access that showed up with the latest version of the dxgkrnl driver that I have now fixed, which could potentially be problematic. Additionally, this has a few more -next versions in it, which could fix a transient bug.

The file is empty, sir.

nathanchance · 2022-01-26T16:11:02Z

Weird, don't know how that happened.... Try this one.

LeCmnGend · 2022-01-26T16:51:18Z

> Weird, don't know how that happened.... Try this one.

I will reply tomorrow; 15 minutes in and it has still not exited :-D

LeCmnGend · 2022-01-26T17:48:57Z

When I compile the ROM:
/bin/sh: 1: Cannot fork

Ninja exits with error code 137.
Googling both errors suggests they are caused by running out of memory.
Back on v61, everything is still fine. So strange.
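
For context on the 137 status: by shell convention, an exit status above 128 means the process was terminated by a signal whose number is the status minus 128. So 137 corresponds to signal 9, SIGKILL, which is consistent with (though not proof of) the kernel's out-of-memory killer stopping the build. A minimal Python sketch of that arithmetic follows; `describe_exit_status` is a hypothetical helper name used only for illustration, not part of Ninja or WSL.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper for illustration)."""
    if status > 128:
        # Shells report "128 + signal number" when a signal kills the process.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with status {status}"


if __name__ == "__main__":
    print(describe_exit_status(137))
```

Run on Linux, this reports signal 9 (SIGKILL) for status 137. Checking `dmesg` inside the distribution for out-of-memory messages would be the usual way to confirm the cause.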

nathanchance · 2022-01-26T21:58:35Z

The most memory-intensive workload that I have at this point is compiling a kernel with full LTO, which I can complete just fine with that kernel. My machine does have a Ryzen 9 3900X and 64GB of RAM, but I only give WSL2 8 cores and 16GB of RAM.

At this point, I am not really sure what could be going wrong.

LeCmnGend · 2022-01-27T05:50:17Z

Fixed: I increased the swap file to 64 GB, lol.
But v63 still exits suddenly. What is included in this zip that is not in v63, sir?
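
One way to check that a changed memory or swap setting actually took effect is to read /proc/meminfo inside the distribution after running `wsl --shutdown` and starting it again. The sketch below is only an illustration under that assumption; `parse_meminfo` and `summarize` are hypothetical helper names, and the script reports whatever the running kernel exposes.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   <number> [kB]" lines.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def summarize(info: dict[str, int]) -> str:
    """Report total RAM and swap in GiB (MemTotal and SwapTotal are in kB)."""
    kb_per_gib = 1024 * 1024
    ram = info.get("MemTotal", 0) / kb_per_gib
    swap = info.get("SwapTotal", 0) / kb_per_gib
    return f"RAM {ram:.1f} GiB, swap {swap:.1f} GiB"


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux, including WSL2
        print(summarize(parse_meminfo(meminfo.read_text())))
```

If the swap total printed here does not match the value in .wslconfig, the new setting was probably not picked up.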

nathanchance · 2022-01-27T14:41:08Z

There is nothing in that zip that is not in v63.

LeCmnGend · 2022-01-28T14:00:08Z

> There is nothing in that zip that is not in v63.

Thanks again; this zip (maybe v62.5) works fine for me too.


nathanchance added the wontfix, bug, and unreproducible labels and removed the wontfix label on Jan 26, 2022

nathanchance closed this as completed Jan 28, 2022
