
reschedule_from_interrupt assert(sched::exception_depth <= 1) when running specjbb2015 #933

Open
yuwang888 opened this issue Jan 2, 2018 · 5 comments


@yuwang888

yuwang888 commented Jan 2, 2018

page fault outside application, addr: 0xffff800240000000 pc:0x00000000003e0e3c text_start:0x0000000000202000 text_end 0x000000000059df08
This means the page fault is in OSv kernel code, but I don't know how to debug it.
[registers]
RIP: 0x00000000003e0e3c memory::page_range_allocator::free(memory::page_range*)+268
RFL: 0x0000000000010202 CS: 0x0000000000000008 SS: 0x0000000000000010
RAX: 0x0000000000000000 RBX: 0xffff80023fe00000 RCX: 0Assertion failed: sched::exception_depth <= 1 (core/sched.cc: reschedule_from_interrupt: 238)

Here is the call trace. The command is "./scripts/run.py -m 8G -c 12"; if I use "./scripts/run.py -m 8G -c 4" it doesn't crash.

Thread 9 (Thread 9):
#0  0x000000000039c942 in cli_hlt () at arch/x64/processor.hh:247
#1  halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:24
#3  0x0000000000226405 in abort (fmt=fmt@entry=0x635840 "Assertion failed: %s (%s: %s: %d)\n") at runtime.cc:132
#4  0x0000000000226449 in __assert_fail (expr=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>) at runtime.cc:139
#5  0x00000000003f4a64 in sched::cpu::reschedule_from_interrupt (this=0xffff8001004e5040, called_from_yield=called_from_yield@entry=false, preempt_after=..., preempt_after@entry=...)
    at core/sched.cc:238
#6  0x0000000000397769 in preempt () at include/osv/sched.hh:991
#7  interrupt (frame=0xffff8000bff7d068) at arch/x64/exceptions.cc:266
#8  <signal handler called>
#9  putchar (ch=112 'p') at drivers/isa-serial.cc:108
#10 console::isa_serial_console::write (this=<optimized out>, str=0xb900c1 <debug_ll::msg+1> "xffffa0010144c9ff <???+21285375>\n", len=<optimized out>) at drivers/isa-serial.cc:79
#11 0x0000000000340fc8 in drivers_write (len=1, str=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", this=0xb7a6e0 <console::mux>) at drivers/console-multiplexer.cc:49
#12 operator() (len=1, str=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", __closure=<optimized out>) at drivers/console-multiplexer.cc:36
#13 std::_Function_handler<void(char const*, long unsigned int), console::console_multiplexer::start()::__lambda2>::_M_invoke(const std::_Any_data &, const char *, unsigned long) (
    __functor=..., __args#0=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", __args#1=1) at /usr/include/c++/4.8.2/functional:2071
#14 0x0000000000341968 in console::LineDiscipline::write(char const*, unsigned long, std::function<void (char const*, unsigned long)>&) (this=0xffffa00101f59500, 
    str=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", len=56, writer=...) at drivers/line-discipline.cc:179
#15 0x00000000003414a1 in console::console_multiplexer::write_ll (this=this@entry=0xb7a6e0 <console::mux>, str=str@entry=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", 
    len=<optimized out>) at drivers/console-multiplexer.cc:71
#16 0x0000000000340c11 in console::write_ll (msg=msg@entry=0xb900c0 <debug_ll::msg> "0xffffa0010144c9ff <???+21285375>\n", len=<optimized out>) at drivers/console.cc:63
#17 0x00000000003d3820 in debug_ll (fmt=<optimized out>) at core/debug.cc:250
#18 0x000000000032f93f in mmu::vm_sigsegv (addr=<optimized out>, ef=0xffff8001004e7048) at core/mmu.cc:1314
#19 0x0000000000330b0b in mmu::vm_fault (addr=<optimized out>, addr@entry=18446603345884872704, ef=ef@entry=0xffff8001004e7048) at core/mmu.cc:1338
#20 0x0000000000395fe8 in page_fault (ef=0xffff8001004e7048) at arch/x64/mmu.cc:38
#21 <signal handler called>
#22 remove (pr=..., this=0xb912e0 <memory::free_page_ranges>) at core/mempool.cc:604
#23 memory::page_range_allocator::free (this=this@entry=0xb912e0 <memory::free_page_ranges>, pr=pr@entry=0xffff80023fe00000) at core/mempool.cc:772
#24 0x00000000003e11dd in memory::free_page_range_locked (range=range@entry=0xffff80023fe00000) at core/mempool.cc:1041
#25 0x00000000003e11fe in memory::free_page_range (range=0xffff80023fe00000) at core/mempool.cc:1049
#26 0x00000000003e126a in free_page_range (size=size@entry=2097152, addr=addr@entry=0xffff80023fe00000) at core/mempool.cc:1056
#27 memory::free_huge_page (v=v@entry=0xffff80023fe00000, N=N@entry=2097152) at core/mempool.cc:1497
#28 0x0000000000336271 in set_pte<1> (this=<optimized out>, pte=..., ptep=..., addr=0xffff80023fe00000) at core/mmu.cc:1024
#29 mmu::uninitialized_anonymous_page_provider::map (this=<optimized out>, offset=79691776, ptep=..., pte=..., write=<optimized out>) at core/mmu.cc:1038
#30 0x000000000033f6ef in page<1> (offset=79691776, ptep=..., this=0xffff8000bff80be0) at core/mmu.cc:533
#31 page<mmu::populate<(mmu::account_opt)0>, 1> (offset=79691776, ptep=..., pops=...) at core/mmu.cc:312
#32 mmu::map_level<mmu::populate<(mmu::account_opt)0>, 2>::operator() (this=this@entry=0xffff8000bff80b20, parent=..., base_virt=34059845632, base_virt@entry=33285996544) at core/mmu.cc:438
#33 0x000000000033f976 in map_range<2> (this=<optimized out>, base_virt=<optimized out>, ptep=..., slop=<optimized out>, page_mapper=..., size=<optimized out>, vcur=<optimized out>)
    at core/mmu.cc:400
#34 mmu::map_level<mmu::populate<(mmu::account_opt)0>, 3>::operator() (this=0xffff8000bff80c00, parent=..., base_virt=33285996544) at core/mmu.cc:440
#35 0x000000000033fc8b in map_range<3> (this=<optimized out>, base_virt=0, ptep=..., slop=4096, page_mapper=..., size=2097152, vcur=<optimized out>) at core/mmu.cc:400
#36 operator() (base_virt=0, parent=..., this=<synthetic pointer>) at core/mmu.cc:450
#37 map_range<mmu::populate<(mmu::account_opt)0> > (slop=4096, page_mapper=..., size=<optimized out>, vstart=34059845632, vma_start=33980153856) at core/mmu.cc:355
#38 operate_range<mmu::populate<(mmu::account_opt)0> > (size=<optimized out>, start=<optimized out>, vma_start=0x7e9600000, mapper=...) at core/mmu.cc:807
#39 operate_range<mmu::populate<(mmu::account_opt)0> > (size=34059845632, addr=0xffff800000be6000, mapper=..., this=0x7fffffffff) at core/mmu.cc:1413
#40 mmu::populate_vma<(mmu::account_opt)0> (vma=vma@entry=0xffffa000a6554800, v=v@entry=0x7ee200000, size=size@entry=2097152, write=<optimized out>) at core/mmu.cc:1209
#41 0x00000000003356a8 in mmu::vma::fault (this=0xffffa000a6554800, addr=34059845632, ef=<optimized out>) at core/mmu.cc:1440
#42 0x0000000000330adc in mmu::vm_fault (addr=<optimized out>, addr@entry=34060054424, ef=ef@entry=0xffff8000bff81068) at core/mmu.cc:1342
#43 0x0000000000395fe8 in page_fault (ef=0xffff8000bff81068) at arch/x64/mmu.cc:38
#44 <signal handler called>
#45 0x0000100001c3c1c8 in oopDesc* PSPromotionManager::copy_to_survivor_space<false>(oopDesc*) ()
#46 0x0000100001c3b848 in PSPromotionManager::drain_stacks_depth(bool) ()
#47 0x0000100001c3f3aa in StealTask::do_it(GCTaskManager*, unsigned int) ()
#48 0x0000100001948c62 in GCTaskThread::run() ()
#49 0x0000100001bd0f22 in java_start(Thread*) ()
#50 0x000000000044e836 in operator() (__closure=0xffffa001058b8200) at libc/pthread.cc:114
#51 std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::__lambda6>::_M_invoke(const std::_Any_data &) (
    __functor=...) at /usr/include/c++/4.8.2/functional:2071
#52 0x00000000003f6558 in main (this=0xffff8000bff7c040) at core/sched.cc:1209
#53 sched::thread_main_c (t=0xffff8000bff7c040) at arch/x64/arch-switch.hh:170
#54 0x0000000000395db6 in thread_main () at arch/x64/entry.S:113
@nyh
Contributor

nyh commented Jan 2, 2018

Hi, thanks, the gdb stack trace is an excellent start. Does this happen all the time, or intermittently?
We have to start looking at this stack trace from the bottom, because near the top we see a cascade of unrelated errors (e.g., a context switch while handling a previous error caused an assertion failure), which may be bugs too, but are unrelated to the primary bug.
In frames 43-41, it seems we had a page fault on newly allocated memory and needed to map a new huge page (a 2 MB page), and frames 40-28 seem to be the implementation of this mapping.
In frame 27, we reached a supposedly rare (?) case described in commit 4bd8720: if two threads concurrently page-fault on the same page and both allocate memory for it, one will fail and free the memory it allocated. So frame 27 frees that 2 MB page.
But this free_huge_page() call crashes with a page fault (in frame 20). The details are strange. In frame 23,
pr=0xffff80023fe00000, which is in the physical-map area; so is the page-fault address you saw printed (0xffff800240000000). BUT 0x240000000 is over 9 GB, while you only gave the VM 8 GB, so that is why we got the page fault (frame 20). But why did we get this address?

Here I don't know how to continue. If you can easily reproduce this bug, you can try adding an assert() in free_huge_page() to crash immediately when the given address is in the physical map (0xffff8...) but beyond 8 GB. You can add a similar assert in alloc_huge_page() to see if it ever returns such an impossible address.
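A minimal sketch of such a check, assuming the 1:1 physical map starts at 0xffff800000000000 and the guest has 8 GB of RAM; the helper name and the exact call sites in core/mempool.cc are made up for illustration:

#include <assert.h>

// Hypothetical debug helper, not part of the OSv tree. The constants are
// assumptions: the linear physical map is taken to start at
// 0xffff800000000000 and the VM was started with -m 8G.
static inline void assert_addr_in_phys_map(const void* addr)
{
    const unsigned long phys_map_base = 0xffff800000000000ul;
    const unsigned long guest_ram = 8ul << 30;   // 0x200000000
    unsigned long a = (unsigned long)addr;
    assert(a >= phys_map_base && a < phys_map_base + guest_ram);
}

// Possible call sites (sketch):
//   memory::alloc_huge_page(): check the page range it is about to return
//   memory::free_huge_page(v, N): assert_addr_in_phys_map(v) on entry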

I'm sure there's a bug somewhere, but I still can't put my finger on it.

@yuwang888 yuwang888 reopened this Jan 2, 2018
@yuwang888
Author

@nyh
./scripts/run.py -m 8G -c 20 -V

pr: ffff80023fe00000   // this print is added by me; the address is larger than 8 GB
Assertion failed: (unsigned long)pr <= 0xffff800200000000 (core/mempool.cc: alloc_huge_page: 1482)

[backtrace]
0x0000000000226448 <__assert_fail+24>
0x00000000003e0a97 <memory::alloc_huge_page(unsigned long)+151>
0x00000000003361fd <mmu::uninitialized_anonymous_page_provider::map(unsigned long, mmu::hw_ptep<1>, mmu::pt_element<1>, bool)+45>
0x000000000033f6fe <mmu::map_level<mmu::populate<(mmu::account_opt)0>, 2>::operator()(mmu::hw_ptep<2>, unsigned long)+830>
0x000000000033f985 <mmu::map_level<mmu::populate<(mmu::account_opt)0>, 3>::operator()(mmu::hw_ptep<3>, unsigned long)+261>
0x000000000033fc9a <unsigned long mmu::populate_vma<(mmu::account_opt)0>(mmu::vma*, void*, unsigned long, bool)+602>
0x00000000003356b7 <mmu::vma::fault(unsigned long, exception_frame*)+87>
0x0000000000330aeb <mmu::vm_fault(unsigned long, exception_frame*)+235>
0x0000000000395ff7 <page_fault+119>
0x0000000000394e46 <???+3755590>
0x0000100001c3b847 <???+29603911>
0x0000100001c3f3a9 <???+29619113>
0x0000100001948c61 <???+26512481>
0x0000100001bd0f21 <???+29167393>
0x000000000044e8e5 <???+4516069>
0x00000000003f6607 <thread_main_c+39>
0x0000000000395dc5 <???+3759557>

@yuwang888
Author

@nyh @narkisr @pdziepak @anatol @benoit-canet
I think there is a bug here. I added debug info in alloc_huge_page:
when I assert that pr stays below 4 GB, the failing pr is ffff80023fe00000;
when I assert that pr stays below 8 GB, the failing pr is also ffff80023fe00000.
pr: ffff80023fe00000Assertion failed: (unsigned long)pr <= 0xffff800100000000 (core/mempool.cc: alloc_huge_page: 1482)

[backtrace]
0x0000000000226448 <__assert_fail+24>
0x00000000003e0a97 <memory::alloc_huge_page(unsigned long)+151>
0x00000000003361fd <mmu::uninitialized_anonymous_page_provider::map(unsigned long, mmu::hw_ptep<1>, mmu::pt_element<1>, bool)+45>
0x000000000033f6fe <mmu::map_level<mmu::populate<(mmu::account_opt)0>, 2>::operator()(mmu::hw_ptep<2>, unsigned long)+830>
0x000000000033f985 <mmu::map_level<mmu::pop

@yuwang888
Author

yuwang888 commented Jan 3, 2018

@nyh @narkisr @pdziepak @anatol @benoit-canet
I disabled huge pages and there is no crash anymore, so it is a huge-page bug. Please take a look at this change:

 void vma::fault(uintptr_t addr, exception_frame *ef)
 {
-    auto hp_start = align_up(_range.start(), huge_page_size);
-    auto hp_end = align_down(_range.end(), huge_page_size);
+//     auto hp_start = align_up(_range.start(), huge_page_size);
+//    auto hp_end = align_down(_range.end(), huge_page_size);
     size_t size;
-    if (!has_flags(mmap_jvm_balloon|mmap_small) && (hp_start <= addr && addr < hp_end)) {
-        addr = align_down(addr, huge_page_size);
-        size = huge_page_size;
-    } else {
+//     if (!has_flags(mmap_jvm_balloon|mmap_small) && (hp_start <= addr && addr < hp_end)) {
+//        addr = align_down(addr, huge_page_size);
+//        size = huge_page_size;
+//    } else {
         size = page_size;
-    }
+//    }

@nyh
Contributor

nyh commented Jan 3, 2018

Disabling huge pages will reduce performance, as the CPU's address cache (the TLB) will get filled with a lot of small pages instead of a few huge pages, so it's generally not a good idea to disable them, except for debugging of course.

I have to admit I don't remember all the details of our memory allocation, which @pdziepak rewrote, so nothing springs to mind as to why this bug happens. alloc_huge_page() returning an address bigger than 8 GB seems like a bug to me, but maybe there is an explanation for why this is legal? (I can't think of any.) If that is indeed the bug, you can try to figure out where this number comes from, and since it is so easy to recognize with an if(), you can perhaps add such if()s in various places to find what is causing this.
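For illustration, such an if() trap could look roughly like this (a sketch only: the 0xffff800200000000 limit is the assumed physical-map base plus the 8 GB given with -m 8G, and debug_ll()/abort() are used simply because they already appear in the backtraces above):

// Hypothetical check, not in the OSv sources: report and stop as soon as an
// "impossible" physical-map address (beyond the guest's 8 GB) shows up.
static inline void trap_bad_phys_addr(const void* p, const char* where)
{
    if ((unsigned long)p >= 0xffff800200000000ul) {   // assumed: physmap base + 8 GB
        debug_ll("impossible phys-map address %p seen in %s\n", p, where);
        abort();
    }
}

// For example: trap_bad_phys_addr(pr, "alloc_huge_page") on the value returned
// by alloc_huge_page(), or trap_bad_phys_addr(v, "free_huge_page") on entry.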

You said the bug is specific to having many CPUs. This might also indicate a race, a locking bug, or a memory overrun (like in commit 74543fc), so that when a lot of CPUs allocate memory concurrently, one gets wrong results. Of course, this is just a guess; I have no idea what the actual problem is. Issue #755 is another problem we saw in this area of the code, maybe it's related (but then again, maybe not).
