Add "wait morphing" to improve condvar wake performance #24
ghost assigned nyh on Jul 18, 2013
Here are some tentative implementation ideas for the condvar/mutex wait morphing:
Done, in commit aa3a624 and adjacent commits.
penberg pushed a commit that referenced this issue on Aug 26, 2013:
Fix mincore() to deal with unmapped addresses like msync() does.

This fixes a SIGSEGV in libunwind's access_mem() when leak detector is enabled:

(gdb) bt
#0  page_fault (ef=0xffffc0003ffe7008) at ../../core/mmu.cc:871
#1  <signal handler called>
#2  ContiguousSpace::block_start_const (this=<optimized out>, p=0x77d2f3968) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:411
#3  0x00001000008ae16c in GenerationBlockStartClosure::do_space (this=0x2000001f9100, s=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:242
#4  0x00001000007f097c in DefNewGeneration::space_iterate (this=0xffffc0003fb68c00, blk=0x2000001f9100, usedOnly=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/defNewGeneration.cpp:480
#5  0x00001000008aca0e in Generation::block_start (this=<optimized out>, p=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:251
#6  0x0000100000b06d2f in os::print_location (st=st@entry=0x2000001f9560, x=32165017960, verbose=verbose@entry=false) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/runtime/os.cpp:868
#7  0x0000100000b11b5b in os::print_register_info (st=0x2000001f9560, context=0x2000001f9740) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:839
#8  0x0000100000c6cde8 in VMError::report (this=0x2000001f9610, st=st@entry=0x2000001f9560) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:551
#9  0x0000100000c6da3b in VMError::report_and_die (this=this@entry=0x2000001f9610) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:984
#10 0x0000100000b1109f in JVM_handle_linux_signal (sig=11, info=0x2000001f9bb8, ucVoid=0x2000001f9740, abort_if_unrecognized=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:528
#11 0x000000000039f242 in call_signal_handler (frame=0x2000001f9b10) at ../../arch/x64/signal.cc:69
#12 <signal handler called>
#13 0x000000000057d721 in access_mem ()
#14 0x000000000057cb1d in dwarf_get ()
#15 0x000000000057ce51 in _ULx86_64_step ()
#16 0x00000000004315fd in backtrace (buffer=0x1ff9d80 <memory::alloc_tracker::remember(void*, int)::bt>, size=20) at ../../libc/misc/backtrace.cc:16
#17 0x00000000003b8d99 in memory::alloc_tracker::remember (this=0x1777ae0 <memory::tracker>, addr=0xffffc0004508de00, size=54) at ../../core/alloctracker.cc:59
#18 0x00000000003b0504 in memory::tracker_remember (addr=0xffffc0004508de00, size=54) at ../../core/mempool.cc:43
#19 0x00000000003b2152 in std_malloc (size=54) at ../../core/mempool.cc:723
#20 0x00000000003b259c in malloc (size=54) at ../../core/mempool.cc:856
#21 0x0000100001615e4c in JNU_GetStringPlatformChars (env=env@entry=0xffffc0003a4dc1d8, jstr=jstr@entry=0xffffc0004591b800, isCopy=isCopy@entry=0x0) at ../../../src/share/native/common/jni_util.c:801
#22 0x000010000161ada6 in Java_java_io_UnixFileSystem_getBooleanAttributes0 (env=0xffffc0003a4dc1d8, this=<optimized out>, file=<optimized out>) at ../../../src/solaris/native/java/io/UnixFileSystem_md.c:111
#23 0x000020000021ed8e in ?? ()
#24 0x00002000001faa58 in ?? ()
#25 0x00002000001faac0 in ?? ()
#26 0x00002000001faa50 in ?? ()
#27 0x0000000000000000 in ?? ()

Spotted by Avi Kivity.
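For context, a minimal sketch of the behaviour the commit describes: mincore() first checks that the queried range is actually mapped, the way msync() does, and reports ENOMEM instead of faulting on unmapped addresses. This is not the actual OSv patch; range_is_mapped() is a hypothetical stand-in for the VM layer's mapping check.

```cpp
// Illustrative sketch only, not the real OSv mincore() implementation.
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for "is this address range mapped?"; real code
// would consult the vma list / page tables.
static bool range_is_mapped(const void* /*addr*/, size_t /*length*/)
{
    return true;
}

int mincore_sketch(void* addr, size_t length, unsigned char* vec)
{
    constexpr size_t page_size = 4096;
    if (reinterpret_cast<uintptr_t>(addr) % page_size) {
        errno = EINVAL;               // addr must be page aligned
        return -1;
    }
    if (!range_is_mapped(addr, length)) {
        errno = ENOMEM;               // unmapped range: report it, don't SIGSEGV
        return -1;
    }
    size_t pages = (length + page_size - 1) / page_size;
    for (size_t i = 0; i < pages; i++) {
        vec[i] = 0x01;                // simplification: report mapped pages as resident
    }
    return 0;
}
```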
penberg pushed a commit that referenced this issue on Jan 13, 2014:
See scripts/trace.py prof-wait -h

The command uses the sched_wait and sched_wait_ret tracepoints to calculate the amount of time a thread was waiting. Samples are collected and presented in the form of a call graph tree.

By default callees are closer to the root. To invert the order, pass -r|--caller-oriented.

If there is too much output, it can be narrowed down using the --max-levels and --min-duration options. The presented time range can be narrowed using the --since and --until options, which accept timestamps.

Example:

scripts/trace.py prof-wait --max-levels 3 trace-file

=== Thread 0xffffc0003eaeb010 ===

12.43 s (100.00%, #7696) All
 |-- 12.43 s (99.99%, #7658) sched::thread::do_wait_until
 |    |-- 10.47 s (84.22%, #6417) condvar::wait(lockfree::mutex*, unsigned long)
 |    |    condvar_wait
 |    |    |-- 6.47 s (52.08%, #6250) cv_timedwait
 |    |    |    txg_delay
 |    |    |    dsl_pool_tempreserve_space
 |    |    |    dsl_dir_tempreserve_space
 |    |    |    dmu_tx_try_assign
 |    |    |    dmu_tx_assign
 |    |    |
 |    |    |-- 2.37 s (19.06%, #24) arc_read_nolock
 |    |    |    arc_read
 |    |    |    dsl_read
 |    |    |    traverse_visitbp
 |    |    |
 |    |    |-- 911.75 ms (7.33%, #3) txg_wait_open
 |    |    |    dmu_tx_wait
 |    |    |    zfs_write
 |    |    |    vfs_file::write(uio*, int)
 |    |    |    sys_write
 |    |    |    pwritev
 |    |    |    writev
 |    |    |    __stdio_write
 |    |    |    __fwritex
 |    |    |    fwrite
 |    |    |    0x100000005a5f
 |    |    |    osv::run(std::string, int, char**, int*)

By default every thread has a separate tree, because duration is best interpreted in the context of a particular thread. There is however an option to merge samples from all threads into one tree: -m|--merge-threads. It may be useful if you want to inspect all paths going into or out of a particular function. The direction can be changed with the -r|--caller-oriented option. Function names are passed via the --function parameter.

Example: check where zfs_write() blocks:

scripts/trace.py prof-wait -rm --function=zfs_write trace-file

7.46 s (100.00%, #7314) All
 zfs_write
 |-- 6.48 s (86.85%, #6371) dmu_tx_assign
 |    |-- 6.47 s (86.75%, #6273) dmu_tx_try_assign
 |    |    dsl_dir_tempreserve_space
 |    |    |-- 6.47 s (86.75%, #6248) dsl_pool_tempreserve_space
 |    |    |    txg_delay
 |    |    |    cv_timedwait
 |    |    |    condvar_wait
 |    |    |    condvar::wait(lockfree::mutex*, unsigned long)
 |    |    |    sched::thread::do_wait_until
 |    |    |
 |    |    |-- 87.87 us (0.00%, #24) mutex_lock
 |    |    |    sched::thread::do_wait_until
 |    |    |
 |    |    \-- 6.40 us (0.00%, #1) dsl_dir_tempreserve_impl
 |    |         mutex_lock
 |    |         sched::thread::do_wait_until
 |    |
 |    \-- 7.32 ms (0.10%, #98) mutex_lock
 |         sched::thread::do_wait_until
 |
 |-- 911.75 ms (12.22%, #3) dmu_tx_wait
 |    txg_wait_open
 |    condvar_wait
 |    condvar::wait(lockfree::mutex*, unsigned long)
 |    sched::thread::do_wait_until

Signed-off-by: Tomasz Grabiec <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
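The actual tool is scripts/trace.py (Python); the C++ below is only an illustrative sketch of the pairing and aggregation logic the commit message describes, with made-up Sample and Node types.

```cpp
// Sketch: pair each sched_wait with the following sched_wait_ret of the same
// thread, and accumulate the wait duration along the sample's call stack.
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Sample {
    uint64_t thread_id;
    uint64_t timestamp_ns;
    bool is_wait;                    // true: sched_wait, false: sched_wait_ret
    std::vector<std::string> stack;  // symbolized backtrace, innermost frame first
};

struct Node {
    uint64_t total_ns = 0;           // accumulated wait time under this frame
    uint64_t count = 0;              // number of wait samples
    std::map<std::string, Node> children;
};

// Walking the stack innermost-to-outermost puts callees (e.g.
// sched::thread::do_wait_until) closest to the root, matching the default
// callee-oriented output shown above.
void accumulate(Node& root, const std::vector<std::string>& stack, uint64_t duration_ns)
{
    root.total_ns += duration_ns;
    root.count++;
    Node* n = &root;
    for (const std::string& frame : stack) {
        n = &n->children[frame];
        n->total_ns += duration_ns;
        n->count++;
    }
}

void profile_waits(const std::vector<Sample>& trace, Node& root)
{
    std::map<uint64_t, const Sample*> pending;   // thread id -> open sched_wait
    for (const Sample& s : trace) {
        if (s.is_wait) {
            pending[s.thread_id] = &s;
        } else {
            auto it = pending.find(s.thread_id);
            if (it != pending.end()) {
                accumulate(root, it->second->stack,
                           s.timestamp_ns - it->second->timestamp_ns);
                pending.erase(it);
            }
        }
    }
}
```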
The existing condvar_wake_one()/all() code wakes up one/all of the threads sleeping on the condition variable, and each woken thread then tries to take the user mutex associated with the condvar.
This creates two performance problems: first, the waker typically still holds the mutex when it calls wake, so a woken thread usually just blocks again on the mutex right away, wasting a wakeup and a pair of context switches; second, with condvar_wake_all() all the waiters wake up and contend for the same mutex even though only one of them can acquire it (a thundering herd).
Both problems can be solved by a technique called wait morphing: when the condvar has an associated mutex, condvar_wake_*() does not wake the waiting threads; instead it moves them from waiting on the condition variable to waiting on the mutex. Each of these threads is later woken (separately) when the mutex becomes available to it, and at that point it does not need to take the mutex, because it has already been acquired on its behalf.
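A minimal sketch of the idea follows; the types and names are illustrative, not OSv's actual condvar/mutex code. The key point is that wake_one()/wake_all() transfer waiters to the mutex's wait queue whenever the mutex is still held, instead of making them runnable.

```cpp
// Illustrative sketch of wait morphing, not OSv's real implementation.
#include <deque>

struct waiter {
    // A real implementation would hold a thread handle here.
    void wake() { /* make the blocked thread runnable again */ }
};

struct wait_queue {
    std::deque<waiter*> q;
    void push(waiter* w) { q.push_back(w); }
    waiter* pop() {
        if (q.empty()) return nullptr;
        waiter* w = q.front();
        q.pop_front();
        return w;
    }
};

struct mutex_t {
    bool locked = false;
    wait_queue waiters;   // threads blocked on this mutex; unlock() hands
                          // ownership to the first of them and wakes it
};

struct condvar {
    wait_queue waiters;
    mutex_t* user_mutex = nullptr;   // mutex associated with this condvar

    void wake_one() {
        waiter* w = waiters.pop();
        if (!w) return;
        if (user_mutex && user_mutex->locked) {
            // Wait morphing: the mutex is still held (typically by the
            // waker), so don't wake the thread only to have it block again.
            // Park it on the mutex's wait queue; unlock() will wake it later,
            // with the mutex already owned by it.
            user_mutex->waiters.push(w);
        } else {
            w->wake();   // mutex is free: wake normally
        }
    }

    void wake_all() {
        // Same idea applied to every waiter, so there is no thundering herd.
        while (waiter* w = waiters.pop()) {
            if (user_mutex && user_mutex->locked) {
                user_mutex->waiters.push(w);
            } else {
                w->wake();
            }
        }
    }
};
```

In this sketch the mutex's unlock() would pop one parked waiter, transfer ownership to it, and wake it, so by the time the thread runs again it already owns the mutex.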
It is not clear that this issue actually has a lot of performance impact on any of our use cases (once the scheduler bugs are eliminated), but it seems there's popular demand for solving it.