Watchman sometimes hangs indefinitely on OS X #96

Closed
dturner-tw opened this issue Apr 22, 2015 · 8 comments

@dturner-tw
Contributor

We are using ed148a4.

When this happens, we kill it after a few minutes.

Here are some thread traces:

(lldb) bt all
* thread #1: tid = 0x15b834, 0x00007fff883fd94c libsystem_kernel.dylib`poll + 12, queue = 'com.apple.main-thread', stop reason = instruction step into
  * frame #0: 0x00007fff883fd94c libsystem_kernel.dylib`poll + 12
    frame #1: 0x0000000108b621ee watchman`w_start_listener + 1470
    frame #2: 0x0000000108b633e8 watchman`run_service + 136
    frame #3: 0x0000000108b632b0 watchman`main + 2192
    frame #4: 0x00007fff80b385fd libdyld.dylib`start + 1

  thread #2: tid = 0x15b836, 0x00007fff883fca3a libsystem_kernel.dylib`__semwait_signal + 10
    frame #0: 0x00007fff883fca3a libsystem_kernel.dylib`__semwait_signal + 10
    frame #1: 0x00007fff8bedbdc0 libsystem_c.dylib`nanosleep + 200
    frame #2: 0x00007fff8bedbcb2 libsystem_c.dylib`usleep + 54
    frame #3: 0x0000000108b624ea watchman`child_reaper + 58
    frame #4: 0x00007fff8a514899 libsystem_pthread.dylib`_pthread_body + 138
    frame #5: 0x00007fff8a51472a libsystem_pthread.dylib`_pthread_start + 137
    frame #6: 0x00007fff8a518fc9 libsystem_pthread.dylib`thread_start + 13

  thread #3: tid = 0x15e8ff, 0x0000000108b6835f watchman`w_string_equal + 47
    frame #0: 0x0000000108b6835f watchman`w_string_equal + 47
    frame #1: 0x0000000108b59e37 watchman`w_ht_del + 119
    frame #2: 0x0000000108b67a40 watchman`age_out_dir + 144
    frame #3: 0x0000000108b67a28 watchman`age_out_dir + 120
    frame #4: 0x0000000108b65738 watchman`age_out_file + 248
    frame #5: 0x0000000108b679ee watchman`age_out_dir + 62
    frame #6: 0x0000000108b65738 watchman`age_out_file + 248
    frame #7: 0x0000000108b655ae watchman`w_root_perform_age_out + 126
    frame #8: 0x0000000108b6754a watchman`run_notify_thread + 1450
    frame #9: 0x00007fff8a514899 libsystem_pthread.dylib`_pthread_body + 138
    frame #10: 0x00007fff8a51472a libsystem_pthread.dylib`_pthread_start + 137
    frame #11: 0x00007fff8a518fc9 libsystem_pthread.dylib`thread_start + 13

  thread #4: tid = 0x15e900, 0x00007fff883f8a1a libsystem_kernel.dylib`mach_msg_trap + 10
    frame #0: 0x00007fff883f8a1a libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x00007fff883f7d18 libsystem_kernel.dylib`mach_msg + 64
    frame #2: 0x00007fff8e144f15 CoreFoundation`__CFRunLoopServiceMachPort + 181
    frame #3: 0x00007fff8e144539 CoreFoundation`__CFRunLoopRun + 1161
    frame #4: 0x00007fff8e143e75 CoreFoundation`CFRunLoopRunSpecific + 309
    frame #5: 0x00007fff8e1f9811 CoreFoundation`CFRunLoopRun + 97
    frame #6: 0x0000000108b5ffbe watchman`fsevents_thread + 398
    frame #7: 0x00007fff8a514899 libsystem_pthread.dylib`_pthread_body + 138
    frame #8: 0x00007fff8a51472a libsystem_pthread.dylib`_pthread_start + 137
    frame #9: 0x00007fff8a518fc9 libsystem_pthread.dylib`thread_start + 13

  thread #5: tid = 0x15e901, 0x00007fff883fd662 libsystem_kernel.dylib`kevent64 + 10, queue = 'com.apple.libdispatch-manager'
    frame #0: 0x00007fff883fd662 libsystem_kernel.dylib`kevent64 + 10
    frame #1: 0x00007fff882d9421 libdispatch.dylib`_dispatch_mgr_invoke + 239
    frame #2: 0x00007fff882d9136 libdispatch.dylib`_dispatch_mgr_thread + 52

  thread #6: tid = 0x1c3057, 0x00007fff883fc746 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #0: 0x00007fff883fc746 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff8a517779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
    frame #2: 0x0000000108b63656 watchman`w_root_lock + 22
    frame #3: 0x0000000108b5c54b watchman`cmd_watch + 75
    frame #4: 0x0000000108b5af98 watchman`dispatch_command + 72
    frame #5: 0x0000000108b6264c watchman`client_thread + 316
    frame #6: 0x00007fff8a514899 libsystem_pthread.dylib`_pthread_body + 138
    frame #7: 0x00007fff8a51472a libsystem_pthread.dylib`_pthread_start + 137
    frame #8: 0x00007fff8a518fc9 libsystem_pthread.dylib`thread_start + 13
(lldb)

From another user:

(lldb) bt all
thread #1: tid = 0xf8a67, 0x00007fff9330094a libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00007fff9330094a libsystem_kernel.dylib`poll + 10
frame #1: 0x00000001043951ee watchman`w_start_listener + 1470
frame #2: 0x00000001043963e8 watchman`run_service + 136
frame #3: 0x00000001043962b0 watchman`main + 2192
frame #4: 0x00007fff88fdd5fd libdyld.dylib`start + 1
frame #5: 0x00007fff88fdd5fd libdyld.dylib`start + 1
thread #2: tid = 0xf8a69, 0x00007fff932ffa3a libsystem_kernel.dylib`__semwait_signal + 10
frame #0: 0x00007fff932ffa3a libsystem_kernel.dylib`__semwait_signal + 10
frame #1: 0x00007fff89c08dc0 libsystem_c.dylib`nanosleep + 200
frame #2: 0x00007fff89c08cb2 libsystem_c.dylib`usleep + 54
frame #3: 0x00000001043954ea watchman`child_reaper + 58
frame #4: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #5: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #6: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #3: tid = 0x16a58c, 0x000000010438cdc0 watchman`w_ht_del
frame #0: 0x000000010438cdc0 watchman`w_ht_del
frame #1: 0x000000010439aa40 watchman`age_out_dir + 144
frame #2: 0x000000010439aa28 watchman`age_out_dir + 120
frame #3: 0x0000000104398738 watchman`age_out_file + 248
frame #4: 0x000000010439a9ee watchman`age_out_dir + 62
frame #5: 0x0000000104398738 watchman`age_out_file + 248
frame #6: 0x000000010439a9ee watchman`age_out_dir + 62
frame #7: 0x0000000104398738 watchman`age_out_file + 248
frame #8: 0x000000010439a9ee watchman`age_out_dir + 62
frame #9: 0x0000000104398738 watchman`age_out_file + 248
frame #10: 0x000000010439a9ee watchman`age_out_dir + 62
frame #11: 0x0000000104398738 watchman`age_out_file + 248
frame #12: 0x000000010439a9ee watchman`age_out_dir + 62
frame #13: 0x0000000104398738 watchman`age_out_file + 248
frame #14: 0x000000010439a9ee watchman`age_out_dir + 62
frame #15: 0x0000000104398738 watchman`age_out_file + 248
frame #16: 0x000000010439a9ee watchman`age_out_dir + 62
frame #17: 0x0000000104398738 watchman`age_out_file + 248
frame #18: 0x00000001043985ae watchman`w_root_perform_age_out + 126
frame #19: 0x000000010439a54a watchman`run_notify_thread + 1450
frame #20: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #21: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #22: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #4: tid = 0x16a58d, 0x00007fff932fba1a libsystem_kernel.dylib`mach_msg_trap + 10
frame #0: 0x00007fff932fba1a libsystem_kernel.dylib`mach_msg_trap + 10
frame #1: 0x00007fff932fad18 libsystem_kernel.dylib`mach_msg + 64
frame #2: 0x00007fff88921f15 CoreFoundation`__CFRunLoopServiceMachPort + 181
frame #3: 0x00007fff88921539 CoreFoundation`__CFRunLoopRun + 1161
frame #4: 0x00007fff88920e75 CoreFoundation`CFRunLoopRunSpecific + 309
frame #5: 0x00007fff889d6811 CoreFoundation`CFRunLoopRun + 97
frame #6: 0x0000000104392fbe watchman`fsevents_thread + 398
frame #7: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #8: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #9: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #5: tid = 0x16a58e, 0x00007fff93300662 libsystem_kernel.dylib`kevent64 + 10, queue = 'com.apple.libdispatch-manager'
frame #0: 0x00007fff93300662 libsystem_kernel.dylib`kevent64 + 10
frame #1: 0x00007fff94c0c421 libdispatch.dylib`_dispatch_mgr_invoke + 239
frame #2: 0x00007fff94c0c136 libdispatch.dylib`_dispatch_mgr_thread + 52
thread #6: tid = 0x1b2e38, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #7: tid = 0x1b3e3f, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #8: tid = 0x1b3ec6, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #9: tid = 0x1b99c7, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #10: tid = 0x1b9c27, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #11: tid = 0x1ba060, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #12: tid = 0x1ba0d5, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #13: tid = 0x1ba581, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #14: tid = 0x1baae9, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #15: tid = 0x1bac1e, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #16: tid = 0x1bac60, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #17: tid = 0x1bb954, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #18: tid = 0x1be637, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #19: tid = 0x1be9da, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #20: tid = 0x1bedc0, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
thread #21: tid = 0x1bee3d, 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #0: 0x00007fff932ff746 libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff8aad2779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
frame #2: 0x0000000104396656 watchman`w_root_lock + 22
frame #3: 0x000000010438f54b watchman`cmd_watch + 75
frame #4: 0x000000010438df98 watchman`dispatch_command + 72
frame #5: 0x000000010439564c watchman`client_thread + 316
frame #6: 0x00007fff8aacf899 libsystem_pthread.dylib`_pthread_body + 138
frame #7: 0x00007fff8aacf72a libsystem_pthread.dylib`_pthread_start + 137
frame #8: 0x00007fff8aad3fc9 libsystem_pthread.dylib`thread_start + 13
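
One way to read dumps like the two above is to tally each thread by its innermost frame inside the `watchman` binary. The following is a minimal sketch in Python, assuming the `bt all` text layout shown here; `summarize_backtrace` is a hypothetical helper, not part of watchman or lldb.

```python
import re
from collections import Counter

# Matches "thread #N" header lines, with or without lldb's leading "*".
THREAD_RE = re.compile(r"^\*?\s*thread #(\d+)")
# Matches "frame #N: 0xADDR module`function" and captures module and function.
FRAME_RE = re.compile(r"frame #\d+: 0x[0-9a-f]+ (\S+?)`(\S+)")


def summarize_backtrace(text):
    """Count threads by their innermost frame inside the watchman binary."""
    counts = Counter()
    recorded = True  # True until the first thread header is seen
    for raw in text.splitlines():
        line = raw.strip()
        if THREAD_RE.match(line):
            recorded = False  # a new thread starts; nothing recorded for it yet
            continue
        m = FRAME_RE.search(line)
        if m and not recorded and m.group(1) == "watchman":
            counts[m.group(2)] += 1
            recorded = True  # ignore the outer frames of this thread
    return counts
```

Run over the second dump, a tally like this would put the bulk of the threads under `w_root_lock` and a single thread under `w_ht_del`, which is consistent with one busy thread in the age-out path while clients wait for the root lock.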

I have also seen hangs on Linux, but there I am running a kernel (3.13) with unpatched inotify bugs that I know affect watchman, so I have not investigated further.

@wez
Contributor

wez commented Apr 22, 2015

Is it hanging or spinning?
If the latter, it's because we traded the crash bug for a perf bug, and you can work around it by setting your /etc/watchman.json to:

{"gc_interval_seconds": 0}

or by changing the default in the source if that is easier to deploy:

watchman.h:#define DEFAULT_GC_INTERVAL 86400

Also, facepalm moment: I recommend that you upgrade your OS X build to at least 81507da.

@dturner-tw
Contributor Author

I haven't done a full check, but some people claim it is using 100% CPU (the rest did not mention high CPU). So I guess that's spinning.

@sahrens

sahrens commented Apr 24, 2015

We might be hitting this too, causing sporadic failures of our react-native tests - see the error near the end of the output here: https://travis-ci.org/facebook/react-native/jobs/59927075

@wez
Contributor

wez commented Apr 26, 2015

I just put https://reviews.facebook.net/D37683 up for review to tackle this issue.

@sahrens The age-out code only triggers once watchman has been running with an active watch that has had files deleted 1 day prior. I'm doubtful that this is impacting your Travis setup. I'll post some suggestions on facebook/react-native#239.

wez added a commit that referenced this issue Apr 26, 2015
Summary:
looking at #96 (comment)

we see a trace like:

```
thread #3: tid = 0x16a58c, 0x000000010438cdc0 watchman`w_ht_del
frame #0: 0x000000010438cdc0 watchman`w_ht_del
frame #1: 0x000000010439aa40 watchman`age_out_dir + 144
frame #2: 0x000000010439aa28 watchman`age_out_dir + 120
frame #3: 0x0000000104398738 watchman`age_out_file + 248
frame #4: 0x000000010439a9ee watchman`age_out_dir + 62
frame #5: 0x0000000104398738 watchman`age_out_file + 248
frame #6: 0x000000010439a9ee watchman`age_out_dir + 62
frame #7: 0x0000000104398738 watchman`age_out_file + 248
frame #8: 0x000000010439a9ee watchman`age_out_dir + 62
frame #9: 0x0000000104398738 watchman`age_out_file + 248
frame #10: 0x000000010439a9ee watchman`age_out_dir + 62
frame #11: 0x0000000104398738 watchman`age_out_file + 248
frame #12: 0x000000010439a9ee watchman`age_out_dir + 62
frame #13: 0x0000000104398738 watchman`age_out_file + 248
frame #14: 0x000000010439a9ee watchman`age_out_dir + 62
frame #15: 0x0000000104398738 watchman`age_out_file + 248
frame #16: 0x000000010439a9ee watchman`age_out_dir + 62
frame #17: 0x0000000104398738 watchman`age_out_file + 248
```

In cases where we have a large number of files that are eligible to age
out across a reasonable number of dirs, we can walk the same portions of
the tree we maintain in memory multiple times.

This diff records the names of the dir nodes for which we would previously
recursively call age_out_dir(); we use a hash to de-duplicate the names.

After we have walked all eligible file nodes we then iterate the list
of saved dirs and delete them.

This reduces the complexity of age out processing to a single scan of the file
node list followed by a single scan of the list of dirs that we accumulated in
the file node scan.
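
As an illustration only, the two-pass idea described above might be sketched as follows. This is a toy model in Python, not the actual diff (the real change is in watchman's C source); the function name `age_out`, the shape of `files` and `dirs`, and the `min_age` parameter are all assumptions made for the example.

```python
def age_out(files, dirs, now, min_age):
    """Toy two-pass age-out.

    files: dict mapping path -> (exists, last_change_time)  (hypothetical shape)
    dirs:  set of paths that also have an in-memory directory node
    Returns the set of directory paths that were removed.
    """
    pending_dirs = set()  # the hash set de-duplicates directory names

    # Pass 1: a single scan of the file nodes.
    for path in list(files):
        exists, changed_at = files[path]
        if exists or now - changed_at < min_age:
            continue  # still present or too recent: not eligible
        del files[path]
        if path in dirs:
            # The old approach recursed into the directory at this point;
            # here we only remember its name for the second pass.
            pending_dirs.add(path)

    # Pass 2: a single scan of the directories collected above.
    for d in pending_dirs:
        dirs.discard(d)
    return pending_dirs
```

Because each directory name is stored once, no part of the tree is visited again for every eligible file beneath it.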

Test Plan:
`arc unit tests/integration/age.php` exercises this specifically. Also: `make integration`.

a more rigorous test on our www repo:

```
hg up -C master~40000 ; sleep 10 ; hg up -C master ; sleep 10 ; time watchman debug-ageout . 1
hg up -C master~40000 ; sleep 10 ; hg up -C master ; sleep 10 ; time watchman debug-ageout . 1
```

(It's important to run it twice in succession because we had a
long-standing bug that we hadn't noticed until now!)

The time portion of this outputs:

```
watchman debug-ageout . 1  0.00s user 0.00s system 2% cpu 0.157 total
```

which is a good bit better than 20+ minutes.

Refs: #96

Reviewers: sid0

Reviewed By: sid0

Differential Revision: https://reviews.facebook.net/D37683
wez added a commit that referenced this issue Apr 26, 2015
Summary: I didn't catch this last night because I was focused mostly on
Linux. On case-insensitive filesystems we also need to clean up the
lc_files linkage to avoid a crash during age out.

The multiple attempts I added to the integration test trigger this
issue: `arc unit tests/integration/age.php`

Refs: #96
@wez
Contributor

wez commented Jun 10, 2015

Fixed a while back; please re-open if this comes back!

@wez wez closed this as completed Jun 10, 2015
@jtsom

jtsom commented Jun 17, 2015

This happened to me just yesterday (6/16) on OS X 10.10.3 while trying to run an ember app with ember-cli. I had to kill the watchman process. Ref: ember-cli/ember-cli#4273 (comment)

@NathanJang

Same as @jtsom, but after upgrading to El Capitan. It worked after reinstalling Watchman.

@wez
Contributor

wez commented Jun 17, 2015

Please don't pile on an old closed watchman issue. Open a new one with information that can help diagnose and debug it, thanks!

@facebook facebook locked and limited conversation to collaborators Jun 17, 2015