CPU stall / hung - only able to clear via hard power cycle #557

machtech · 2012-02-03T14:02:20Z

G'day ZFS on Linux community.

Running ZFS rc6 from PPA on Ubuntu 11.10 x64. Really hoping this can be solved when rc7 is achieved please. Any debugging or assistance we can provide to the devs/project with our 8x SANs will be provided freely.........

When moderate activity placed on file services presented via ZFS, will cause a "rcu_sched_state detected stall on CPU" error to be logged by the kernel, and / or the error logged being "hung_task_timeout_secs". See bottom of this post for relevant logging extracts, where despite all the great work being done by the devs/project - the issue has remained outstanding for the past 2+months. Issue routinely experienced across a diverse range of physical hardware from $2k through to $100k kit.

The net effect is continually escalating high load average will occur, climbing up to as high as 1000 load average value seen if left to run for hours. If entering "reboot" command the server will never actually reboot - the only action to clear the condition is to hard power cycle.

We have been able to reduce the frequency of these attacks from a few times a day to a few times a month, by limiting the size of "zfs_arc_max" to 1/4 of available RAM.

This is a very serious issue for us - as it is a deal-breaking stability issue that is preventing more wide scale and more serious production rollout.

I will take a moment to celebrate the ZFS on Linux's ability to maintain data integrity in the face of this issue - there are very few SAN systems that can routinely have the power literally hard cycled as much as we have done, and NEVER lose a single file or corrupt a single file. Utterly impressive :-)

Very much looking forward to solving this stability issue as after months of trying every possible suggestion in the newsgroups, we have had not found a viable workaround.

Cheers

Paul

root@gsan1-coy:~# cat /var/log/syslog | grep -e hung
Nov 24 17:15:23 gsan1-coy kernel: [232737.512137] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 17:17:24 gsan1-coy kernel: [232857.504204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 17:19:24 gsan1-coy kernel: [232977.495599] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 17:21:24 gsan1-coy kernel: [233097.487354] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 17:27:24 gsan1-coy kernel: [233457.413291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 17:43:22 gsan1-coy kernel: [ 361.018833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:08:18 gsan1-coy kernel: [ 361.024042] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:08:18 gsan1-coy kernel: [ 361.051342] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:42:29 gsan1-coy kernel: [ 241.052531] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:44:29 gsan1-coy kernel: [ 361.045481] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:56:33 gsan1-coy kernel: [ 1077.001800] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:58:28 gsan1-coy kernel: [ 1196.981191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 18:58:28 gsan1-coy kernel: [ 1196.996135] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 19:00:25 gsan1-coy kernel: [ 1316.976739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 19:00:26 gsan1-coy kernel: [ 1316.994442] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 19:02:25 gsan1-coy kernel: [ 1436.982179] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 19:16:25 gsan1-coy kernel: [ 2276.771722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 19:29:28 gsan1-coy kernel: [ 3059.941819] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

root@gsan1-coy:~# cat /var/log/syslog | grep stall
Nov 24 18:54:25 gsan1-coy kernel: [ 432.316135] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 432.324129] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 2} (detected by 7, t=15002 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229472] INFO: rcu_sched_state detected stall on CPU 6 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229476] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229480] INFO: rcu_sched_state detected stall on CPU 7 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229485] INFO: rcu_sched_state detected stall on CPU 5 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229489] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229494] INFO: rcu_sched_state detected stall on CPU 4 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 575.229497] INFO: rcu_sched_state detected stall on CPU 3 (t=15000 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300583] INFO: rcu_sched_state detected stall on CPU 6 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300589] INFO: rcu_sched_state detected stall on CPU 5 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300593] INFO: rcu_sched_state detected stall on CPU 7 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300597] INFO: rcu_sched_state detected stall on CPU 3 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300601] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300604] INFO: rcu_sched_state detected stall on CPU 1 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 755.300609] INFO: rcu_sched_state detected stall on CPU 4 (t=60030 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371591] INFO: rcu_sched_state detected stall on CPU 5 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371596] INFO: rcu_sched_state detected stall on CPU 6 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371601] INFO: rcu_sched_state detected stall on CPU 7 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371605] INFO: rcu_sched_state detected stall on CPU 1 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371608] INFO: rcu_sched_state detected stall on CPU 2 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371612] INFO: rcu_sched_state detected stall on CPU 3 (t=105060 jiffies)
Nov 24 18:54:25 gsan1-coy kernel: [ 935.371616] INFO: rcu_sched_state detected stall on CPU 4 (t=105060 jiffies)
Nov 24 19:25:28 gsan1-coy kernel: [ 2562.893218] INFO: rcu_sched_state detected stall on CPU 6 (t=15000 jiffies)
Nov 24 19:25:28 gsan1-coy kernel: [ 2562.901208] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 6} (detected by 5, t=15002 jiffies)
Nov 24 19:25:28 gsan1-coy kernel: [ 2742.970645] INFO: rcu_sched_state detected stall on CPU 6 (t=60032 jiffies)
Nov 24 19:25:28 gsan1-coy kernel: [ 2742.978638] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 6} (detected by 5, t=60034 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369340] INFO: rcu_sched_state detected stall on CPU 7 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369345] INFO: rcu_sched_state detected stall on CPU 5 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369349] INFO: rcu_sched_state detected stall on CPU 6 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369353] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369357] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369362] INFO: rcu_sched_state detected stall on CPU 4 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369365] INFO: rcu_sched_state detected stall on CPU 3 (t=15000 jiffies)
Nov 24 19:38:48 gsan1-coy kernel: [ 3196.369368] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440249] INFO: rcu_sched_state detected stall on CPU 6 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440255] INFO: rcu_sched_state detected stall on CPU 7 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440259] INFO: rcu_sched_state detected stall on CPU 5 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440263] INFO: rcu_sched_state detected stall on CPU 1 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440268] INFO: rcu_sched_state detected stall on CPU 4 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440271] INFO: rcu_sched_state detected stall on CPU 3 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440275] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3376.440278] INFO: rcu_sched_state detected stall on CPU 0 (t=60030 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510874] INFO: rcu_sched_state detected stall on CPU 7 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510879] INFO: rcu_sched_state detected stall on CPU 6 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510884] INFO: rcu_sched_state detected stall on CPU 5 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510888] INFO: rcu_sched_state detected stall on CPU 3 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510892] INFO: rcu_sched_state detected stall on CPU 1 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510896] INFO: rcu_sched_state detected stall on CPU 4 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510899] INFO: rcu_sched_state detected stall on CPU 2 (t=105060 jiffies)
Nov 24 19:39:04 gsan1-coy kernel: [ 3556.510903] INFO: rcu_sched_state detected stall on CPU 0 (t=105060 jiffies)
Nov 24 20:21:30 gsan1-coy kernel: [ 5914.420208] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
Nov 24 20:21:30 gsan1-coy kernel: [ 6094.494578] INFO: rcu_sched_state detected stall on CPU 1 (t=60031 jiffies)
Nov 24 22:13:27 gsan1-coy kernel: [12757.532111] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 24 22:13:27 gsan1-coy kernel: [12757.540104] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 2} (detected by 7, t=15002 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405601] INFO: rcu_sched_state detected stall on CPU 7 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405606] INFO: rcu_sched_state detected stall on CPU 6 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405611] INFO: rcu_sched_state detected stall on CPU 5 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405614] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405618] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405622] INFO: rcu_sched_state detected stall on CPU 3 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405625] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
Nov 24 22:47:10 gsan1-coy kernel: [14827.405629] INFO: rcu_sched_state detected stall on CPU 4 (t=15000 jiffies)
Nov 25 00:19:28 gsan1-coy kernel: [20240.935394] INFO: rcu_sched_state detected stall on CPU 5 (t=15000 jiffies)
Nov 25 00:19:28 gsan1-coy kernel: [20421.005987] INFO: rcu_sched_state detected stall on CPU 5 (t=60030 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081850] INFO: rcu_sched_state detected stall on CPU 5 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081855] INFO: rcu_sched_state detected stall on CPU 6 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081860] INFO: rcu_sched_state detected stall on CPU 7 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081864] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081868] INFO: rcu_sched_state detected stall on CPU 3 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081872] INFO: rcu_sched_state detected stall on CPU 4 (t=15000 jiffies)
Nov 25 00:26:41 gsan1-coy kernel: [20888.081875] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152456] INFO: rcu_sched_state detected stall on CPU 6 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152461] INFO: rcu_sched_state detected stall on CPU 5 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152466] INFO: rcu_sched_state detected stall on CPU 7 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152470] INFO: rcu_sched_state detected stall on CPU 4 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152474] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152477] INFO: rcu_sched_state detected stall on CPU 1 (t=60030 jiffies)
Nov 25 00:29:42 gsan1-coy kernel: [21068.152481] INFO: rcu_sched_state detected stall on CPU 3 (t=60030 jiffies)

chrisrd · 2012-02-10T07:33:48Z

G'day,

"Me too". linux-3.1.10, openzfs/spl@3c6ed54, b4b599d

I just got these for the first time today, after 8 days of uptime with a constant inbound rsync load:

[697058.832119] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[697236.655477] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[697393.897927] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[697631.698675] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[697811.546972] INFO: rcu_sched_state detected stall on CPU 0 (t=60030 jiffies)
[697889.109795] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[698068.958092] INFO: rcu_sched_state detected stall on CPU 0 (t=60030 jiffies)
[698248.806380] INFO: rcu_sched_state detected stall on CPU 0 (t=105060 jiffies)
[698355.489202] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[698535.337484] INFO: rcu_sched_state detected stall on CPU 0 (t=60030 jiffies)
[698666.071968] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)

...and the machine required a hard reset.

The above is a direct cut-n-paste from kern.log, i.e. there weren't any stack traces associated with these stalls that might help diagnose where they came from (Documentation/RCU/stallwarn.txt). The kernel seems to have stack tracing enabled:

$ grep STACKTRACE .config
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_STACKTRACE=y
CONFIG_USER_STACKTRACE_SUPPORT=y

Is there something I need to enable to get a stack trace with the stall warning?

Paul, are there any stack traces associated with your stalls?

In contrast to Paul, I don't have any hung task messages since the machine booted.

Chris

machtech · 2012-02-10T09:29:32Z

G'day Chris,

Sorry to hear you are affected also. I have not been able to pin down stack trace relevant data, but I am not an expert in that regard, so for me has been the proverbial "needle in the haystack".

I am hoping there is a definitive reply soon on what precisely we need to do to give the dev's the data they need to debug/nail this issue down.

Cheers

Paul

chrisrd · 2012-02-11T01:34:14Z

Hi Paul,

If you have stack traces they should be in your log just after the related "detected stall" line, something like... oh, let me see, I think I might have one or two around... :-) :-( (note: this one's unrelated to the "detected stall" stuff):

Feb  8 13:11:27 b2 kernel: [ 4069.503225] BUG: unable to handle kernel NULL pointer dereference at 00000000000001c6
Feb  8 13:11:27 b2 kernel: [ 4069.503276] IP: [<ffffffff8112a270>] __blkdev_get+0x8f/0x399
Feb  8 13:11:27 b2 kernel: [ 4069.503309] PGD bdf8d9067 PUD c03a1a067 PMD 0 
Feb  8 13:11:27 b2 kernel: [ 4069.503340] Oops: 0000 [#1] SMP 
Feb  8 13:11:27 b2 kernel: [ 4069.503367] CPU 0 
Feb  8 13:11:27 b2 kernel: [ 4069.503373] Modules linked in: target_core_mod configfs fuse nfsd exportfs ext4 mbcache jbd2 crc16 zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nfs lockd auth_rpcgss nfs_acl sunrpc bridge stp llc usbhid hid sg sd_mod uhci_hcd psmouse mpt2sas scsi_transport_sas raid_class scsi_mod i2c_i801 i2c_core ioatdma ehci_hcd igb dca button processor thermal_sys
Feb  8 13:11:27 b2 kernel: [ 4069.508653] 
Feb  8 13:11:27 b2 kernel: [ 4069.508673] Pid: 12943, comm: zpool Tainted: P            3.1.10-otn-00022-gbdeb973 #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
Feb  8 13:11:27 b2 kernel: [ 4069.508729] RIP: 0010:[<ffffffff8112a270>]  [<ffffffff8112a270>] __blkdev_get+0x8f/0x399
Feb  8 13:11:27 b2 kernel: [ 4069.508778] RSP: 0018:ffff880c029adb98  EFLAGS: 00010282
Feb  8 13:11:27 b2 kernel: [ 4069.508804] RAX: ffffffffffffff86 RBX: ffff881808c0bdc0 RCX: 00000000000000ff
Feb  8 13:11:27 b2 kernel: [ 4069.508833] RDX: ffffffffffffff86 RSI: 0000000000000000 RDI: ffffffffa03ae830
Feb  8 13:11:27 b2 kernel: [ 4069.508862] RBP: ffff880c029adbf8 R08: ffff880c029adb80 R09: 0000000000000000
Feb  8 13:11:27 b2 kernel: [ 4069.508891] R10: ffff88180f001260 R11: ffff881803f95eb0 R12: 0000000000000000
Feb  8 13:11:27 b2 kernel: [ 4069.508921] R13: ffffffffffffff86 R14: 000000000000001d R15: ffff880c029adbc4
Feb  8 13:11:27 b2 kernel: [ 4069.508951] FS:  00007f6e514fbc00(0000) GS:ffff880c0fc00000(0000) knlGS:0000000000000000
Feb  8 13:11:27 b2 kernel: [ 4069.508996] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 13:11:27 b2 kernel: [ 4069.509023] CR2: 00000000000001c6 CR3: 0000000c05aad000 CR4: 00000000000006f0
Feb  8 13:11:27 b2 kernel: [ 4069.509053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb  8 13:11:27 b2 kernel: [ 4069.509082] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb  8 13:11:27 b2 kernel: [ 4069.509112] Process zpool (pid: 12943, threadinfo ffff880c029ac000, task ffff880be112d9c0)
Feb  8 13:11:27 b2 kernel: [ 4069.509156] Stack:
Feb  8 13:11:27 b2 kernel: [ 4069.509176]  ffff880c0f810c00 0000000000000000 ffff880c029adbc8 ffff881808c0bdd8
Feb  8 13:11:27 b2 kernel: [ 4069.509227]  ffff881808c0be98 000000000f810c00 ffff880c029adbe8 ffff881808c0bdc0
Feb  8 13:11:27 b2 kernel: [ 4069.509279]  0000000000000000 ffff880c05119680 0000000000000000 ffffffff8112a85c
Feb  8 13:11:27 b2 kernel: [ 4069.509330] Call Trace:
Feb  8 13:11:27 b2 kernel: [ 4069.509353]  [<ffffffff8112a85c>] ? blkdev_get+0x2e2/0x2e2
Feb  8 13:11:27 b2 kernel: [ 4069.509380]  [<ffffffff8112a751>] blkdev_get+0x1d7/0x2e2
Feb  8 13:11:27 b2 kernel: [ 4069.509407]  [<ffffffff81129a60>] ? bdget+0x11a/0x126
Feb  8 13:11:27 b2 kernel: [ 4069.509434]  [<ffffffff8112a85c>] ? blkdev_get+0x2e2/0x2e2
Feb  8 13:11:27 b2 kernel: [ 4069.509461]  [<ffffffff8112a8c5>] blkdev_open+0x69/0x6d
Feb  8 13:11:27 b2 kernel: [ 4069.509490]  [<ffffffff810febd6>] __dentry_open+0x17c/0x290
Feb  8 13:11:27 b2 kernel: [ 4069.509518]  [<ffffffff81108e89>] ? do_lookup+0x46/0x2ed
Feb  8 13:11:27 b2 kernel: [ 4069.509545]  [<ffffffff810feda7>] nameidata_to_filp+0x5b/0x62
Feb  8 13:11:27 b2 kernel: [ 4069.509573]  [<ffffffff8110bef9>] do_last+0x605/0x760
Feb  8 13:11:27 b2 kernel: [ 4069.509599]  [<ffffffff8110c11e>] path_openat+0xca/0x344
Feb  8 13:11:27 b2 kernel: [ 4069.509626]  [<ffffffff8110ce70>] ? user_path_at_empty+0x5e/0x8f
Feb  8 13:11:27 b2 kernel: [ 4069.509655]  [<ffffffff8110c487>] do_filp_open+0x38/0x84
Feb  8 13:11:27 b2 kernel: [ 4069.509683]  [<ffffffff81115e2f>] ? alloc_fd+0x76/0x11e
Feb  8 13:11:27 b2 kernel: [ 4069.509709]  [<ffffffff810fe879>] do_sys_open+0x10b/0x19d
Feb  8 13:11:27 b2 kernel: [ 4069.509736]  [<ffffffff810fe917>] sys_openat+0xc/0xe
Feb  8 13:11:27 b2 kernel: [ 4069.509764]  [<ffffffff812ce612>] system_call_fastpath+0x16/0x1b
Feb  8 13:11:27 b2 kernel: [ 4069.509791] Code: 00 85 c0 41 89 c4 0f 85 0e 03 00 00 eb c0 8b 3b 4c 89 fe e8 01 0f 05 00 48 85 c0 49 89 c5 75 0b 41 bc fa ff ff ff e9 ef 02 00 00 
Feb  8 13:11:27 b2 kernel: [ 4069.510006] RIP  [<ffffffff8112a270>] __blkdev_get+0x8f/0x399
Feb  8 13:11:27 b2 kernel: [ 4069.510036]  RSP <ffff880c029adb98>
Feb  8 13:11:27 b2 kernel: [ 4069.510058] CR2: 00000000000001c6
Feb  8 13:11:27 b2 kernel: [ 4069.510418] ---[ end trace 64e7ba432a4b756e ]---

...all the stuff between the "BUG" line and the "end trace" line should be included.

It would be interesting to get a bunch of traces for your "detected stall" instances if you have them.

Chris

chrisrd · 2012-02-11T03:41:37Z

OK, looks like I can reproduce the stalls at will by doing a getfattr -dh a_particular_symlink. Note that this filesystem is used as a gluster "brick" so all dirs and files (including symlinks) have xattrs, and at one point it was mounted xattr=sa (ref: #513 (comment)). Man, I'm lovin' those xattrs! (ref: #503, #513)

Hmmm... Paul, are you using xattrs?

I had an open window to the system which was still responsive (for a while) and I was able to get some stack traces using echo l > /proc/sysrq-trigger:

[63729.740222] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=15002 jiffies)
[63909.596496] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=60034 jiffies)
[64089.452753] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=105066 jiffies)
[64269.309020] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=150098 jiffies)
[64449.165298] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=195130 jiffies)
[64629.021552] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=240162 jiffies)
[64808.877821] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=285194 jiffies)
[64988.734134] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=330226 jiffies)
[65168.590498] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=375258 jiffies)
[65348.446856] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=420290 jiffies)
[65528.303221] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=465322 jiffies)
[65622.672282] SysRq : Show backtrace of all active CPUs
[65622.672684] CPU6:
[65622.672762]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[65622.673067]  ffff880c3fc83f60 00000000ffffffff ffff880627c12ee8 ffff880c3fc83f18
[65622.673370]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[65622.673673] Call Trace:
[65622.673751]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[65622.673898]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[65622.673990]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[65622.674109]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[65622.674200]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[65622.674290]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[65622.674385]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[65622.674475]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[65622.674617]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[65622.674705]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[65622.674795]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[65622.674883]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[65622.674974]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[65622.675067]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[65622.675159]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[65622.675250]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[65622.675339]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[65622.675426]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[65622.675522]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[65622.675614]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[65622.675740]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[65622.675834]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[65622.675931]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[65622.676053]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[65622.676167]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[65622.676261]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.676364]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[65622.676459]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[65622.676554]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.676659]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[65622.676762]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[65622.676858]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[65622.676955]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.677049]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[65622.677136]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[65622.677227]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[65622.677318]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[65708.159577] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=510354 jiffies)
[65888.015937] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=555386 jiffies)
[66023.207799] SysRq : Show backtrace of all active CPUs
[66025.212028] SysRq : Show backtrace of all active CPUs
[66025.212269] CPU0:
[66025.212332]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66025.212587]  ffff880627c03f60 00000000ffffffff ffff880627c92ee8 ffff880627c03f18
[66025.212843]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66025.213098] Call Trace:
[66025.213160]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66025.213280]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66025.213351]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66025.213437]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.213506]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.213577]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66025.213650]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66025.213719]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66025.213836]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66025.213906]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66025.213975]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66025.214047]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66025.214117]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66025.214186]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66025.214255]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66025.214324]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66025.214394]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66025.214464]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66025.214532] CPU6:
[66025.214610]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66025.214916]  ffff880c3fc83f60 00000000ffffffff ffff880627c92ee8 ffff880c3fc83f18
[66025.215219]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66025.215522] Call Trace:
[66025.215600]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66025.215739]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66025.215828]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66025.215946]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.216036]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66025.222051]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66025.222144]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66025.222235]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66025.222377]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66025.222464]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66025.222554]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66025.222642]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66025.222731]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66025.222824]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66025.222915]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66025.223006]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66025.223095]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66025.223183]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66025.223273]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66025.223365]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66025.223484]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66025.223578]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66025.223674]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66025.223797]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66025.223911]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66025.224005]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224108]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66025.224203]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66025.224298]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224401]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66025.224505]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66025.224600]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66025.224697]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224789]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66025.224877]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66025.224968]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66025.225059]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66027.213815] SysRq : Show backtrace of all active CPUs
[66027.214035] CPU0:
[66027.214097]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66027.214353]  ffff880627c03f60 00000000ffffffff ffff880627c52ee8 ffff880627c03f18
[66027.214608]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66027.214863] Call Trace:
[66027.214924]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66027.215039]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66027.215108]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66027.215194]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.215263]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.215333]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66027.215404]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66027.215473]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66027.215591]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66027.215661]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66027.215730]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66027.215801]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66027.215871]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66027.215940]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66027.216009]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66027.216077]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66027.216148]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66027.216217]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66027.216285] CPU6:
[66027.216362]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66027.216666]  ffff880c3fc83f60 00000000ffffffff ffff880627c52ee8 ffff880c3fc83f18
[66027.216969]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66027.217272] Call Trace:
[66027.217350]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66027.217489]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66027.217578]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66027.217697]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.217787]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66027.217875]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66027.217968]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66027.218059]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66027.218202]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66027.218289]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66027.218380]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66027.218467]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66027.218557]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66027.218649]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66027.218740]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66027.218831]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66027.218920]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66027.219008]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66027.219098]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66027.219189]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66027.219306]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66027.219401]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66027.219497]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66027.219619]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66027.219734]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66027.219828]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.219931]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66027.220026]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66027.220121]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.220224]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66027.220328]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66027.220423]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66027.220519]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.220611]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66027.220699]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66027.220789]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66027.220880]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66029.215880] SysRq : Show backtrace of all active CPUs
[66031.216303] SysRq : Show backtrace of all active CPUs
[66031.216526] CPU0:
[66031.216588]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66031.216844]  ffff880627c03f60 00000000ffffffff ffff880627cd2ee8 ffff880627c03f18
[66031.217099]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66031.217354] Call Trace:
[66031.217416]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.217531]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.217600]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.217686]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.217755]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.217825]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.217896]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.217965]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66031.218082]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66031.218152]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.218222]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66031.218292]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66031.218362]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66031.218431]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66031.218501]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.218569]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.218639]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.218709]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66031.218776] CPU1:
[66031.218837]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66031.219089]  ffff880627c43f60 00000000ffffffff ffff880627cd2ee8 ffff880627c43f18
[66031.219341]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66031.219593] Call Trace:
[66031.219654]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.219769]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.219837]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.219924]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.219994]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.220062]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.220131]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.220199]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66031.220314]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66031.220382]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.220450]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66031.220519]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66031.220587]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66031.220655]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66031.220722]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.220789]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.220857]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.220925]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66031.220992] CPU6:
[66031.221069]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66031.221372]  ffff880c3fc83f60 00000000ffffffff ffff880627cd2ee8 ffff880c3fc83f18
[66031.221675]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66031.221977] Call Trace:
[66031.222055]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.222194]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.222282]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.222402]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.222492]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66031.222579]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.222672]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.228680]  <EOI>  [<ffffffff81079b16>] ? smp_call_function_many+0x1e4/0x1f5
[66031.228822]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66031.228910]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.229000]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66031.229088]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66031.229177]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66031.229269]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66031.229361]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66031.229452]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66031.229542]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66031.229630]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66031.229720]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66031.229812]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66031.229929]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66031.230024]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66031.230120]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66031.230242]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66031.230357]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66031.230451]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.230554]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66031.230649]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66031.230744]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.230847]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66031.230951]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66031.231046]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66031.231142]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.231234]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.231322]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.231413]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.231503]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66033.218245] SysRq : Show backtrace of all active CPUs
[66035.218500] SysRq : Show backtrace of all active CPUs
[66035.218742] CPU0:
[66035.218803]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66035.219059]  ffff880627c03f60 00000000ffffffff ffff880627d52ee8 ffff880627c03f18
[66035.219314]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66035.219570] Call Trace:
[66035.219631]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.219746]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.219815]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.219901]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.219970]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.220039]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.220110]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.220179]  <EOI>  [<ffffffff81079a3e>] ? smp_call_function_many+0x10c/0x1f5
[66035.220296]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.220366]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.220435]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.220506]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.220576]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.220645]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.220714]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.220782]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.220853]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.220922]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.220990] CPU1:
[66035.221051]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66035.221303]  ffff880627c43f60 00000000ffffffff ffff880627d52ee8 ffff880627c43f18
[66035.221555]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66035.221815] Call Trace:
[66035.221875]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.221988]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.222056]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.222140]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.222208]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.222276]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.222346]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.222414]  <EOI>  [<ffffffff81079a3e>] ? smp_call_function_many+0x10c/0x1f5
[66035.222528]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.222596]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.222664]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.222733]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.222801]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.222869]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.222936]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.223003]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.223071]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.223139]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.223206] CPU3:
[66035.223267]  ffff880627cc3ed0 ffff880627cc3ee8 0000000000000046 0000000000000003
[66035.223520]  ffff880627cc3f60 00000000ffffffff ffff880627d52ee8 ffff880627cc3f18
[66035.223771]  ffffffff8101294b ffff880627cc3f38 ffffffff811fb6cc 0000000000000000
[66035.224024] Call Trace:
[66035.224085]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.224199]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.224267]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.224356]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.224424]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.224492]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.224562]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.224629]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66035.224744]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.224813]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.224882]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.224951]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.225019]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.225086]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.225154]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.225221]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.225289]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.225357]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.225423] CPU6:
[66035.225501]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66035.225805]  ffff880c3fc83f60 00000000ffffffff ffff880627d52ee8 ffff880c3fc83f18
[66035.226109]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66035.226412] Call Trace:
[66035.226489]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.226628]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.226717]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.226835]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.226925]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66035.227013]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.227106]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.227196]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66035.227338]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66035.227426]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.227516]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66035.227604]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66035.227693]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66035.227785]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66035.227877]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66035.227968]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66035.228057]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66035.228144]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66035.228235]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66035.228326]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66035.228443]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66035.228538]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66035.228634]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66035.228756]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66035.228871]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66035.228965]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229068]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66035.229162]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66035.229257]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229361]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66035.229464]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66035.229560]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66035.229703]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229796]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.229883]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.229974]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.230064]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66037.220582] SysRq : Show backtrace of all active CPUs
[66039.220302] SysRq : Show backtrace of all active CPUs
[66039.220537] CPU0:
[66039.220598]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66039.220853]  ffff880627c03f60 00000000ffffffff ffff880627d92ee8 ffff880627c03f18
[66039.221107]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66039.221362] Call Trace:
[66039.221424]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.221539]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.221607]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.221694]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.221763]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.221832]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.221904]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.221973]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66039.227079]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.227148]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.227218]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.227289]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.227359]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.227428]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.227497]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.227565]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.227636]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.227705]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.227774] CPU1:
[66039.227836]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66039.228094]  ffff880627c43f60 00000000ffffffff ffff880627d92ee8 ffff880627c43f18
[66039.228350]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66039.228606] Call Trace:
[66039.228667]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.228782]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.228851]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.228937]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.229007]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.229076]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.229148]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.229216]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.229333]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.229403]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.229472]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.229543]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.229613]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.229682]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.229751]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.229819]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.229889]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.229959]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.230027] CPU3:
[66039.230089]  ffff880627cc3ed0 ffff880627cc3ee8 0000000000000046 0000000000000003
[66039.230351]  ffff880627cc3f60 00000000ffffffff ffff880627d92ee8 ffff880627cc3f18
[66039.230604]  ffffffff8101294b ffff880627cc3f38 ffffffff811fb6cc 0000000000000000
[66039.230859] Call Trace:
[66039.230919]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.231032]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.231100]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.231184]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.231252]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.231320]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.231390]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.231458]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.231575]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.231643]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.231711]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.231780]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.231848]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.231916]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.231984]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.232050]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.232119]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.232187]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.232254] CPU6:
[66039.232333]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66039.232641]  ffff880c3fc83f60 00000000ffffffff ffff880627d92ee8 ffff880c3fc83f18
[66039.232944]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66039.233247] Call Trace:
[66039.233325]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.233464]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.233552]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.233671]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.233761]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66039.233849]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.233941]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.234032]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66039.234174]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66039.234261]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.234353]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66039.234441]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66039.234530]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66039.234622]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66039.234714]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66039.234805]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66039.234894]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66039.234981]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66039.235071]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66039.235163]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66039.235280]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66039.235374]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66039.235470]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66039.235593]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66039.235708]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66039.235802]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.235906]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66039.236000]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66039.236096]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.236199]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66039.236302]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66039.236398]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66039.236494]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.236586]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.236674]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.236765]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.236855]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.236943] CPU9:
[66039.237005]  ffff880627d43ed0 ffff880627d43ee8 0000000000000046 0000000000000009
[66039.237262]  ffff880627d43f60 00000000ffffffff ffff880627d92ee8 ffff880627d43f18
[66039.237516]  ffffffff8101294b ffff880627d43f38 ffffffff811fb6cc 0000000000000000
[66039.237771] Call Trace:
[66039.237832]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.237945]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.238014]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.238100]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.238169]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.238238]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.238308]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.238377]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.238493]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.238563]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.238632]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.238703]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.238772]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.238841]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.238910]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.238978]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.239048]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.239117]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66041.227876] SysRq : Show backtrace of all active CPUs
[66043.227025] SysRq : Show backtrace of all active CPUs

Also:

# uptime
 12:59:03 up 18:18,  2 users,  load average: 27.00, 26.98, 24.39
# ps -ostat,pid,time,wchan='WCHAN-xxxxxxxxxxxxxxxxxxxx',cmd ax | grep ^R
R       23 00:00:00 -                          [migration/6]
R<    2571 00:18:36 -                          [arc_reclaim]
Rs    6113 00:00:00 ep_poll                    /usr/lib/postfix/master
RN    6908 00:04:40 -                          [z_rd_int/6]
RN    6910 00:05:43 taskq_thread               [z_rd_int/8]
RN    6924 00:03:21 -                          [z_wr_iss/6]
RN    6926 00:02:47 taskq_thread               [z_wr_iss/8]
RN    6945 00:01:17 -                          [z_wr_int/6]
RN    6947 00:01:20 taskq_thread               [z_wr_int/8]
RN    6968 00:00:05 taskq_thread               [z_fr_iss/8]
R    18853 00:00:06 -                          [kworker/6:2]
R+   21835 00:00:00 -                          ps -ostat,pid,time,wchan=WCHAN-xxxxxxxxxxxxxxxxxxxx,cmd ax

And, after a reboot and triggering the problem again (note: the "9 users" were all me with multiple windows open):

# uptime
 14:00:59 up 8 min,  9 users,  load average: 7.12, 2.61, 1.04
# ps -ostat,pid,time,wchan='WCHAN-xxxxxxxxxxxxxxxxxxxx',cmd ax 
STAT   PID     TIME WCHAN-xxxxxxxxxxxxxxxxxxxx CMD
Ss       1 00:00:02 poll_schedule_timeout      init [2]              
S        2 00:00:00 kthreadd                   [kthreadd]
S        3 00:00:00 run_ksoftirqd              [ksoftirqd/0]
S        4 00:00:00 worker_thread              [kworker/0:0]
S        5 00:00:00 worker_thread              [kworker/u:0]
S        6 00:00:00 cpu_stopper_thread         [migration/0]
S        7 00:00:00 cpu_stopper_thread         [migration/1]
S        9 00:00:00 run_ksoftirqd              [ksoftirqd/1]
S       10 00:00:00 worker_thread              [kworker/0:1]
S       11 00:00:00 cpu_stopper_thread         [migration/2]
S       12 00:00:00 worker_thread              [kworker/2:0]
S       13 00:00:00 run_ksoftirqd              [ksoftirqd/2]
S       14 00:00:00 cpu_stopper_thread         [migration/3]
S       16 00:00:00 run_ksoftirqd              [ksoftirqd/3]
S       17 00:00:00 cpu_stopper_thread         [migration/4]
S       18 00:00:00 worker_thread              [kworker/4:0]
S       19 00:00:00 run_ksoftirqd              [ksoftirqd/4]
S       20 00:00:00 cpu_stopper_thread         [migration/5]
S       21 00:00:00 worker_thread              [kworker/5:0]
S       22 00:00:00 run_ksoftirqd              [ksoftirqd/5]
S       23 00:00:00 cpu_stopper_thread         [migration/6]
S       24 00:00:00 worker_thread              [kworker/6:0]
S       25 00:00:00 run_ksoftirqd              [ksoftirqd/6]
S       26 00:00:00 cpu_stopper_thread         [migration/7]
S       28 00:00:00 run_ksoftirqd              [ksoftirqd/7]
S       29 00:00:00 cpu_stopper_thread         [migration/8]
S       30 00:00:00 worker_thread              [kworker/8:0]
S       31 00:00:00 run_ksoftirqd              [ksoftirqd/8]
S       32 00:00:00 cpu_stopper_thread         [migration/9]
S       34 00:00:00 run_ksoftirqd              [ksoftirqd/9]
S       35 00:00:00 cpu_stopper_thread         [migration/10]
S       36 00:00:00 worker_thread              [kworker/10:0]
S       37 00:00:00 run_ksoftirqd              [ksoftirqd/10]
S       38 00:00:00 cpu_stopper_thread         [migration/11]
S       39 00:00:00 worker_thread              [kworker/11:0]
S       40 00:00:00 run_ksoftirqd              [ksoftirqd/11]
S       41 00:00:00 cpu_stopper_thread         [migration/12]
S       43 00:00:00 run_ksoftirqd              [ksoftirqd/12]
S       44 00:00:00 cpu_stopper_thread         [migration/13]
S       45 00:00:00 worker_thread              [kworker/13:0]
S       46 00:00:00 run_ksoftirqd              [ksoftirqd/13]
S       47 00:00:00 cpu_stopper_thread         [migration/14]
S       49 00:00:00 run_ksoftirqd              [ksoftirqd/14]
S       50 00:00:00 cpu_stopper_thread         [migration/15]
S       51 00:00:00 worker_thread              [kworker/15:0]
S       52 00:00:00 run_ksoftirqd              [ksoftirqd/15]
S<      53 00:00:00 rescuer_thread             [cpuset]
S<      54 00:00:00 rescuer_thread             [khelper]
S       55 00:00:00 worker_thread              [kworker/u:1]
S      142 00:00:00 bdi_sync_supers            [sync_supers]
S      144 00:00:00 bdi_forker_thread          [bdi-default]
S<     146 00:00:00 rescuer_thread             [kblockd]
S      344 00:00:00 hub_thread                 [khubd]
S      615 00:00:00 kswapd                     [kswapd0]
S      616 00:00:00 kswapd                     [kswapd1]
SN     678 00:00:00 khugepaged                 [khugepaged]
S      679 00:00:00 fsnotify_mark_destroy      [fsnotify_mark]
S<     685 00:00:00 rescuer_thread             [crypto]
S      796 00:00:00 worker_thread              [kworker/1:1]
S      797 00:00:00 worker_thread              [kworker/2:1]
S      798 00:00:00 worker_thread              [kworker/3:1]
S      799 00:00:00 worker_thread              [kworker/5:1]
S      800 00:00:00 worker_thread              [kworker/6:1]
S      801 00:00:00 worker_thread              [kworker/7:1]
S      802 00:00:00 worker_thread              [kworker/8:1]
S      803 00:00:00 worker_thread              [kworker/9:1]
S      804 00:00:00 worker_thread              [kworker/10:1]
S      805 00:00:00 worker_thread              [kworker/11:1]
S      806 00:00:00 worker_thread              [kworker/12:1]
S      807 00:00:00 worker_thread              [kworker/13:1]
S      808 00:00:00 worker_thread              [kworker/14:1]
S      809 00:00:00 worker_thread              [kworker/15:1]
S<    1125 00:00:00 rescuer_thread             [kpsmoused]
S     1127 00:00:00 scsi_error_handler         [scsi_eh_0]
S<    1128 00:00:00 rescuer_thread             [fw_event0]
S<    1203 00:00:00 rescuer_thread             [poll_0_status]
S     1363 00:00:00 worker_thread              [kworker/u:2]
S     1372 00:00:00 worker_thread              [kworker/u:3]
S     1373 00:00:00 worker_thread              [kworker/u:4]
S     1379 00:00:00 worker_thread              [kworker/u:5]
S     1381 00:00:00 worker_thread              [kworker/u:7]
S     1382 00:00:00 worker_thread              [kworker/u:8]
S     1383 00:00:00 worker_thread              [kworker/u:9]
S     1384 00:00:00 worker_thread              [kworker/u:10]
S     1385 00:00:00 worker_thread              [kworker/u:11]
S     1386 00:00:00 worker_thread              [kworker/u:12]
D     1387 00:00:00 kthread_stop               [kworker/u:13]
S     1388 00:00:00 worker_thread              [kworker/u:14]
S     1389 00:00:00 worker_thread              [kworker/u:15]
S     1390 00:00:00 worker_thread              [kworker/u:16]
S     1391 00:00:00 worker_thread              [kworker/u:17]
S     1392 00:00:00 worker_thread              [kworker/u:18]
S     1394 00:00:00 worker_thread              [kworker/u:19]
S     1396 00:00:00 worker_thread              [kworker/u:20]
R     1397 00:00:00 worker_thread              [kworker/u:21]
S     1398 00:00:00 worker_thread              [kworker/u:22]
S     1399 00:00:00 worker_thread              [kworker/u:23]
S     1400 00:00:00 worker_thread              [kworker/u:24]
S     1402 00:00:00 worker_thread              [kworker/u:26]
S     1447 00:00:00 worker_thread              [kworker/0:3]
S     1460 00:00:00 worker_thread              [kworker/8:2]
R     1493 00:00:00 -                          [kworker/0:4]
S     1546 00:00:00 worker_thread              [kworker/4:2]
S     1566 00:00:00 worker_thread              [kworker/12:2]
S<    1610 00:00:00 rescuer_thread             [rpciod]
S<    1611 00:00:00 rescuer_thread             [nfsiod]
Ss    1672 00:00:00 ep_poll                    udevd --daemon
S     1834 00:00:00 worker_thread              [kworker/3:2]
S     2006 00:00:00 ep_poll                    udevd --daemon
S     2007 00:00:00 ep_poll                    udevd --daemon
S     2214 00:00:00 bdi_writeback_thread       [flush-0:16]
S     2289 00:00:00 worker_thread              [kworker/9:2]
S     2291 00:00:00 worker_thread              [kworker/14:2]
S     2292 00:00:00 worker_thread              [kworker/1:2]
S     2293 00:00:00 worker_thread              [kworker/7:2]
S<    2525 00:00:00 rescuer_thread             [kdmflush]
S<    2527 00:00:00 rescuer_thread             [kcryptd_io]
S<    2528 00:00:00 rescuer_thread             [kcryptd]
S<    2542 00:00:00 taskq_thread               [spl_system_task]
S<    2543 00:00:00 taskq_thread               [spl_system_task]
S<    2544 00:00:00 taskq_thread               [spl_system_task]
S<    2545 00:00:00 taskq_thread               [spl_system_task]
S<    2546 00:00:00 taskq_thread               [spl_system_task]
S<    2547 00:00:00 taskq_thread               [spl_system_task]
S<    2548 00:00:00 taskq_thread               [spl_system_task]
S<    2549 00:00:00 taskq_thread               [spl_system_task]
S<    2550 00:00:00 taskq_thread               [spl_system_task]
S<    2551 00:00:00 taskq_thread               [spl_system_task]
S<    2552 00:00:00 taskq_thread               [spl_system_task]
S<    2553 00:00:00 taskq_thread               [spl_system_task]
S<    2554 00:00:00 taskq_thread               [spl_system_task]
S<    2555 00:00:00 taskq_thread               [spl_system_task]
S<    2556 00:00:00 taskq_thread               [spl_system_task]
S<    2557 00:00:00 taskq_thread               [spl_system_task]
S<    2561 00:00:00 cv_timedwait_common        [arc_reclaim]
S<    2562 00:00:00 cv_timedwait_common        [l2arc_feed]
SN    2563 00:00:00 taskq_thread               [zvol/0]
SN    2564 00:00:00 taskq_thread               [zvol/1]
SN    2565 00:00:00 taskq_thread               [zvol/2]
SN    2566 00:00:00 taskq_thread               [zvol/3]
SN    2567 00:00:00 taskq_thread               [zvol/4]
SN    2568 00:00:00 taskq_thread               [zvol/5]
SN    2569 00:00:00 taskq_thread               [zvol/6]
SN    2570 00:00:00 taskq_thread               [zvol/7]
SN    2571 00:00:00 taskq_thread               [zvol/8]
SN    2572 00:00:00 taskq_thread               [zvol/9]
SN    2573 00:00:00 taskq_thread               [zvol/10]
SN    2574 00:00:00 taskq_thread               [zvol/11]
SN    2575 00:00:00 taskq_thread               [zvol/12]
SN    2576 00:00:00 taskq_thread               [zvol/13]
SN    2577 00:00:00 taskq_thread               [zvol/14]
SN    2578 00:00:00 taskq_thread               [zvol/15]
S<    2767 00:00:00 rescuer_thread             [kdmflush]
S<    2770 00:00:00 rescuer_thread             [kcryptd_io]
S<    2771 00:00:00 rescuer_thread             [kcryptd]
S<    2847 00:00:00 rescuer_thread             [kdmflush]
S<    2850 00:00:00 rescuer_thread             [kcryptd_io]
S<    2851 00:00:00 rescuer_thread             [kcryptd]
S<    2933 00:00:00 rescuer_thread             [kdmflush]
S<    2936 00:00:00 rescuer_thread             [kcryptd_io]
S<    2937 00:00:00 rescuer_thread             [kcryptd]
S<    3013 00:00:00 rescuer_thread             [kdmflush]
S<    3016 00:00:00 rescuer_thread             [kcryptd_io]
S<    3017 00:00:00 rescuer_thread             [kcryptd]
S<    3093 00:00:00 rescuer_thread             [kdmflush]
S<    3096 00:00:00 rescuer_thread             [kcryptd_io]
S<    3097 00:00:00 rescuer_thread             [kcryptd]
S<    3179 00:00:00 rescuer_thread             [kdmflush]
S<    3182 00:00:00 rescuer_thread             [kcryptd_io]
S<    3183 00:00:00 rescuer_thread             [kcryptd]
S<    3259 00:00:00 rescuer_thread             [kdmflush]
S<    3262 00:00:00 rescuer_thread             [kcryptd_io]
S<    3263 00:00:00 rescuer_thread             [kcryptd]
S<    3339 00:00:00 rescuer_thread             [kdmflush]
S<    3341 00:00:00 rescuer_thread             [kcryptd_io]
S<    3343 00:00:00 rescuer_thread             [kcryptd]
S<    3425 00:00:00 rescuer_thread             [kdmflush]
S<    3428 00:00:00 rescuer_thread             [kcryptd_io]
S<    3429 00:00:00 rescuer_thread             [kcryptd]
S<    3511 00:00:00 rescuer_thread             [kdmflush]
S<    3514 00:00:00 rescuer_thread             [kcryptd_io]
S<    3515 00:00:00 rescuer_thread             [kcryptd]
S<    3591 00:00:00 rescuer_thread             [kdmflush]
S<    3594 00:00:00 rescuer_thread             [kcryptd_io]
S<    3595 00:00:00 rescuer_thread             [kcryptd]
S<    3671 00:00:00 rescuer_thread             [kdmflush]
S<    3673 00:00:00 rescuer_thread             [kcryptd_io]
S<    3674 00:00:00 rescuer_thread             [kcryptd]
S<    3757 00:00:00 rescuer_thread             [kdmflush]
S<    3760 00:00:00 rescuer_thread             [kcryptd_io]
S<    3761 00:00:00 rescuer_thread             [kcryptd]
S<    3837 00:00:00 rescuer_thread             [kdmflush]
S<    3840 00:00:00 rescuer_thread             [kcryptd_io]
S<    3841 00:00:00 rescuer_thread             [kcryptd]
S<    3917 00:00:00 rescuer_thread             [kdmflush]
S<    3920 00:00:00 rescuer_thread             [kcryptd_io]
S<    3921 00:00:00 rescuer_thread             [kcryptd]
S<    4003 00:00:00 rescuer_thread             [kdmflush]
S<    4006 00:00:00 rescuer_thread             [kcryptd_io]
S<    4007 00:00:00 rescuer_thread             [kcryptd]
S<    4089 00:00:00 rescuer_thread             [kdmflush]
S<    4092 00:00:00 rescuer_thread             [kcryptd_io]
S<    4093 00:00:00 rescuer_thread             [kcryptd]
S<    4169 00:00:00 rescuer_thread             [kdmflush]
S<    4172 00:00:00 rescuer_thread             [kcryptd_io]
S<    4173 00:00:00 rescuer_thread             [kcryptd]
S<    4255 00:00:00 rescuer_thread             [kdmflush]
S<    4258 00:00:00 rescuer_thread             [kcryptd_io]
S<    4259 00:00:00 rescuer_thread             [kcryptd]
S<    4335 00:00:00 rescuer_thread             [kdmflush]
S<    4338 00:00:00 rescuer_thread             [kcryptd_io]
S<    4339 00:00:00 rescuer_thread             [kcryptd]
S<    4421 00:00:00 rescuer_thread             [kdmflush]
S<    4424 00:00:00 rescuer_thread             [kcryptd_io]
S<    4425 00:00:00 rescuer_thread             [kcryptd]
S<    4507 00:00:00 rescuer_thread             [kdmflush]
S<    4510 00:00:00 rescuer_thread             [kcryptd_io]
S<    4511 00:00:00 rescuer_thread             [kcryptd]
S<    4587 00:00:00 rescuer_thread             [kdmflush]
S<    4590 00:00:00 rescuer_thread             [kcryptd_io]
S<    4591 00:00:00 rescuer_thread             [kcryptd]
S<    4673 00:00:00 rescuer_thread             [kdmflush]
S<    4676 00:00:00 rescuer_thread             [kcryptd_io]
S<    4677 00:00:00 rescuer_thread             [kcryptd]
S<    4759 00:00:00 rescuer_thread             [kdmflush]
S<    4762 00:00:00 rescuer_thread             [kcryptd_io]
S<    4763 00:00:00 rescuer_thread             [kcryptd]
S<    4839 00:00:00 rescuer_thread             [kdmflush]
S<    4842 00:00:00 rescuer_thread             [kcryptd_io]
S<    4843 00:00:00 rescuer_thread             [kcryptd]
Ss    5727 00:00:00 poll_schedule_timeout      /sbin/rpcbind -w
Ss    5744 00:00:00 poll_schedule_timeout      /sbin/rpc.statd
Sl    5883 00:00:00 poll_schedule_timeout      /usr/sbin/rsyslogd -c5
Ss    5969 00:00:00 hrtimer_nanosleep          /usr/sbin/atd
Ss    5993 00:00:00 poll_schedule_timeout      /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 107:113
Ss    6001 00:00:00 hrtimer_nanosleep          /usr/sbin/cron
Ss    6005 00:00:00 poll_schedule_timeout      /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
Ss    6030 00:00:00 poll_schedule_timeout      /usr/bin/dbus-daemon --system
S<    6038 00:00:00 transport_processing_threa [LIO_rd_mcp]
Ss    6057 00:00:00 poll_schedule_timeout      /usr/sbin/sshd
Rs    6115 00:00:00 ep_poll                    /usr/lib/postfix/master
Z     6121 00:00:00 exit                       [pickup] <defunct>
S     6122 00:00:00 ep_poll                    qmgr -l -t fifo -u
Ss    6250 00:00:00 poll_schedule_timeout      /usr/sbin/munin-node
Ss+   6265 00:00:00 n_tty_read                 /sbin/getty 38400 tty1
Ss+   6266 00:00:00 n_tty_read                 /sbin/getty 38400 tty2
Ss+   6267 00:00:00 n_tty_read                 /sbin/getty 38400 tty3
Ss+   6268 00:00:00 n_tty_read                 /sbin/getty 38400 tty4
Ss+   6269 00:00:00 n_tty_read                 /sbin/getty 38400 tty5
Ss+   6270 00:00:00 n_tty_read                 /sbin/getty 38400 tty6
Rs    6745 00:00:00 poll_schedule_timeout      sshd: root@pts/0 
Ss    6749 00:00:00 wait                       -bash
S+    6755 00:00:00 hrtimer_nanosleep          zpool iostat 60
RN    6877 00:00:00 -                          [z_null_iss/0]
SN    6878 00:00:00 taskq_thread               [z_null_int/0]
SN    6879 00:00:00 taskq_thread               [z_rd_iss/0]
SN    6880 00:00:00 taskq_thread               [z_rd_iss/1]
SN    6881 00:00:00 taskq_thread               [z_rd_iss/2]
SN    6882 00:00:00 taskq_thread               [z_rd_iss/3]
SN    6883 00:00:00 taskq_thread               [z_rd_iss/4]
SN    6884 00:00:00 taskq_thread               [z_rd_iss/5]
SN    6885 00:00:00 taskq_thread               [z_rd_iss/6]
SN    6886 00:00:00 taskq_thread               [z_rd_iss/7]
SN    6887 00:00:00 taskq_thread               [z_rd_int/0]
SN    6888 00:00:00 taskq_thread               [z_rd_int/1]
SN    6889 00:00:00 taskq_thread               [z_rd_int/2]
SN    6890 00:00:00 taskq_thread               [z_rd_int/3]
SN    6891 00:00:00 taskq_thread               [z_rd_int/4]
SN    6892 00:00:00 taskq_thread               [z_rd_int/5]
SN    6893 00:00:00 taskq_thread               [z_rd_int/6]
SN    6894 00:00:00 taskq_thread               [z_rd_int/7]
SN    6895 00:00:00 taskq_thread               [z_rd_int/8]
SN    6896 00:00:00 taskq_thread               [z_rd_int/9]
SN    6897 00:00:00 taskq_thread               [z_rd_int/10]
SN    6898 00:00:00 taskq_thread               [z_rd_int/11]
SN    6899 00:00:00 taskq_thread               [z_rd_int/12]
SN    6900 00:00:00 taskq_thread               [z_rd_int/13]
SN    6901 00:00:00 taskq_thread               [z_rd_int/14]
SN    6902 00:00:00 taskq_thread               [z_rd_int/15]
SN    6903 00:00:00 taskq_thread               [z_wr_iss/0]
SN    6904 00:00:00 taskq_thread               [z_wr_iss/1]
SN    6905 00:00:00 taskq_thread               [z_wr_iss/2]
SN    6906 00:00:00 taskq_thread               [z_wr_iss/3]
SN    6907 00:00:00 taskq_thread               [z_wr_iss/4]
SN    6908 00:00:00 taskq_thread               [z_wr_iss/5]
SN    6909 00:00:00 taskq_thread               [z_wr_iss/6]
SN    6910 00:00:00 taskq_thread               [z_wr_iss/7]
SN    6911 00:00:00 taskq_thread               [z_wr_iss/8]
SN    6912 00:00:00 taskq_thread               [z_wr_iss/9]
SN    6913 00:00:00 taskq_thread               [z_wr_iss/10]
SN    6914 00:00:00 taskq_thread               [z_wr_iss/11]
SN    6915 00:00:00 taskq_thread               [z_wr_iss/12]
SN    6916 00:00:00 taskq_thread               [z_wr_iss/13]
SN    6917 00:00:00 taskq_thread               [z_wr_iss/14]
SN    6918 00:00:00 taskq_thread               [z_wr_iss/15]
SN    6919 00:00:00 taskq_thread               [z_wr_iss_h/0]
SN    6920 00:00:00 taskq_thread               [z_wr_iss_h/1]
SN    6921 00:00:00 taskq_thread               [z_wr_iss_h/2]
SN    6922 00:00:00 taskq_thread               [z_wr_iss_h/3]
SN    6923 00:00:00 taskq_thread               [z_wr_iss_h/4]
SN    6924 00:00:00 taskq_thread               [z_wr_int/0]
SN    6925 00:00:00 taskq_thread               [z_wr_int/1]
SN    6926 00:00:00 taskq_thread               [z_wr_int/2]
SN    6927 00:00:00 taskq_thread               [z_wr_int/3]
SN    6928 00:00:00 taskq_thread               [z_wr_int/4]
SN    6929 00:00:00 taskq_thread               [z_wr_int/5]
SN    6930 00:00:00 taskq_thread               [z_wr_int/6]
SN    6931 00:00:00 taskq_thread               [z_wr_int/7]
SN    6932 00:00:00 taskq_thread               [z_wr_int/8]
SN    6933 00:00:00 taskq_thread               [z_wr_int/9]
SN    6934 00:00:00 taskq_thread               [z_wr_int/10]
SN    6935 00:00:00 taskq_thread               [z_wr_int/11]
SN    6936 00:00:00 taskq_thread               [z_wr_int/12]
SN    6937 00:00:00 taskq_thread               [z_wr_int/13]
SN    6938 00:00:00 taskq_thread               [z_wr_int/14]
SN    6939 00:00:00 taskq_thread               [z_wr_int/15]
SN    6940 00:00:00 taskq_thread               [z_wr_int_h/0]
SN    6941 00:00:00 taskq_thread               [z_wr_int_h/1]
SN    6942 00:00:00 taskq_thread               [z_wr_int_h/2]
SN    6943 00:00:00 taskq_thread               [z_wr_int_h/3]
SN    6944 00:00:00 taskq_thread               [z_wr_int_h/4]
SN    6945 00:00:00 taskq_thread               [z_fr_iss/0]
SN    6946 00:00:00 taskq_thread               [z_fr_iss/1]
SN    6947 00:00:00 taskq_thread               [z_fr_iss/2]
SN    6948 00:00:00 taskq_thread               [z_fr_iss/3]
SN    6949 00:00:00 taskq_thread               [z_fr_iss/4]
SN    6950 00:00:00 taskq_thread               [z_fr_iss/5]
SN    6951 00:00:00 taskq_thread               [z_fr_iss/6]
SN    6952 00:00:00 taskq_thread               [z_fr_iss/7]
SN    6953 00:00:00 taskq_thread               [z_fr_iss/8]
SN    6954 00:00:00 taskq_thread               [z_fr_iss/9]
SN    6955 00:00:00 taskq_thread               [z_fr_iss/10]
SN    6956 00:00:00 taskq_thread               [z_fr_iss/11]
SN    6957 00:00:00 taskq_thread               [z_fr_iss/12]
SN    6958 00:00:00 taskq_thread               [z_fr_iss/13]
SN    6959 00:00:00 taskq_thread               [z_fr_iss/14]
SN    6960 00:00:00 taskq_thread               [z_fr_iss/15]
SN    6961 00:00:00 taskq_thread               [z_fr_int/0]
SN    6962 00:00:00 taskq_thread               [z_cl_iss/0]
SN    6963 00:00:00 taskq_thread               [z_cl_int/0]
SN    6964 00:00:00 taskq_thread               [z_ioctl_iss/0]
SN    6965 00:00:00 taskq_thread               [z_ioctl_int/0]
S<    6996 00:00:00 taskq_thread               [zfs_iput_taskq/]
S<    6997 00:00:00 cv_wait_common             [txg_quiesce]
D<    6998 00:00:00 cv_wait_common             [txg_sync]
Ss    6999 00:00:00 poll_schedule_timeout      sshd: root@pts/1 
Ss    7003 00:00:00 wait                       -bash
S<    7015 00:00:00 taskq_thread               [zil_clean/0]
S<    7021 00:00:00 taskq_thread               [zil_clean/0]
S<    7027 00:00:00 taskq_thread               [zil_clean/0]
S<    7033 00:00:00 taskq_thread               [zil_clean/0]
Ss    7035 00:00:00 poll_schedule_timeout      sshd: root@pts/2 
Ss+   7039 00:00:00 n_tty_read                 -bash
Ss    7045 00:00:00 poll_schedule_timeout      sshd: root@pts/3 
Ss+   7049 00:00:00 n_tty_read                 -bash
Ss    7055 00:00:00 poll_schedule_timeout      sshd: root@pts/4 
Ss+   7059 00:00:00 n_tty_read                 -bash
Ss    7065 00:00:00 poll_schedule_timeout      sshd: root@pts/5 
Ss    7069 00:00:00 wait                       -bash
Ss    7075 00:00:00 poll_schedule_timeout      sshd: root@pts/6 
Ss+   7079 00:00:00 n_tty_read                 -bash
Ss    7085 00:00:00 poll_schedule_timeout      sshd: root@pts/7 
Ss+   7089 00:00:00 n_tty_read                 -bash
Ss    7095 00:00:00 poll_schedule_timeout      sshd: root@pts/8 
Ss    7099 00:00:00 wait                       -bash
D+    7105 00:02:34 -                          getfattr -dh /pool/rsnapshot/colossus/storage/rsnapshot/.sync/colossus/etc/ssl/certs/Entrust.ne
D+    7125 00:00:00 sleep_on_page              tail -f /var/log/kern.log
D     7126 00:00:00 rpc_wait_bit_killable      /USR/SBIN/CRON
D     7127 00:00:00 sleep_on_page              /usr/sbin/munin-node [10.200.63.5]
Ds    7128 00:00:00 sleep_on_page              /usr/sbin/munin-node [10.200.63.5]
R+    7131 00:00:00 -                          ps -ostat,pid,time,wchan=WCHAN-xxxxxxxxxxxxxxxxxxxx,cmd ax

machtech · 2012-02-11T06:55:52Z

Well done Chris, and no - I am not using xattr's
Paul

-----Original Message-----
From: chrisrd [mailto:[email protected]]
Sent: Saturday, 11 February 2012 1:42 PM
To: Paul Pettigrew
Subject: Re: [zfs] CPU stall / hung - only able to clear via hard power cycle (#557)

OK, looks like I can reproduce the stalls at will by doing a getfattr -dh a_particular_symlink. Note that this filesystem is used as a gluster "brick" so all dirs and files (including symlinks) have xattrs, and at one point it was mounted xattr=sa (ref: #513 (comment)). Man, I'm lovin' those xattrs! (ref: #503, #513)

Hmmm... Paul, are you using xattrs?

I had an open window to the system which was still responsive (for a while) and I was able to get some stack traces using echo l > /proc/sysrq-trigger:

[63729.740222] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=15002 jiffies)
[63909.596496] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=60034 jiffies)
[64089.452753] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=105066 jiffies)
[64269.309020] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=150098 jiffies)
[64449.165298] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=195130 jiffies)
[64629.021552] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=240162 jiffies)
[64808.877821] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=285194 jiffies)
[64988.734134] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=330226 jiffies)
[65168.590498] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=375258 jiffies)
[65348.446856] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=420290 jiffies)
[65528.303221] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=465322 jiffies)
[65622.672282] SysRq : Show backtrace of all active CPUs
[65622.672684] CPU6:
[65622.672762]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[65622.673067]  ffff880c3fc83f60 00000000ffffffff ffff880627c12ee8 ffff880c3fc83f18
[65622.673370]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[65622.673673] Call Trace:
[65622.673751]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[65622.673898]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[65622.673990]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[65622.674109]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[65622.674200]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[65622.674290]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[65622.674385]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[65622.674475]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[65622.674617]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[65622.674705]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[65622.674795]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[65622.674883]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[65622.674974]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[65622.675067]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[65622.675159]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[65622.675250]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[65622.675339]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[65622.675426]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[65622.675522]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[65622.675614]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[65622.675740]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[65622.675834]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[65622.675931]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[65622.676053]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[65622.676167]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[65622.676261]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.676364]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[65622.676459]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[65622.676554]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.676659]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[65622.676762]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[65622.676858]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[65622.676955]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[65622.677049]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[65622.677136]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[65622.677227]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[65622.677318]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[65708.159577] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=510354 jiffies)
[65888.015937] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 8} (detected by 13, t=555386 jiffies)
[66023.207799] SysRq : Show backtrace of all active CPUs
[66025.212028] SysRq : Show backtrace of all active CPUs
[66025.212269] CPU0:
[66025.212332]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66025.212587]  ffff880627c03f60 00000000ffffffff ffff880627c92ee8 ffff880627c03f18
[66025.212843]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66025.213098] Call Trace:
[66025.213160]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66025.213280]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66025.213351]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66025.213437]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.213506]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.213577]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66025.213650]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66025.213719]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66025.213836]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66025.213906]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66025.213975]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66025.214047]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66025.214117]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66025.214186]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66025.214255]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66025.214324]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66025.214394]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66025.214464]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66025.214532] CPU6:
[66025.214610]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66025.214916]  ffff880c3fc83f60 00000000ffffffff ffff880627c92ee8 ffff880c3fc83f18
[66025.215219]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66025.215522] Call Trace:
[66025.215600]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66025.215739]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66025.215828]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66025.215946]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66025.216036]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66025.222051]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66025.222144]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66025.222235]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66025.222377]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66025.222464]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66025.222554]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66025.222642]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66025.222731]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66025.222824]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66025.222915]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66025.223006]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66025.223095]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66025.223183]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66025.223273]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66025.223365]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66025.223484]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66025.223578]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66025.223674]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66025.223797]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66025.223911]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66025.224005]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224108]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66025.224203]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66025.224298]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224401]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66025.224505]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66025.224600]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66025.224697]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66025.224789]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66025.224877]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66025.224968]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66025.225059]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66027.213815] SysRq : Show backtrace of all active CPUs
[66027.214035] CPU0:
[66027.214097]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66027.214353]  ffff880627c03f60 00000000ffffffff ffff880627c52ee8 ffff880627c03f18
[66027.214608]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66027.214863] Call Trace:
[66027.214924]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66027.215039]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66027.215108]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66027.215194]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.215263]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.215333]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66027.215404]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66027.215473]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66027.215591]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66027.215661]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66027.215730]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66027.215801]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66027.215871]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66027.215940]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66027.216009]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66027.216077]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66027.216148]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66027.216217]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66027.216285] CPU6:
[66027.216362]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66027.216666]  ffff880c3fc83f60 00000000ffffffff ffff880627c52ee8 ffff880c3fc83f18
[66027.216969]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66027.217272] Call Trace:
[66027.217350]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66027.217489]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66027.217578]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66027.217697]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66027.217787]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66027.217875]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66027.217968]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66027.218059]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66027.218202]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66027.218289]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66027.218380]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66027.218467]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66027.218557]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66027.218649]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66027.218740]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66027.218831]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66027.218920]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66027.219008]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66027.219098]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66027.219189]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66027.219306]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66027.219401]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66027.219497]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66027.219619]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66027.219734]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66027.219828]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.219931]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66027.220026]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66027.220121]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.220224]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66027.220328]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66027.220423]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66027.220519]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66027.220611]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66027.220699]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66027.220789]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66027.220880]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66029.215880] SysRq : Show backtrace of all active CPUs
[66031.216303] SysRq : Show backtrace of all active CPUs
[66031.216526] CPU0:
[66031.216588]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66031.216844]  ffff880627c03f60 00000000ffffffff ffff880627cd2ee8 ffff880627c03f18
[66031.217099]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66031.217354] Call Trace:
[66031.217416]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.217531]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.217600]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.217686]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.217755]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.217825]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.217896]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.217965]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66031.218082]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66031.218152]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.218222]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66031.218292]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66031.218362]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66031.218431]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66031.218501]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.218569]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.218639]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.218709]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66031.218776] CPU1:
[66031.218837]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66031.219089]  ffff880627c43f60 00000000ffffffff ffff880627cd2ee8 ffff880627c43f18
[66031.219341]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66031.219593] Call Trace:
[66031.219654]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.219769]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.219837]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.219924]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.219994]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.220062]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.220131]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.220199]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66031.220314]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66031.220382]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.220450]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66031.220519]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66031.220587]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66031.220655]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66031.220722]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.220789]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.220857]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.220925]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66031.220992] CPU6:
[66031.221069]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66031.221372]  ffff880c3fc83f60 00000000ffffffff ffff880627cd2ee8 ffff880c3fc83f18
[66031.221675]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66031.221977] Call Trace:
[66031.222055]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66031.222194]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66031.222282]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66031.222402]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66031.222492]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66031.222579]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66031.222672]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66031.228680]  <EOI>  [<ffffffff81079b16>] ? smp_call_function_many+0x1e4/0x1f5
[66031.228822]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66031.228910]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66031.229000]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66031.229088]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66031.229177]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66031.229269]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66031.229361]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66031.229452]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66031.229542]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66031.229630]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66031.229720]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66031.229812]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66031.229929]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66031.230024]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66031.230120]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66031.230242]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66031.230357]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66031.230451]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.230554]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66031.230649]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66031.230744]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.230847]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66031.230951]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66031.231046]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66031.231142]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66031.231234]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66031.231322]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66031.231413]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66031.231503]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66033.218245] SysRq : Show backtrace of all active CPUs
[66035.218500] SysRq : Show backtrace of all active CPUs
[66035.218742] CPU0:
[66035.218803]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66035.219059]  ffff880627c03f60 00000000ffffffff ffff880627d52ee8 ffff880627c03f18
[66035.219314]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66035.219570] Call Trace:
[66035.219631]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.219746]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.219815]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.219901]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.219970]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.220039]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.220110]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.220179]  <EOI>  [<ffffffff81079a3e>] ? smp_call_function_many+0x10c/0x1f5
[66035.220296]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.220366]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.220435]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.220506]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.220576]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.220645]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.220714]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.220782]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.220853]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.220922]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.220990] CPU1:
[66035.221051]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66035.221303]  ffff880627c43f60 00000000ffffffff ffff880627d52ee8 ffff880627c43f18
[66035.221555]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66035.221815] Call Trace:
[66035.221875]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.221988]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.222056]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.222140]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.222208]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.222276]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.222346]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.222414]  <EOI>  [<ffffffff81079a3e>] ? smp_call_function_many+0x10c/0x1f5
[66035.222528]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.222596]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.222664]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.222733]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.222801]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.222869]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.222936]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.223003]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.223071]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.223139]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.223206] CPU3:
[66035.223267]  ffff880627cc3ed0 ffff880627cc3ee8 0000000000000046 0000000000000003
[66035.223520]  ffff880627cc3f60 00000000ffffffff ffff880627d52ee8 ffff880627cc3f18
[66035.223771]  ffffffff8101294b ffff880627cc3f38 ffffffff811fb6cc 0000000000000000
[66035.224024] Call Trace:
[66035.224085]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.224199]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.224267]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.224356]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.224424]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.224492]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.224562]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.224629]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66035.224744]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66035.224813]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.224882]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66035.224951]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66035.225019]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66035.225086]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66035.225154]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.225221]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.225289]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.225357]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66035.225423] CPU6:
[66035.225501]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66035.225805]  ffff880c3fc83f60 00000000ffffffff ffff880627d52ee8 ffff880c3fc83f18
[66035.226109]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66035.226412] Call Trace:
[66035.226489]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66035.226628]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66035.226717]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66035.226835]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66035.226925]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66035.227013]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66035.227106]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66035.227196]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66035.227338]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66035.227426]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66035.227516]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66035.227604]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66035.227693]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66035.227785]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66035.227877]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66035.227968]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66035.228057]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66035.228144]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66035.228235]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66035.228326]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66035.228443]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66035.228538]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66035.228634]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66035.228756]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66035.228871]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66035.228965]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229068]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66035.229162]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66035.229257]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229361]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66035.229464]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66035.229560]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66035.229703]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66035.229796]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66035.229883]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66035.229974]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66035.230064]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66037.220582] SysRq : Show backtrace of all active CPUs
[66039.220302] SysRq : Show backtrace of all active CPUs
[66039.220537] CPU0:
[66039.220598]  ffff880627c03ed0 ffff880627c03ee8 0000000000000046 0000000000000000
[66039.220853]  ffff880627c03f60 00000000ffffffff ffff880627d92ee8 ffff880627c03f18
[66039.221107]  ffffffff8101294b ffff880627c03f38 ffffffff811fb6cc 0000000000000000
[66039.221362] Call Trace:
[66039.221424]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.221539]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.221607]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.221694]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.221763]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.221832]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.221904]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.221973]  <EOI>  [<ffffffff81079a46>] ? smp_call_function_many+0x114/0x1f5
[66039.227079]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.227148]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.227218]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.227289]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.227359]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.227428]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.227497]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.227565]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.227636]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.227705]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.227774] CPU1:
[66039.227836]  ffff880627c43ed0 ffff880627c43ee8 0000000000000046 0000000000000001
[66039.228094]  ffff880627c43f60 00000000ffffffff ffff880627d92ee8 ffff880627c43f18
[66039.228350]  ffffffff8101294b ffff880627c43f38 ffffffff811fb6cc 0000000000000000
[66039.228606] Call Trace:
[66039.228667]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.228782]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.228851]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.228937]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.229007]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.229076]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.229148]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.229216]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.229333]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.229403]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.229472]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.229543]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.229613]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.229682]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.229751]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.229819]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.229889]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.229959]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.230027] CPU3:
[66039.230089]  ffff880627cc3ed0 ffff880627cc3ee8 0000000000000046 0000000000000003
[66039.230351]  ffff880627cc3f60 00000000ffffffff ffff880627d92ee8 ffff880627cc3f18
[66039.230604]  ffffffff8101294b ffff880627cc3f38 ffffffff811fb6cc 0000000000000000
[66039.230859] Call Trace:
[66039.230919]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.231032]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.231100]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.231184]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.231252]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.231320]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.231390]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.231458]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.231575]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.231643]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.231711]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.231780]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.231848]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.231916]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.231984]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.232050]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.232119]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.232187]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.232254] CPU6:
[66039.232333]  ffff880c3fc83ed0 ffff880c3fc83ee8 0000000000000046 0000000000000006
[66039.232641]  ffff880c3fc83f60 00000000ffffffff ffff880627d92ee8 ffff880c3fc83f18
[66039.232944]  ffffffff8101294b ffff880c3fc83f38 ffffffff811fb6cc 000000000000000f
[66039.233247] Call Trace:
[66039.233325]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.233464]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.233552]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.233671]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.233761]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66039.233849]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.233941]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.234032]  <EOI>  [<ffffffff81079b10>] ? smp_call_function_many+0x1de/0x1f5
[66039.234174]  [<ffffffff810344be>] ? leave_mm+0x3d/0x3d
[66039.234261]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.234353]  [<ffffffff81079b5e>] ? on_each_cpu+0x16/0x45
[66039.234441]  [<ffffffff81034343>] ? flush_tlb_all+0x17/0x19
[66039.234530]  [<ffffffff810e5c14>] ? __purge_vmap_area_lazy+0x132/0x18d
[66039.234622]  [<ffffffff810e5cc3>] ? free_vmap_area_noflush+0x54/0x5b
[66039.234714]  [<ffffffff810e71dc>] ? free_unmap_vmap_area+0x20/0x24
[66039.234805]  [<ffffffff810e7244>] ? remove_vm_area+0x64/0x78
[66039.234894]  [<ffffffff810e7353>] ? __vunmap+0x39/0xb8
[66039.234981]  [<ffffffff810e7423>] ? vfree+0x29/0x2b
[66039.235071]  [<ffffffffa02c69ea>] ? kv_free+0x33/0x35 [spl]
[66039.235163]  [<ffffffffa02c6be6>] ? spl_slab_reclaim+0x1fa/0x223 [spl]
[66039.235280]  [<ffffffffa03edf8e>] ? zpl_inode_alloc+0x70/0x70 [zfs]
[66039.235374]  [<ffffffffa02c6c5b>] ? spl_kmem_cache_reap_now+0x4c/0x5f [spl]
[66039.235470]  [<ffffffffa02c6cac>] ? __spl_kmem_cache_generic_shrinker+0x3e/0x89 [spl]
[66039.235593]  [<ffffffffa02c6d24>] ? spl_kmem_reap+0x22/0x24 [spl]
[66039.235708]  [<ffffffffa03edfc3>] ? zpl_prune_sbs+0x35/0x37 [zfs]
[66039.235802]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.235906]  [<ffffffffa036e9de>] ? arc_adjust_meta+0x135/0x194 [zfs]
[66039.236000]  [<ffffffffa02c6792>] ? spl_kmem_availrmem+0x19/0x1f [spl]
[66039.236096]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.236199]  [<ffffffffa036f670>] ? arc_reclaim_thread+0xb3/0x11e [zfs]
[66039.236302]  [<ffffffffa036f5bd>] ? arc_do_user_evicts+0xc4/0xc4 [zfs]
[66039.236398]  [<ffffffffa02c81ed>] ? thread_generic_wrapper+0x6c/0x79 [spl]
[66039.236494]  [<ffffffffa02c8181>] ? __thread_create+0x125/0x125 [spl]
[66039.236586]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.236674]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.236765]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.236855]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66039.236943] CPU9:
[66039.237005]  ffff880627d43ed0 ffff880627d43ee8 0000000000000046 0000000000000009
[66039.237262]  ffff880627d43f60 00000000ffffffff ffff880627d92ee8 ffff880627d43f18
[66039.237516]  ffffffff8101294b ffff880627d43f38 ffffffff811fb6cc 0000000000000000
[66039.237771] Call Trace:
[66039.237832]  <IRQ>  [<ffffffff8101294b>] ? show_stack+0x17/0x19
[66039.237945]  [<ffffffff811fb6cc>] ? showacpu+0x48/0x5c
[66039.238014]  [<ffffffff81079c2c>] ? generic_smp_call_function_interrupt+0x9f/0x167
[66039.238100]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.238169]  [<ffffffff811fb684>] ? register_sysrq_key+0xd/0xd
[66039.238238]  [<ffffffff81026488>] ? smp_call_function_interrupt+0x22/0x31
[66039.238308]  [<ffffffff812cfb5e>] ? call_function_interrupt+0x6e/0x80
[66039.238377]  <EOI>  [<ffffffff81079a40>] ? smp_call_function_many+0x10e/0x1f5
[66039.238493]  [<ffffffff811fb4b7>] ? sysrq_handle_showmem+0xd/0xd
[66039.238563]  [<ffffffff81079b44>] ? smp_call_function+0x1d/0x21
[66039.238632]  [<ffffffff811fb4cb>] ? sysrq_showregs_othercpus+0x14/0x16
[66039.238703]  [<ffffffff81063a37>] ? process_one_work+0x256/0x3c8
[66039.238772]  [<ffffffff81063fb4>] ? worker_thread+0x1d9/0x359
[66039.238841]  [<ffffffff81063ddb>] ? rescuer_thread+0x232/0x232
[66039.238910]  [<ffffffff81066fd0>] ? kthread+0x7d/0x85
[66039.238978]  [<ffffffff812cff74>] ? kernel_thread_helper+0x4/0x10
[66039.239048]  [<ffffffff81066f53>] ? kthread_worker_fn+0x146/0x146
[66039.239117]  [<ffffffff812cff70>] ? gs_change+0x13/0x13
[66041.227876] SysRq : Show backtrace of all active CPUs
[66043.227025] SysRq : Show backtrace of all active CPUs

Also:

# uptime
 12:59:03 up 18:18,  2 users,  load average: 27.00, 26.98, 24.39
# ps -ostat,pid,time,wchan='WCHAN-xxxxxxxxxxxxxxxxxxxx',cmd ax | grep ^R
R       23 00:00:00 -                          [migration/6]
R<    2571 00:18:36 -                          [arc_reclaim]
Rs    6113 00:00:00 ep_poll                    /usr/lib/postfix/master
RN    6908 00:04:40 -                          [z_rd_int/6]
RN    6910 00:05:43 taskq_thread               [z_rd_int/8]
RN    6924 00:03:21 -                          [z_wr_iss/6]
RN    6926 00:02:47 taskq_thread               [z_wr_iss/8]
RN    6945 00:01:17 -                          [z_wr_int/6]
RN    6947 00:01:20 taskq_thread               [z_wr_int/8]
RN    6968 00:00:05 taskq_thread               [z_fr_iss/8]
R    18853 00:00:06 -                          [kworker/6:2]
R+   21835 00:00:00 -                          ps -ostat,pid,time,wchan=WCHAN-xxxxxxxxxxxxxxxxxxxx,cmd ax

And, after a reboot and triggering the problem again (note: the "9 users" were all me with multiple windows open):

# uptime
 14:00:59 up 8 min,  9 users,  load average: 7.12, 2.61, 1.04
# ps -ostat,pid,time,wchan='WCHAN-xxxxxxxxxxxxxxxxxxxx',cmd ax
STAT   PID     TIME WCHAN-xxxxxxxxxxxxxxxxxxxx CMD
Ss       1 00:00:02 poll_schedule_timeout      init [2]
S        2 00:00:00 kthreadd                   [kthreadd]
S        3 00:00:00 run_ksoftirqd              [ksoftirqd/0]
S        4 00:00:00 worker_thread              [kworker/0:0]
S        5 00:00:00 worker_thread              [kworker/u:0]
S        6 00:00:00 cpu_stopper_thread         [migration/0]
S        7 00:00:00 cpu_stopper_thread         [migration/1]
S        9 00:00:00 run_ksoftirqd              [ksoftirqd/1]
S       10 00:00:00 worker_thread              [kworker/0:1]
S       11 00:00:00 cpu_stopper_thread         [migration/2]
S       12 00:00:00 worker_thread              [kworker/2:0]
S       13 00:00:00 run_ksoftirqd              [ksoftirqd/2]
S       14 00:00:00 cpu_stopper_thread         [migration/3]
S       16 00:00:00 run_ksoftirqd              [ksoftirqd/3]
S       17 00:00:00 cpu_stopper_thread         [migration/4]
S       18 00:00:00 worker_thread              [kworker/4:0]
S       19 00:00:00 run_ksoftirqd              [ksoftirqd/4]
S       20 00:00:00 cpu_stopper_thread         [migration/5]
S       21 00:00:00 worker_thread              [kworker/5:0]
S       22 00:00:00 run_ksoftirqd              [ksoftirqd/5]
S       23 00:00:00 cpu_stopper_thread         [migration/6]
S       24 00:00:00 worker_thread              [kworker/6:0]
S       25 00:00:00 run_ksoftirqd              [ksoftirqd/6]
S       26 00:00:00 cpu_stopper_thread         [migration/7]
S       28 00:00:00 run_ksoftirqd              [ksoftirqd/7]
S       29 00:00:00 cpu_stopper_thread         [migration/8]
S       30 00:00:00 worker_thread              [kworker/8:0]
S       31 00:00:00 run_ksoftirqd              [ksoftirqd/8]
S       32 00:00:00 cpu_stopper_thread         [migration/9]
S       34 00:00:00 run_ksoftirqd              [ksoftirqd/9]
S       35 00:00:00 cpu_stopper_thread         [migration/10]
S       36 00:00:00 worker_thread              [kworker/10:0]
S       37 00:00:00 run_ksoftirqd              [ksoftirqd/10]
S       38 00:00:00 cpu_stopper_thread         [migration/11]
S       39 00:00:00 worker_thread              [kworker/11:0]
S       40 00:00:00 run_ksoftirqd              [ksoftirqd/11]
S       41 00:00:00 cpu_stopper_thread         [migration/12]
S       43 00:00:00 run_ksoftirqd              [ksoftirqd/12]
S       44 00:00:00 cpu_stopper_thread         [migration/13]
S       45 00:00:00 worker_thread              [kworker/13:0]
S       46 00:00:00 run_ksoftirqd              [ksoftirqd/13]
S       47 00:00:00 cpu_stopper_thread         [migration/14]
S       49 00:00:00 run_ksoftirqd              [ksoftirqd/14]
S       50 00:00:00 cpu_stopper_thread         [migration/15]
S       51 00:00:00 worker_thread              [kworker/15:0]
S       52 00:00:00 run_ksoftirqd              [ksoftirqd/15]
S<      53 00:00:00 rescuer_thread             [cpuset]
S<      54 00:00:00 rescuer_thread             [khelper]
S       55 00:00:00 worker_thread              [kworker/u:1]
S      142 00:00:00 bdi_sync_supers            [sync_supers]
S      144 00:00:00 bdi_forker_thread          [bdi-default]
S<     146 00:00:00 rescuer_thread             [kblockd]
S      344 00:00:00 hub_thread                 [khubd]
S      615 00:00:00 kswapd                     [kswapd0]
S      616 00:00:00 kswapd                     [kswapd1]
SN     678 00:00:00 khugepaged                 [khugepaged]
S      679 00:00:00 fsnotify_mark_destroy      [fsnotify_mark]
S<     685 00:00:00 rescuer_thread             [crypto]
S      796 00:00:00 worker_thread              [kworker/1:1]
S      797 00:00:00 worker_thread              [kworker/2:1]
S      798 00:00:00 worker_thread              [kworker/3:1]
S      799 00:00:00 worker_thread              [kworker/5:1]
S      800 00:00:00 worker_thread              [kworker/6:1]
S      801 00:00:00 worker_thread              [kworker/7:1]
S      802 00:00:00 worker_thread              [kworker/8:1]
S      803 00:00:00 worker_thread              [kworker/9:1]
S      804 00:00:00 worker_thread              [kworker/10:1]
S      805 00:00:00 worker_thread              [kworker/11:1]
S      806 00:00:00 worker_thread              [kworker/12:1]
S      807 00:00:00 worker_thread              [kworker/13:1]
S      808 00:00:00 worker_thread              [kworker/14:1]
S      809 00:00:00 worker_thread              [kworker/15:1]
S<    1125 00:00:00 rescuer_thread             [kpsmoused]
S     1127 00:00:00 scsi_error_handler         [scsi_eh_0]
S<    1128 00:00:00 rescuer_thread             [fw_event0]
S<    1203 00:00:00 rescuer_thread             [poll_0_status]
S     1363 00:00:00 worker_thread              [kworker/u:2]
S     1372 00:00:00 worker_thread              [kworker/u:3]
S     1373 00:00:00 worker_thread              [kworker/u:4]
S     1379 00:00:00 worker_thread              [kworker/u:5]
S     1381 00:00:00 worker_thread              [kworker/u:7]
S     1382 00:00:00 worker_thread              [kworker/u:8]
S     1383 00:00:00 worker_thread              [kworker/u:9]
S     1384 00:00:00 worker_thread              [kworker/u:10]
S     1385 00:00:00 worker_thread              [kworker/u:11]
S     1386 00:00:00 worker_thread              [kworker/u:12]
D     1387 00:00:00 kthread_stop               [kworker/u:13]
S     1388 00:00:00 worker_thread              [kworker/u:14]
S     1389 00:00:00 worker_thread              [kworker/u:15]
S     1390 00:00:00 worker_thread              [kworker/u:16]
S     1391 00:00:00 worker_thread              [kworker/u:17]
S     1392 00:00:00 worker_thread              [kworker/u:18]
S     1394 00:00:00 worker_thread              [kworker/u:19]
S     1396 00:00:00 worker_thread              [kworker/u:20]
R     1397 00:00:00 worker_thread              [kworker/u:21]
S     1398 00:00:00 worker_thread              [kworker/u:22]
S     1399 00:00:00 worker_thread              [kworker/u:23]
S     1400 00:00:00 worker_thread              [kworker/u:24]
S     1402 00:00:00 worker_thread              [kworker/u:26]
S     1447 00:00:00 worker_thread              [kworker/0:3]
S     1460 00:00:00 worker_thread              [kworker/8:2]
R     1493 00:00:00 -                          [kworker/0:4]
S     1546 00:00:00 worker_thread              [kworker/4:2]
S     1566 00:00:00 worker_thread              [kworker/12:2]
S<    1610 00:00:00 rescuer_thread             [rpciod]
S<    1611 00:00:00 rescuer_thread             [nfsiod]
Ss    1672 00:00:00 ep_poll                    udevd --daemon
S     1834 00:00:00 worker_thread              [kworker/3:2]
S     2006 00:00:00 ep_poll                    udevd --daemon
S     2007 00:00:00 ep_poll                    udevd --daemon
S     2214 00:00:00 bdi_writeback_thread       [flush-0:16]
S     2289 00:00:00 worker_thread              [kworker/9:2]
S     2291 00:00:00 worker_thread              [kworker/14:2]
S     2292 00:00:00 worker_thread              [kworker/1:2]
S     2293 00:00:00 worker_thread              [kworker/7:2]
S<    2525 00:00:00 rescuer_thread             [kdmflush]
S<    2527 00:00:00 rescuer_thread             [kcryptd_io]
S<    2528 00:00:00 rescuer_thread             [kcryptd]
S<    2542 00:00:00 taskq_thread               [spl_system_task]
S<    2543 00:00:00 taskq_thread               [spl_system_task]
S<    2544 00:00:00 taskq_thread               [spl_system_task]
S<    2545 00:00:00 taskq_thread               [spl_system_task]
S<    2546 00:00:00 taskq_thread               [spl_system_task]
S<    2547 00:00:00 taskq_thread               [spl_system_task]
S<    2548 00:00:00 taskq_thread               [spl_system_task]
S<    2549 00:00:00 taskq_thread               [spl_system_task]
S<    2550 00:00:00 taskq_thread               [spl_system_task]
S<    2551 00:00:00 taskq_thread               [spl_system_task]
S<    2552 00:00:00 taskq_thread               [spl_system_task]
S<    2553 00:00:00 taskq_thread               [spl_system_task]
S<    2554 00:00:00 taskq_thread               [spl_system_task]
S<    2555 00:00:00 taskq_thread               [spl_system_task]
S<    2556 00:00:00 taskq_thread               [spl_system_task]
S<    2557 00:00:00 taskq_thread               [spl_system_task]
S<    2561 00:00:00 cv_timedwait_common        [arc_reclaim]
S<    2562 00:00:00 cv_timedwait_common        [l2arc_feed]
SN    2563 00:00:00 taskq_thread               [zvol/0]
SN    2564 00:00:00 taskq_thread               [zvol/1]
SN    2565 00:00:00 taskq_thread               [zvol/2]
SN    2566 00:00:00 taskq_thread               [zvol/3]
SN    2567 00:00:00 taskq_thread               [zvol/4]
SN    2568 00:00:00 taskq_thread               [zvol/5]
SN    2569 00:00:00 taskq_thread               [zvol/6]
SN    2570 00:00:00 taskq_thread               [zvol/7]
SN    2571 00:00:00 taskq_thread               [zvol/8]
SN    2572 00:00:00 taskq_thread               [zvol/9]
SN    2573 00:00:00 taskq_thread               [zvol/10]
SN    2574 00:00:00 taskq_thread               [zvol/11]
SN    2575 00:00:00 taskq_thread               [zvol/12]
SN    2576 00:00:00 taskq_thread               [zvol/13]
SN    2577 00:00:00 taskq_thread               [zvol/14]
SN    2578 00:00:00 taskq_thread               [zvol/15]
S<    2767 00:00:00 rescuer_thread             [kdmflush]
S<    2770 00:00:00 rescuer_thread             [kcryptd_io]
S<    2771 00:00:00 rescuer_thread             [kcryptd]
S<    2847 00:00:00 rescuer_thread             [kdmflush]
S<    2850 00:00:00 rescuer_thread             [kcryptd_io]
S<    2851 00:00:00 rescuer_thread             [kcryptd]
S<    2933 00:00:00 rescuer_thread             [kdmflush]
S<    2936 00:00:00 rescuer_thread             [kcryptd_io]
S<    2937 00:00:00 rescuer_thread             [kcryptd]
S<    3013 00:00:00 rescuer_thread             [kdmflush]
S<    3016 00:00:00 rescuer_thread             [kcryptd_io]
S<    3017 00:00:00 rescuer_thread             [kcryptd]
S<    3093 00:00:00 rescuer_thread             [kdmflush]
S<    3096 00:00:00 rescuer_thread             [kcryptd_io]
S<    3097 00:00:00 rescuer_thread             [kcryptd]
S<    3179 00:00:00 rescuer_thread             [kdmflush]
S<    3182 00:00:00 rescuer_thread             [kcryptd_io]
S<    3183 00:00:00 rescuer_thread             [kcryptd]
S<    3259 00:00:00 rescuer_thread             [kdmflush]
S<    3262 00:00:00 rescuer_thread             [kcryptd_io]
S<    3263 00:00:00 rescuer_thread             [kcryptd]
S<    3339 00:00:00 rescuer_thread             [kdmflush]
S<    3341 00:00:00 rescuer_thread             [kcryptd_io]
S<    3343 00:00:00 rescuer_thread             [kcryptd]
S<    3425 00:00:00 rescuer_thread             [kdmflush]
S<    3428 00:00:00 rescuer_thread             [kcryptd_io]
S<    3429 00:00:00 rescuer_thread             [kcryptd]
S<    3511 00:00:00 rescuer_thread             [kdmflush]
S<    3514 00:00:00 rescuer_thread             [kcryptd_io]
S<    3515 00:00:00 rescuer_thread             [kcryptd]
S<    3591 00:00:00 rescuer_thread             [kdmflush]
S<    3594 00:00:00 rescuer_thread             [kcryptd_io]
S<    3595 00:00:00 rescuer_thread             [kcryptd]
S<    3671 00:00:00 rescuer_thread             [kdmflush]
S<    3673 00:00:00 rescuer_thread             [kcryptd_io]
S<    3674 00:00:00 rescuer_thread             [kcryptd]
S<    3757 00:00:00 rescuer_thread             [kdmflush]
S<    3760 00:00:00 rescuer_thread             [kcryptd_io]
S<    3761 00:00:00 rescuer_thread             [kcryptd]
S<    3837 00:00:00 rescuer_thread             [kdmflush]
S<    3840 00:00:00 rescuer_thread             [kcryptd_io]
S<    3841 00:00:00 rescuer_thread             [kcryptd]
S<    3917 00:00:00 rescuer_thread             [kdmflush]
S<    3920 00:00:00 rescuer_thread             [kcryptd_io]
S<    3921 00:00:00 rescuer_thread             [kcryptd]
S<    4003 00:00:00 rescuer_thread             [kdmflush]
S<    4006 00:00:00 rescuer_thread             [kcryptd_io]
S<    4007 00:00:00 rescuer_thread             [kcryptd]
S<    4089 00:00:00 rescuer_thread             [kdmflush]
S<    4092 00:00:00 rescuer_thread             [kcryptd_io]
S<    4093 00:00:00 rescuer_thread             [kcryptd]
S<    4169 00:00:00 rescuer_thread             [kdmflush]
S<    4172 00:00:00 rescuer_thread             [kcryptd_io]
S<    4173 00:00:00 rescuer_thread             [kcryptd]
S<    4255 00:00:00 rescuer_thread             [kdmflush]
S<    4258 00:00:00 rescuer_thread             [kcryptd_io]
S<    4259 00:00:00 rescuer_thread             [kcryptd]
S<    4335 00:00:00 rescuer_thread             [kdmflush]
S<    4338 00:00:00 rescuer_thread             [kcryptd_io]
S<    4339 00:00:00 rescuer_thread             [kcryptd]
S<    4421 00:00:00 rescuer_thread             [kdmflush]
S<    4424 00:00:00 rescuer_thread             [kcryptd_io]
S<    4425 00:00:00 rescuer_thread             [kcryptd]
S<    4507 00:00:00 rescuer_thread             [kdmflush]
S<    4510 00:00:00 rescuer_thread             [kcryptd_io]
S<    4511 00:00:00 rescuer_thread             [kcryptd]
S<    4587 00:00:00 rescuer_thread             [kdmflush]
S<    4590 00:00:00 rescuer_thread             [kcryptd_io]
S<    4591 00:00:00 rescuer_thread             [kcryptd]
S<    4673 00:00:00 rescuer_thread             [kdmflush]
S<    4676 00:00:00 rescuer_thread             [kcryptd_io]
S<    4677 00:00:00 rescuer_thread             [kcryptd]
S<    4759 00:00:00 rescuer_thread             [kdmflush]
S<    4762 00:00:00 rescuer_thread             [kcryptd_io]
S<    4763 00:00:00 rescuer_thread             [kcryptd]
S<    4839 00:00:00 rescuer_thread             [kdmflush]
S<    4842 00:00:00 rescuer_thread             [kcryptd_io]
S<    4843 00:00:00 rescuer_thread             [kcryptd]
Ss    5727 00:00:00 poll_schedule_timeout      /sbin/rpcbind -w
Ss    5744 00:00:00 poll_schedule_timeout      /sbin/rpc.statd
Sl    5883 00:00:00 poll_schedule_timeout      /usr/sbin/rsyslogd -c5
Ss    5969 00:00:00 hrtimer_nanosleep          /usr/sbin/atd
Ss    5993 00:00:00 poll_schedule_timeout      /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 107:113
Ss    6001 00:00:00 hrtimer_nanosleep          /usr/sbin/cron
Ss    6005 00:00:00 poll_schedule_timeout      /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
Ss    6030 00:00:00 poll_schedule_timeout      /usr/bin/dbus-daemon --system
S<    6038 00:00:00 transport_processing_threa [LIO_rd_mcp]
Ss    6057 00:00:00 poll_schedule_timeout      /usr/sbin/sshd
Rs    6115 00:00:00 ep_poll                    /usr/lib/postfix/master
Z     6121 00:00:00 exit                       [pickup] <defunct>
S     6122 00:00:00 ep_poll                    qmgr -l -t fifo -u
Ss    6250 00:00:00 poll_schedule_timeout      /usr/sbin/munin-node
Ss+   6265 00:00:00 n_tty_read                 /sbin/getty 38400 tty1
Ss+   6266 00:00:00 n_tty_read                 /sbin/getty 38400 tty2
Ss+   6267 00:00:00 n_tty_read                 /sbin/getty 38400 tty3
Ss+   6268 00:00:00 n_tty_read                 /sbin/getty 38400 tty4
Ss+   6269 00:00:00 n_tty_read                 /sbin/getty 38400 tty5
Ss+   6270 00:00:00 n_tty_read                 /sbin/getty 38400 tty6
Rs    6745 00:00:00 poll_schedule_timeout      sshd: root@pts/0
Ss    6749 00:00:00 wait                       -bash
S+    6755 00:00:00 hrtimer_nanosleep          zpool iostat 60
RN    6877 00:00:00 -                          [z_null_iss/0]
SN    6878 00:00:00 taskq_thread               [z_null_int/0]
SN    6879 00:00:00 taskq_thread               [z_rd_iss/0]
SN    6880 00:00:00 taskq_thread               [z_rd_iss/1]
SN    6881 00:00:00 taskq_thread               [z_rd_iss/2]
SN    6882 00:00:00 taskq_thread               [z_rd_iss/3]
SN    6883 00:00:00 taskq_thread               [z_rd_iss/4]
SN    6884 00:00:00 taskq_thread               [z_rd_iss/5]
SN    6885 00:00:00 taskq_thread               [z_rd_iss/6]
SN    6886 00:00:00 taskq_thread               [z_rd_iss/7]
SN    6887 00:00:00 taskq_thread               [z_rd_int/0]
SN    6888 00:00:00 taskq_thread               [z_rd_int/1]
SN    6889 00:00:00 taskq_thread               [z_rd_int/2]
SN    6890 00:00:00 taskq_thread               [z_rd_int/3]
SN    6891 00:00:00 taskq_thread               [z_rd_int/4]
SN    6892 00:00:00 taskq_thread               [z_rd_int/5]
SN    6893 00:00:00 taskq_thread               [z_rd_int/6]
SN    6894 00:00:00 taskq_thread               [z_rd_int/7]
SN    6895 00:00:00 taskq_thread               [z_rd_int/8]
SN    6896 00:00:00 taskq_thread               [z_rd_int/9]
SN    6897 00:00:00 taskq_thread               [z_rd_int/10]
SN    6898 00:00:00 taskq_thread               [z_rd_int/11]
SN    6899 00:00:00 taskq_thread               [z_rd_int/12]
SN    6900 00:00:00 taskq_thread               [z_rd_int/13]
SN    6901 00:00:00 taskq_thread               [z_rd_int/14]
SN    6902 00:00:00 taskq_thread               [z_rd_int/15]
SN    6903 00:00:00 taskq_thread               [z_wr_iss/0]
SN    6904 00:00:00 taskq_thread               [z_wr_iss/1]
SN    6905 00:00:00 taskq_thread               [z_wr_iss/2]
SN    6906 00:00:00 taskq_thread               [z_wr_iss/3]
SN    6907 00:00:00 taskq_thread               [z_wr_iss/4]
SN    6908 00:00:00 taskq_thread               [z_wr_iss/5]
SN    6909 00:00:00 taskq_thread               [z_wr_iss/6]
SN    6910 00:00:00 taskq_thread               [z_wr_iss/7]
SN    6911 00:00:00 taskq_thread               [z_wr_iss/8]
SN    6912 00:00:00 taskq_thread               [z_wr_iss/9]
SN    6913 00:00:00 taskq_thread               [z_wr_iss/10]
SN    6914 00:00:00 taskq_thread               [z_wr_iss/11]
SN    6915 00:00:00 taskq_thread               [z_wr_iss/12]
SN    6916 00:00:00 taskq_thread               [z_wr_iss/13]
SN    6917 00:00:00 taskq_thread               [z_wr_iss/14]
SN    6918 00:00:00 taskq_thread               [z_wr_iss/15]
SN    6919 00:00:00 taskq_thread               [z_wr_iss_h/0]
SN    6920 00:00:00 taskq_thread               [z_wr_iss_h/1]
SN    6921 00:00:00 taskq_thread               [z_wr_iss_h/2]
SN    6922 00:00:00 taskq_thread               [z_wr_iss_h/3]
SN    6923 00:00:00 taskq_thread               [z_wr_iss_h/4]
SN    6924 00:00:00 taskq_thread               [z_wr_int/0]
SN    6925 00:00:00 taskq_thread               [z_wr_int/1]
SN    6926 00:00:00 taskq_thread               [z_wr_int/2]
SN    6927 00:00:00 taskq_thread               [z_wr_int/3]
SN    6928 00:00:00 taskq_thread               [z_wr_int/4]
SN    6929 00:00:00 taskq_thread               [z_wr_int/5]
SN    6930 00:00:00 taskq_thread               [z_wr_int/6]
SN    6931 00:00:00 taskq_thread               [z_wr_int/7]
SN    6932 00:00:00 taskq_thread               [z_wr_int/8]
SN    6933 00:00:00 taskq_thread               [z_wr_int/9]
SN    6934 00:00:00 taskq_thread               [z_wr_int/10]
SN    6935 00:00:00 taskq_thread               [z_wr_int/11]
SN    6936 00:00:00 taskq_thread               [z_wr_int/12]
SN    6937 00:00:00 taskq_thread               [z_wr_int/13]
SN    6938 00:00:00 taskq_thread               [z_wr_int/14]
SN    6939 00:00:00 taskq_thread               [z_wr_int/15]
SN    6940 00:00:00 taskq_thread               [z_wr_int_h/0]
SN    6941 00:00:00 taskq_thread               [z_wr_int_h/1]
SN    6942 00:00:00 taskq_thread               [z_wr_int_h/2]
SN    6943 00:00:00 taskq_thread               [z_wr_int_h/3]
SN    6944 00:00:00 taskq_thread               [z_wr_int_h/4]
SN    6945 00:00:00 taskq_thread               [z_fr_iss/0]
SN    6946 00:00:00 taskq_thread               [z_fr_iss/1]
SN    6947 00:00:00 taskq_thread               [z_fr_iss/2]
SN    6948 00:00:00 taskq_thread               [z_fr_iss/3]
SN    6949 00:00:00 taskq_thread               [z_fr_iss/4]
SN    6950 00:00:00 taskq_thread               [z_fr_iss/5]
SN    6951 00:00:00 taskq_thread               [z_fr_iss/6]
SN    6952 00:00:00 taskq_thread               [z_fr_iss/7]
SN    6953 00:00:00 taskq_thread               [z_fr_iss/8]
SN    6954 00:00:00 taskq_thread               [z_fr_iss/9]
SN    6955 00:00:00 taskq_thread               [z_fr_iss/10]
SN    6956 00:00:00 taskq_thread               [z_fr_iss/11]
SN    6957 00:00:00 taskq_thread               [z_fr_iss/12]
SN    6958 00:00:00 taskq_thread               [z_fr_iss/13]
SN    6959 00:00:00 taskq_thread               [z_fr_iss/14]
SN    6960 00:00:00 taskq_thread               [z_fr_iss/15]
SN    6961 00:00:00 taskq_thread               [z_fr_int/0]
SN    6962 00:00:00 taskq_thread               [z_cl_iss/0]
SN    6963 00:00:00 taskq_thread               [z_cl_int/0]
SN    6964 00:00:00 taskq_thread               [z_ioctl_iss/0]
SN    6965 00:00:00 taskq_thread               [z_ioctl_int/0]
S<    6996 00:00:00 taskq_thread               [zfs_iput_taskq/]
S<    6997 00:00:00 cv_wait_common             [txg_quiesce]
D<    6998 00:00:00 cv_wait_common             [txg_sync]
Ss    6999 00:00:00 poll_schedule_timeout      sshd: root@pts/1
Ss    7003 00:00:00 wait                       -bash
S<    7015 00:00:00 taskq_thread               [zil_clean/0]
S<    7021 00:00:00 taskq_thread               [zil_clean/0]
S<    7027 00:00:00 taskq_thread               [zil_clean/0]
S<    7033 00:00:00 taskq_thread               [zil_clean/0]
Ss    7035 00:00:00 poll_schedule_timeout      sshd: root@pts/2
Ss+   7039 00:00:00 n_tty_read                 -bash
Ss    7045 00:00:00 poll_schedule_timeout      sshd: root@pts/3
Ss+   7049 00:00:00 n_tty_read                 -bash
Ss    7055 00:00:00 poll_schedule_timeout      sshd: root@pts/4
Ss+   7059 00:00:00 n_tty_read                 -bash
Ss    7065 00:00:00 poll_schedule_timeout      sshd: root@pts/5
Ss    7069 00:00:00 wait                       -bash
Ss    7075 00:00:00 poll_schedule_timeout      sshd: root@pts/6
Ss+   7079 00:00:00 n_tty_read                 -bash
Ss    7085 00:00:00 poll_schedule_timeout      sshd: root@pts/7
Ss+   7089 00:00:00 n_tty_read                 -bash
Ss    7095 00:00:00 poll_schedule_timeout      sshd: root@pts/8
Ss    7099 00:00:00 wait                       -bash
D+    7105 00:02:34 -                          getfattr -dh /pool/rsnapshot/colossus/storage/rsnapshot/.sync/colossus/etc/ssl/certs/Entrust.ne
D+    7125 00:00:00 sleep_on_page              tail -f /var/log/kern.log
D     7126 00:00:00 rpc_wait_bit_killable      /USR/SBIN/CRON
D     7127 00:00:00 sleep_on_page              /usr/sbin/munin-node [10.200.63.5]
Ds    7128 00:00:00 sleep_on_page              /usr/sbin/munin-node [10.200.63.5]
R+    7131 00:00:00 -                          ps -ostat,pid,time,wchan=WCHAN-xxxxxxxxxxxxxxxxxxxx,cmd ax

Reply to this email directly or view it on GitHub:
#557 (comment)

chrisrd · 2012-02-12T03:00:13Z

OK, I got a stall on linux-3.2.5 (was previously on 3.1.10), openzfs/spl@3c6ed54, b4b599d, and this time it gave me a trace (below). Once again it's related xattrs, but this time on a newly created pool and zfs, with xattr=dir i.e. it had never run with xattr=sa (although sa_lookup shows up in the trace??). This was running without zfs_dedup_prefetch=0 (ref: #513 (comment)).

[ 1720.321723] INFO: rcu_sched detected stall on CPU 0 (t=15000 jiffies)
[ 1720.321769] Pid: 10223, comm: getfattr Tainted: P           O 3.2.5-otn-00001-g67de2fa #1
[ 1720.321814] Call Trace:
[ 1720.321834]  <IRQ>  [<ffffffff810a008c>] __rcu_pending+0x84/0x34e
[ 1720.321870]  [<ffffffff810a044c>] rcu_check_callbacks+0xf6/0x15a
[ 1720.321900]  [<ffffffff8105967b>] update_process_times+0x3c/0x72
[ 1720.321931]  [<ffffffff810754ae>] tick_sched_timer+0x6d/0x96
[ 1720.321959]  [<ffffffff8106abeb>] __run_hrtimer+0xda/0x165
[ 1720.321986]  [<ffffffff81075441>] ? tick_nohz_handler+0xd6/0xd6
[ 1720.322014]  [<ffffffff8106aea0>] hrtimer_interrupt+0xd5/0x19f
[ 1720.322044]  [<ffffffff81027da7>] smp_apic_timer_interrupt+0x81/0x94
[ 1720.322074]  [<ffffffff812df1de>] apic_timer_interrupt+0x6e/0x80
[ 1720.322101]  <EOI>  [<ffffffffa040035d>] ? zap_get_leaf_byblk+0x1b2/0x243 [zfs]
[ 1720.322215]  [<ffffffffa040227d>] ? zap_leaf_lookup_closest+0xd5/0x114 [zfs]
[ 1720.322272]  [<ffffffffa0400949>] fzap_cursor_retrieve+0xae/0x239 [zfs]
[ 1720.322329]  [<ffffffffa0405852>] zap_cursor_retrieve+0xf3/0x247 [zfs]
[ 1720.322383]  [<ffffffffa03e486c>] ? sa_lookup_impl+0xe/0x10 [zfs]
[ 1720.322437]  [<ffffffffa03e48f6>] ? sa_lookup+0x41/0x4b [zfs]
[ 1720.322492]  [<ffffffffa041ebf6>] zfs_readdir+0x1ec/0x39a [zfs]
[ 1720.322546]  [<ffffffffa0407336>] ? zfs_zaccess_aces_check+0x1c1/0x1ef [zfs]
[ 1720.322602]  [<ffffffffa042f9bf>] ? zpl_xattr_trusted_get+0x7d/0x7d [zfs]
[ 1720.322658]  [<ffffffffa04074a8>] ? zfs_zaccess_common+0x144/0x157 [zfs]
[ 1720.322714]  [<ffffffffa0407704>] ? zfs_zaccess+0x136/0x1d5 [zfs]
[ 1720.322748]  [<ffffffffa030cddf>] ? tsd_hash_search+0x69/0x9e [spl]
[ 1720.322782]  [<ffffffffa030ce8a>] ? tsd_exit+0x3c/0x182 [spl]
[ 1720.322834]  [<ffffffffa03e256a>] ? rrw_exit+0xf4/0xf9 [zfs]
[ 1720.322888]  [<ffffffffa042fc0f>] zpl_xattr_list+0x196/0x243 [zfs]
[ 1720.322918]  [<ffffffff8111ee4d>] vfs_listxattr+0x1c/0x1e
[ 1720.322945]  [<ffffffff8111f081>] listxattr+0x5c/0xb4
[ 1720.322971]  [<ffffffff8111f190>] sys_llistxattr+0x3f/0x5a
[ 1720.322999]  [<ffffffff812de712>] system_call_fastpath+0x16/0x1b

thulle · 2012-02-16T14:37:59Z

I can contribute with a trace from ubuntu 11.10, kernel 3.0.0, running 0.6.0.48-0ubuntu1~oneiric1 from ppa.
Getting it mostly during long rsyncs.

[958419.164001] INFO: rcu_sched_state detected stall on CPU 1 (t=15000 jiffies)
[958560.780112] INFO: task z_fr_iss/0:3713 blocked for more than 120 seconds.
[958560.780230] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958560.780352] z_fr_iss/0      D 0000000000000000     0  3713      2 0x00000000
[958560.780363]  ffff8801a6b07d00 0000000000000046 ffffffff81606cce ffff8801a6b07d80
[958560.780373]  ffff8801a6b07fd8 ffff8801a6b07fd8 ffff8801a6b07fd8 0000000000012a40
[958560.780382]  ffff8802028fdc80 ffff8801d9bb4560 ffff8801a6b07fd8 ffff8801817c7288
[958560.780390] Call Trace:
[958560.780407]  [] ? common_interrupt+0xe/0x13
[958560.780416]  [] schedule+0x3f/0x60
[958560.780424]  [] __mutex_lock_slowpath+0xd7/0x150
[958560.780433]  [] mutex_lock+0x22/0x40
[958560.780531]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958560.780602]  [] zio_execute+0x9f/0xf0 [zfs]
[958560.780626]  [] taskq_thread+0x1b1/0x430 [spl]
[958560.780635]  [] ? try_to_wake_up+0x200/0x200
[958560.780654]  [] ? task_alloc+0x160/0x160 [spl]
[958560.780663]  [] kthread+0x8c/0xa0
[958560.780671]  [] kernel_thread_helper+0x4/0x10
[958560.780679]  [] ? flush_kthread_worker+0xa0/0xa0
[958560.780687]  [] ? gs_change+0x13/0x13
[958560.780693] INFO: task z_fr_iss/2:3715 blocked for more than 120 seconds.
[958560.780799] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958560.780920] z_fr_iss/2      D 0000000000000002     0  3715      2 0x00000000
[958560.780929]  ffff8801a1a87d00 0000000000000046 ffffffff8160f54e ffff8801a1a87d80
[958560.780938]  ffff8801a1a87fd8 ffff8801a1a87fd8 ffff8801a1a87fd8 0000000000012a40
[958560.780947]  ffff8801fc720000 ffff8801dc4c8000 ffff8801817c7288 ffff8801817c7288
[958560.780955] Call Trace:
[958560.780964]  [] ? apic_timer_interrupt+0xe/0x20
[958560.780973]  [] schedule+0x3f/0x60
[958560.780981]  [] __mutex_lock_slowpath+0xd7/0x150
[958560.780990]  [] mutex_lock+0x22/0x40
[958560.781056]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958560.781123]  [] zio_execute+0x9f/0xf0 [zfs]
[958560.781143]  [] taskq_thread+0x1b1/0x430 [spl]
[958560.781151]  [] ? try_to_wake_up+0x200/0x200
[958560.781170]  [] ? task_alloc+0x160/0x160 [spl]
[958560.781178]  [] kthread+0x8c/0xa0
[958560.781185]  [] kernel_thread_helper+0x4/0x10
[958560.781194]  [] ? flush_kthread_worker+0xa0/0xa0
[958560.781201]  [] ? gs_change+0x13/0x13
[958560.781208] INFO: task txg_sync:3730 blocked for more than 120 seconds.
[958560.784608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958560.787970] txg_sync        D 0000000000000002     0  3730      2 0x00000000
[958560.787979]  ffff880197fdfaf0 0000000000000046 ffffffff8160f54e ffff880197fdfb70
[958560.787988]  ffff880197fdffd8 ffff880197fdffd8 ffff880197fdffd8 0000000000012a40
[958560.788015]  ffff8801dc4c8000 ffff8801c059c560 0000000000000000 ffff8801817c7288
[958560.788024] Call Trace:
[958560.788033]  [] ? apic_timer_interrupt+0xe/0x20
[958560.788042]  [] schedule+0x3f/0x60
[958560.788050]  [] __mutex_lock_slowpath+0xd7/0x150
[958560.788059]  [] ? __kmalloc+0x31/0x160
[958560.788067]  [] mutex_lock+0x22/0x40
[958560.788137]  [] zio_add_child+0x61/0x120 [zfs]
[958560.788210]  [] zio_create+0x426/0x520 [zfs]
[958560.788283]  [] zio_free_sync+0x76/0x80 [zfs]
[958560.788357]  [] spa_free_sync_cb+0x43/0x60 [zfs]
[958560.788433]  [] ? bpobj_enqueue_cb+0x20/0x20 [zfs]
[958560.788491]  [] bplist_iterate+0x7a/0xb0 [zfs]
[958560.788566]  [] spa_sync+0x3c3/0xa00 [zfs]
[958560.788579]  [] ? default_wake_function+0x12/0x20
[958560.788594]  [] ? autoremove_wake_function+0x16/0x40
[958560.788608]  [] ? __wake_up+0x53/0x70
[958560.788683]  [] txg_sync_thread+0x216/0x390 [zfs]
[958560.788760]  [] ? txg_init+0x260/0x260 [zfs]
[958560.788835]  [] ? txg_init+0x260/0x260 [zfs]
[958560.788862]  [] thread_generic_wrapper+0x78/0x90 [spl]
[958560.788886]  [] ? __thread_create+0x160/0x160 [spl]
[958560.788899]  [] kthread+0x8c/0xa0
[958560.788913]  [] kernel_thread_helper+0x4/0x10
[958560.788927]  [] ? flush_kthread_worker+0xa0/0xa0
[958560.788940]  [] ? gs_change+0x13/0x13
[958560.788953] INFO: task rsync:19346 blocked for more than 120 seconds.
[958560.792342] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958560.795766] rsync           D ffffffff81805120     0 19346  19345 0x00000000
[958560.795776]  ffff88018f711b88 0000000000000082 0000000000000001 ffff8801f00302e0
[958560.795785]  ffff88018f711fd8 ffff88018f711fd8 ffff88018f711fd8 0000000000012a40
[958560.795794]  ffff880202935c80 ffff8801febc0000 ffff88018f711b98 ffff8801f0030330
[958560.795802] Call Trace:
[958560.795810]  [] schedule+0x3f/0x60
[958560.795833]  [] cv_wait_common+0x77/0xd0 [spl]
[958560.795842]  [] ? add_wait_queue+0x60/0x60
[958560.795862]  [] __cv_wait+0x13/0x20 [spl]
[958560.795934]  [] txg_wait_open+0x73/0xa0 [zfs]
[958560.795993]  [] dmu_tx_wait+0xed/0xf0 [zfs]
[958560.796076]  [] zfs_write+0x377/0xc50 [zfs]
[958560.796093]  [] ? perf_event_task_sched_out+0x2e/0xa0
[958560.796164]  [] zpl_write_common+0x52/0x80 [zfs]
[958560.796236]  [] zpl_write+0x68/0xa0 [zfs]
[958560.796251]  [] vfs_write+0xb3/0x180
[958560.796265]  [] sys_write+0x4a/0x90
[958560.796279]  [] system_call_fastpath+0x16/0x1b
[958599.283997] INFO: rcu_sched_state detected stall on CPU 1 (t=60030 jiffies)
[958680.796112] INFO: task z_fr_iss/0:3713 blocked for more than 120 seconds.
[958680.799596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958680.803078] z_fr_iss/0      D 0000000000000000     0  3713      2 0x00000000
[958680.803090]  ffff8801a6b07d00 0000000000000046 ffffffff81606cce ffff8801a6b07d80
[958680.803100]  ffff8801a6b07fd8 ffff8801a6b07fd8 ffff8801a6b07fd8 0000000000012a40
[958680.803109]  ffff8802028fdc80 ffff8801d9bb4560 ffff8801a6b07fd8 ffff8801817c7288
[958680.803119] Call Trace:
[958680.803136]  [] ? common_interrupt+0xe/0x13
[958680.803151]  [] schedule+0x3f/0x60
[958680.803165]  [] __mutex_lock_slowpath+0xd7/0x150
[958680.803180]  [] mutex_lock+0x22/0x40
[958680.803286]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958680.803363]  [] zio_execute+0x9f/0xf0 [zfs]
[958680.803394]  [] taskq_thread+0x1b1/0x430 [spl]
[958680.803408]  [] ? try_to_wake_up+0x200/0x200
[958680.803432]  [] ? task_alloc+0x160/0x160 [spl]
[958680.803447]  [] kthread+0x8c/0xa0
[958680.803460]  [] kernel_thread_helper+0x4/0x10
[958680.803473]  [] ? flush_kthread_worker+0xa0/0xa0
[958680.803487]  [] ? gs_change+0x13/0x13
[958680.803497] INFO: task z_fr_iss/2:3715 blocked for more than 120 seconds.
[958680.806982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958680.810472] z_fr_iss/2      D 0000000000000002     0  3715      2 0x00000000
[958680.810482]  ffff8801a1a87d00 0000000000000046 ffffffff8160f54e ffff8801a1a87d80
[958680.810492]  ffff8801a1a87fd8 ffff8801a1a87fd8 ffff8801a1a87fd8 0000000000012a40
[958680.810501]  ffff8801fc720000 ffff8801dc4c8000 ffff8801817c7288 ffff8801817c7288
[958680.810510] Call Trace:
[958680.810519]  [] ? apic_timer_interrupt+0xe/0x20
[958680.810528]  [] schedule+0x3f/0x60
[958680.810543]  [] __mutex_lock_slowpath+0xd7/0x150
[958680.810557]  [] mutex_lock+0x22/0x40
[958680.810632]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958680.810705]  [] zio_execute+0x9f/0xf0 [zfs]
[958680.810731]  [] taskq_thread+0x1b1/0x430 [spl]
[958680.810744]  [] ? try_to_wake_up+0x200/0x200
[958680.810768]  [] ? task_alloc+0x160/0x160 [spl]
[958680.810781]  [] kthread+0x8c/0xa0
[958680.810794]  [] kernel_thread_helper+0x4/0x10
[958680.810808]  [] ? flush_kthread_worker+0xa0/0xa0
[958680.810821]  [] ? gs_change+0x13/0x13
[958680.810832] INFO: task txg_sync:3730 blocked for more than 120 seconds.
[958680.814340] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958680.817912] txg_sync        D 0000000000000002     0  3730      2 0x00000000
[958680.817922]  ffff880197fdfaf0 0000000000000046 ffffffff8160f54e ffff880197fdfb70
[958680.817931]  ffff880197fdffd8 ffff880197fdffd8 ffff880197fdffd8 0000000000012a40
[958680.817940]  ffff8801dc4c8000 ffff8801c059c560 0000000000000000 ffff8801817c7288
[958680.817948] Call Trace:
[958680.817957]  [] ? apic_timer_interrupt+0xe/0x20
[958680.817966]  [] schedule+0x3f/0x60
[958680.817979]  [] __mutex_lock_slowpath+0xd7/0x150
[958680.817994]  [] ? __kmalloc+0x31/0x160
[958680.818007]  [] mutex_lock+0x22/0x40
[958680.818082]  [] zio_add_child+0x61/0x120 [zfs]
[958680.818155]  [] zio_create+0x426/0x520 [zfs]
[958680.818227]  [] zio_free_sync+0x76/0x80 [zfs]
[958680.818302]  [] spa_free_sync_cb+0x43/0x60 [zfs]
[958680.818377]  [] ? bpobj_enqueue_cb+0x20/0x20 [zfs]
[958680.818434]  [] bplist_iterate+0x7a/0xb0 [zfs]
[958680.818508]  [] spa_sync+0x3c3/0xa00 [zfs]
[958680.818522]  [] ? default_wake_function+0x12/0x20
[958680.818536]  [] ? autoremove_wake_function+0x16/0x40
[958680.818549]  [] ? __wake_up+0x53/0x70
[958680.818624]  [] txg_sync_thread+0x216/0x390 [zfs]
[958680.818700]  [] ? txg_init+0x260/0x260 [zfs]
[958680.818775]  [] ? txg_init+0x260/0x260 [zfs]
[958680.818801]  [] thread_generic_wrapper+0x78/0x90 [spl]
[958680.818825]  [] ? __thread_create+0x160/0x160 [spl]
[958680.818839]  [] kthread+0x8c/0xa0
[958680.818852]  [] kernel_thread_helper+0x4/0x10
[958680.818865]  [] ? flush_kthread_worker+0xa0/0xa0
[958680.818878]  [] ? gs_change+0x13/0x13
[958680.818890] INFO: task rsync:19346 blocked for more than 120 seconds.
[958680.822496] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958680.826156] rsync           D ffffffff81805120     0 19346  19345 0x00000000
[958680.826166]  ffff88018f711b88 0000000000000082 0000000000000001 ffff8801f00302e0
[958680.826174]  ffff88018f711fd8 ffff88018f711fd8 ffff88018f711fd8 0000000000012a40
[958680.826183]  ffff880202935c80 ffff8801febc0000 ffff88018f711b98 ffff8801f0030330
[958680.826191] Call Trace:
[958680.826200]  [] schedule+0x3f/0x60
[958680.826223]  [] cv_wait_common+0x77/0xd0 [spl]
[958680.826237]  [] ? add_wait_queue+0x60/0x60
[958680.826263]  [] __cv_wait+0x13/0x20 [spl]
[958680.826339]  [] txg_wait_open+0x73/0xa0 [zfs]
[958680.826403]  [] dmu_tx_wait+0xed/0xf0 [zfs]
[958680.826477]  [] zfs_write+0x377/0xc50 [zfs]
[958680.826493]  [] ? perf_event_task_sched_out+0x2e/0xa0
[958680.826564]  [] zpl_write_common+0x52/0x80 [zfs]
[958680.826635]  [] zpl_write+0x68/0xa0 [zfs]
[958680.826650]  [] vfs_write+0xb3/0x180
[958680.826664]  [] sys_write+0x4a/0x90
[958680.826678]  [] system_call_fastpath+0x16/0x1b
[958779.403997] INFO: rcu_sched_state detected stall on CPU 1 (t=105060 jiffies)
[958800.824113] INFO: task z_fr_iss/0:3713 blocked for more than 120 seconds.
[958800.827877] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958800.831644] z_fr_iss/0      D 0000000000000000     0  3713      2 0x00000000
[958800.831655]  ffff8801a6b07d00 0000000000000046 ffffffff81606cce ffff8801a6b07d80
[958800.831665]  ffff8801a6b07fd8 ffff8801a6b07fd8 ffff8801a6b07fd8 0000000000012a40
[958800.831675]  ffff8802028fdc80 ffff8801d9bb4560 ffff8801a6b07fd8 ffff8801817c7288
[958800.831684] Call Trace:
[958800.831701]  [] ? common_interrupt+0xe/0x13
[958800.831716]  [] schedule+0x3f/0x60
[958800.831730]  [] __mutex_lock_slowpath+0xd7/0x150
[958800.831744]  [] mutex_lock+0x22/0x40
[958800.831848]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958800.831925]  [] zio_execute+0x9f/0xf0 [zfs]
[958800.831955]  [] taskq_thread+0x1b1/0x430 [spl]
[958800.831969]  [] ? try_to_wake_up+0x200/0x200
[958800.831994]  [] ? task_alloc+0x160/0x160 [spl]
[958800.832012]  [] kthread+0x8c/0xa0
[958800.832021]  [] kernel_thread_helper+0x4/0x10
[958800.832035]  [] ? flush_kthread_worker+0xa0/0xa0
[958800.832048]  [] ? gs_change+0x13/0x13
[958800.832059] INFO: task z_fr_iss/2:3715 blocked for more than 120 seconds.
[958800.835841] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[958800.839694] z_fr_iss/2      D 0000000000000002     0  3715      2 0x00000000
[958800.839704]  ffff8801a1a87d00 0000000000000046 ffffffff8160f54e ffff8801a1a87d80
[958800.839714]  ffff8801a1a87fd8 ffff8801a1a87fd8 ffff8801a1a87fd8 0000000000012a40
[958800.839723]  ffff8801fc720000 ffff8801dc4c8000 ffff8801817c7288 ffff8801817c7288
[958800.839732] Call Trace:
[958800.839742]  [] ? apic_timer_interrupt+0xe/0x20
[958800.839751]  [] schedule+0x3f/0x60
[958800.839765]  [] __mutex_lock_slowpath+0xd7/0x150
[958800.839778]  [] mutex_lock+0x22/0x40
[958800.839852]  [] zio_ready+0x1e0/0x3b0 [zfs]
[958800.839925]  [] zio_execute+0x9f/0xf0 [zfs]
[958800.839951]  [] taskq_thread+0x1b1/0x430 [spl]
[958800.839963]  [] ? try_to_wake_up+0x200/0x200
[958800.839987]  [] ? task_alloc+0x160/0x160 [spl]
[958800.839999]  [] kthread+0x8c/0xa0
[958800.840015]  [] kernel_thread_helper+0x4/0x10
[958800.840024]  [] ? flush_kthread_worker+0xa0/0xa0
[958800.840037]  [] ? gs_change+0x13/0x13
[958959.523996] INFO: rcu_sched_state detected stall on CPU 1 (t=150090 jiffies)
[959139.643996] INFO: rcu_sched_state detected stall on CPU 1 (t=195120 jiffies)
[959319.763996] INFO: rcu_sched_state detected stall on CPU 1 (t=240150 jiffies)

myers · 2012-02-23T22:17:37Z

In this thread it was suggested to up the /proc/sys/vm/min_free_kbytes

echo 135168 > /proc/sys/vm/min_free_kbytes

behlendorf · 2012-03-02T20:23:57Z

Thanks for the additional debugging on this. Many of the stacks suggest this is related to ZFS's heavy usage of vmalloc(), so the suggestion of increasing min_free_kbytes to take some pressure off the VM may be helpful.

Related to this here's a link to the kernel docs for this warning.

http://www.kernel.org/doc/Documentation/RCU/stallwarn.txt

behlendorf · 2012-03-15T17:16:50Z

For those suffering from this problem I'd appreciate it you could give the following patch a try. It's a significant step towards integrating more tightly with Linux's memory reclaim mechanisms and shedding the Solaris VM baggage. Thus far it has performed well for me but I'd like to see it get more use with a wide variety of workloads before I consider merging it. I believe it may help with the memory issues described here.

behlendorf/zfs@2a349bd

chrisrd · 2012-03-15T22:46:38Z

Excuse my git / github ignorance, but how to I get to commit behlendorf/zfs@062b89c via a local clone of https://github.com/behlendorf/zfs.git?

I found an almost-identical commit behlendorf/zfs@2a349bd in the vm branch and can see it in a local clone, e.g. (starting from scratch for clarity):

git clone https://github.com/behlendorf/zfs.git zfs-behlendorf
cd zfs-behlendorf
git show 2a349bd

The commit behlendorf/zfs@062b89c definitely exists in github because clicking on the link takes you to it, but when I try to see it in a local clone it behaves as if that commit doesn't exist:

git clone https://github.com/behlendorf/zfs.git zfs-behlendorf
cd zfs-behlendorf
git show 062b89c
fatal: ambiguous argument '062b89c': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions

behlendorf · 2012-03-15T22:59:22Z

Sorry, my fault. I meant to update the link, I force updated the branch so the commit id changed. Just go ahead and use the https://github.com/behlendorf/zfs/tree/vm branch or cherry pick the behlendorf/zfs@2a349bd commit.

Using a new repo:

git clone https://github.com/behlendorf/zfs.git zfs-behlendorf
cd zfs-behlendorf
git checkout -b vm origin/vm

or adding my repo as a remote and cherry-picking.

git remote add behlendorf https://github.com/behlendorf/zfs.git
git fetch behlendorf
git cherry-pick 2a349bd9380b18efa71c504baa3b1103c48e7205

chrisrd · 2012-03-16T03:45:13Z

OK, got it, will start testing and report back...

chrisrd · 2012-03-19T07:03:27Z

Just got these soft lockups below, with spl@a3a69b7, zfs@42cb381 + behlendorf/zfs@2a349bd, linux-3.3.0-rc7

[168154.343036] BUG: soft lockup - CPU#0 stuck for 23s! [zfs_iput_taskq/:17600]
[168154.343069] Modules linked in: ext4 mbcache jbd2 dm_snapshot sha256_generic aesni_intel cryptd aes_x86_64 aes_generic cbc raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx sd_mod zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) zlib_deflate fuse nfsd exportfs dm_crypt dm_mod md_mod nfs lockd auth_rpcgss nfs_acl sunrpc bridge stp llc usbhid hid uhci_hcd psmouse mpt2sas scsi_transport_sas raid_class scsi_mod i2c_i801 i2c_core ehci_hcd ioatdma igb dca button processor thermal_sys
[168154.343387] CPU 0 
[168154.343393] Modules linked in: ext4 mbcache jbd2 dm_snapshot sha256_generic aesni_intel cryptd aes_x86_64 aes_generic cbc raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx sd_mod zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) zlib_deflate fuse nfsd exportfs dm_crypt dm_mod md_mod nfs lockd auth_rpcgss nfs_acl sunrpc bridge stp llc usbhid hid uhci_hcd psmouse mpt2sas scsi_transport_sas raid_class scsi_mod i2c_i801 i2c_core ehci_hcd ioatdma igb dca button processor thermal_sys
[168154.343715] 
[168154.343735] Pid: 17600, comm: zfs_iput_taskq/ Tainted: P           O 3.3.0-rc7-otn-00119-g02056bc #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
[168154.343794] RIP: 0010:[<ffffffff813ca144>]  [<ffffffff813ca144>] mutex_unlock+0x14/0x20
[168154.343847] RSP: 0018:ffff8806007bf850  EFLAGS: 00000202
[168154.343873] RAX: ffff880ba4cd62e0 RBX: ffff8804f8bc9378 RCX: ffffffffa0481740
[168154.343916] RDX: 0000000000000000 RSI: ffffffffa0481cb0 RDI: ffff880ba4cd6338
[168154.343959] RBP: ffff8806007bf850 R08: ffffffffa0481740 R09: dead000000100100
[168154.344002] R10: ffff8806007bf940 R11: 3337623639343832 R12: ffff8806007bf810
[168154.344046] R13: ffffffffa03c2299 R14: ffff8806007bf800 R15: ffff88062015e800
[168154.344089] FS:  0000000000000000(0000) GS:ffff880627c00000(0000) knlGS:0000000000000000
[168154.344134] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[168154.344161] CR2: 0000000000dc9cf8 CR3: 0000000001605000 CR4: 00000000000006f0
[168154.344204] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[168154.344248] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[168154.344291] Process zfs_iput_taskq/ (pid: 17600, threadinfo ffff8806007be000, task ffff880600d242f0)
[168154.344337] Stack:
[168154.344357]  ffff8806007bf8a0 ffffffffa03c4821 ffffffffa045ede2 ffff880770717838
[168154.344411]  ffff8806007bf8f0 ffff880c20dbd378 0000000000000001 00000000000284a2
[168154.344463]  0000000000000001 0000000000000000 ffff8806007bf8f0 ffffffffa03c4a86
[168154.344516] Call Trace:
[168154.344561]  [<ffffffffa03c4821>] __dbuf_hold_impl+0x2a1/0x480 [zfs]
[168154.344606]  [<ffffffffa03c4a86>] dbuf_hold_impl+0x86/0xc0 [zfs]
[168154.344647]  [<ffffffffa03c5e03>] dbuf_prefetch+0xf3/0x2b0 [zfs]
[168154.344693]  [<ffffffffa03d8eb1>] dmu_zfetch_dofetch+0xf1/0x160 [zfs]
[168154.344738]  [<ffffffffa03d99f6>] dmu_zfetch+0x796/0xd80 [zfs]
[168154.344779]  [<ffffffffa03c4501>] dbuf_read+0x691/0x710 [zfs]
[168154.344809]  [<ffffffff81038d59>] ? default_spin_lock_flags+0x9/0x10
[168154.344856]  [<ffffffffa03dce7e>] dnode_hold_impl+0x10e/0x550 [zfs]
[168154.344886]  [<ffffffff81159add>] ? iput+0x16d/0x230
[168154.344936]  [<ffffffffa0426eb7>] ? zap_cursor_retrieve+0x117/0x2a0 [zfs]
[168154.344983]  [<ffffffffa03dd2d9>] dnode_hold+0x19/0x20 [zfs]
[168154.345025]  [<ffffffffa03cb168>] dmu_object_info+0x28/0x60 [zfs]
[168154.345076]  [<ffffffffa042cf7e>] zfs_unlinked_drain+0x7e/0x110 [zfs]
[168154.345107]  [<ffffffff811326c6>] ? __slab_free+0x1e6/0x350
[168154.345136]  [<ffffffff81080458>] ? update_curr+0x168/0x170
[168154.345163]  [<ffffffff81080960>] ? dequeue_task_fair+0x170/0x1e0
[168154.345198]  [<ffffffffa030f47b>] ? kmem_free_debug+0x4b/0x150 [spl]
[168154.345227]  [<ffffffff81132ab0>] ? kfree+0xf0/0x120
[168154.345258]  [<ffffffffa03122b7>] taskq_thread+0x237/0x5e0 [spl]
[168154.345288]  [<ffffffff8107b480>] ? try_to_wake_up+0x2d0/0x2d0
[168154.345321]  [<ffffffffa0312080>] ? task_alloc+0x380/0x380 [spl]
[168154.345350]  [<ffffffff8106b35e>] kthread+0x9e/0xb0
[168154.345378]  [<ffffffff813d5c64>] kernel_thread_helper+0x4/0x10
[168154.345407]  [<ffffffff8106b2c0>] ? kthread_freezable_should_stop+0x70/0x70
[168154.345436]  [<ffffffff813d5c60>] ? gs_change+0x13/0x13
[168154.345462] Code: c5 00 00 4c 8b 64 24 08 48 89 43 18 48 8b 1c 24 c9 c3 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 c7 47 18 00 00 00 00 f0 ff 07 <7f> 05 e8 05 00 00 00 c9 c3 0f 1f 00 55 48 89 e5 48 83 ec 10 48 
[168154.345679] Call Trace:
[168154.345716]  [<ffffffffa03c4821>] __dbuf_hold_impl+0x2a1/0x480 [zfs]
[168154.345760]  [<ffffffffa03c4a86>] dbuf_hold_impl+0x86/0xc0 [zfs]
[168154.345801]  [<ffffffffa03c5e03>] dbuf_prefetch+0xf3/0x2b0 [zfs]
[168154.345845]  [<ffffffffa03d8eb1>] dmu_zfetch_dofetch+0xf1/0x160 [zfs]
[168154.345891]  [<ffffffffa03d99f6>] dmu_zfetch+0x796/0xd80 [zfs]
[168154.345931]  [<ffffffffa03c4501>] dbuf_read+0x691/0x710 [zfs]
[168154.345961]  [<ffffffff81038d59>] ? default_spin_lock_flags+0x9/0x10
[168154.346006]  [<ffffffffa03dce7e>] dnode_hold_impl+0x10e/0x550 [zfs]
[168154.346035]  [<ffffffff81159add>] ? iput+0x16d/0x230
[168154.346084]  [<ffffffffa0426eb7>] ? zap_cursor_retrieve+0x117/0x2a0 [zfs]
[168154.346131]  [<ffffffffa03dd2d9>] dnode_hold+0x19/0x20 [zfs]
[168154.346173]  [<ffffffffa03cb168>] dmu_object_info+0x28/0x60 [zfs]
[168154.346224]  [<ffffffffa042cf7e>] zfs_unlinked_drain+0x7e/0x110 [zfs]
[168154.346253]  [<ffffffff811326c6>] ? __slab_free+0x1e6/0x350
[168154.346281]  [<ffffffff81080458>] ? update_curr+0x168/0x170
[168154.346308]  [<ffffffff81080960>] ? dequeue_task_fair+0x170/0x1e0
[168154.346341]  [<ffffffffa030f47b>] ? kmem_free_debug+0x4b/0x150 [spl]
[168154.346370]  [<ffffffff81132ab0>] ? kfree+0xf0/0x120
[168154.346400]  [<ffffffffa03122b7>] taskq_thread+0x237/0x5e0 [spl]
[168154.346429]  [<ffffffff8107b480>] ? try_to_wake_up+0x2d0/0x2d0
[168154.346462]  [<ffffffffa0312080>] ? task_alloc+0x380/0x380 [spl]
[168154.346490]  [<ffffffff8106b35e>] kthread+0x9e/0xb0
[168154.346516]  [<ffffffff813d5c64>] kernel_thread_helper+0x4/0x10
[168154.346545]  [<ffffffff8106b2c0>] ? kthread_freezable_should_stop+0x70/0x70
[168154.346575]  [<ffffffff813d5c60>] ? gs_change+0x13/0x13

[168182.300847] BUG: soft lockup - CPU#0 stuck for 22s! [zfs_iput_taskq/:17600]
[168182.301984] Modules linked in: ext4 mbcache jbd2 dm_snapshot sha256_generic aesni_intel cryptd aes_x86_64 aes_generic cbc raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx sd_mod zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) zlib_deflate fuse nfsd exportfs dm_crypt dm_mod md_mod nfs lockd auth_rpcgss nfs_acl sunrpc bridge stp llc usbhid hid uhci_hcd psmouse mpt2sas scsi_transport_sas raid_class scsi_mod i2c_i801 i2c_core ehci_hcd ioatdma igb dca button processor thermal_sys
[168182.302300] CPU 0 
[168182.302306] Modules linked in: ext4 mbcache jbd2 dm_snapshot sha256_generic aesni_intel cryptd aes_x86_64 aes_generic cbc raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx sd_mod zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) zlib_deflate fuse nfsd exportfs dm_crypt dm_mod md_mod nfs lockd auth_rpcgss nfs_acl sunrpc bridge stp llc usbhid hid uhci_hcd psmouse mpt2sas scsi_transport_sas raid_class scsi_mod i2c_i801 i2c_core ehci_hcd ioatdma igb dca button processor thermal_sys
[168182.302627] 
[168182.302647] Pid: 17600, comm: zfs_iput_taskq/ Tainted: P           O 3.3.0-rc7-otn-00119-g02056bc #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
[168182.302706] RIP: 0010:[<ffffffff813ca144>]  [<ffffffff813ca144>] mutex_unlock+0x14/0x20
[168182.302758] RSP: 0018:ffff8806007bf8f0  EFLAGS: 00000202
[168182.302785] RAX: ffff880647a13e70 RBX: ffff8806007bf8f0 RCX: 0000000000000142
[168182.302828] RDX: ffffffffa0487050 RSI: 0000000000000000 RDI: ffff880647a13ec8
[168182.302871] RBP: ffff8806007bf8f0 R08: 00d8fb97fba364c4 R09: ffff8806007bf9f0
[168182.302914] R10: 0000000000000001 R11: 6432373438343832 R12: 0000000000000001
[168182.302957] R13: 00000000014243e2 R14: 0000000000000003 R15: ffff880c20dbd378
[168182.303001] FS:  0000000000000000(0000) GS:ffff880627c00000(0000) knlGS:0000000000000000
[168182.303046] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[168182.303073] CR2: 0000000000dc9cf8 CR3: 0000000001605000 CR4: 00000000000006f0
[168182.303116] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[168182.303159] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[168182.303203] Process zfs_iput_taskq/ (pid: 17600, threadinfo ffff8806007be000, task ffff880600d242f0)
[168182.303249] Stack:
[168182.303269]  ffff8806007bf970 ffffffffa03c5d7f ffff880c00000006 ffff880c00000000
[168182.303322]  0000000001420eb4 0000000000000001 0000000000000000 0000000000000000
[168182.303375]  0000000000000000 0000000000000000 ffff880647a13e70 0000000000000000
[168182.303427] Call Trace:
[168182.303474]  [<ffffffffa03c5d7f>] dbuf_prefetch+0x6f/0x2b0 [zfs]
[168182.303520]  [<ffffffffa03d8eb1>] dmu_zfetch_dofetch+0xf1/0x160 [zfs]
[168182.303565]  [<ffffffffa03d99f6>] dmu_zfetch+0x796/0xd80 [zfs]
[168182.303606]  [<ffffffffa03c4501>] dbuf_read+0x691/0x710 [zfs]
[168182.303637]  [<ffffffff81038d59>] ? default_spin_lock_flags+0x9/0x10
[168182.303683]  [<ffffffffa03dce7e>] dnode_hold_impl+0x10e/0x550 [zfs]
[168182.303714]  [<ffffffff81159add>] ? iput+0x16d/0x230
[168182.303765]  [<ffffffffa0426eb7>] ? zap_cursor_retrieve+0x117/0x2a0 [zfs]
[168182.303812]  [<ffffffffa03dd2d9>] dnode_hold+0x19/0x20 [zfs]
[168182.303854]  [<ffffffffa03cb168>] dmu_object_info+0x28/0x60 [zfs]
[168182.303905]  [<ffffffffa042cf7e>] zfs_unlinked_drain+0x7e/0x110 [zfs]
[168182.303935]  [<ffffffff811326c6>] ? __slab_free+0x1e6/0x350
[168182.303965]  [<ffffffff81080458>] ? update_curr+0x168/0x170
[168182.303993]  [<ffffffff81080960>] ? dequeue_task_fair+0x170/0x1e0
[168182.304027]  [<ffffffffa030f47b>] ? kmem_free_debug+0x4b/0x150 [spl]
[168182.304056]  [<ffffffff81132ab0>] ? kfree+0xf0/0x120
[168182.304087]  [<ffffffffa03122b7>] taskq_thread+0x237/0x5e0 [spl]
[168182.304118]  [<ffffffff8107b480>] ? try_to_wake_up+0x2d0/0x2d0
[168182.304150]  [<ffffffffa0312080>] ? task_alloc+0x380/0x380 [spl]
[168182.304180]  [<ffffffff8106b35e>] kthread+0x9e/0xb0
[168182.304207]  [<ffffffff813d5c64>] kernel_thread_helper+0x4/0x10
[168182.304236]  [<ffffffff8106b2c0>] ? kthread_freezable_should_stop+0x70/0x70
[168182.304266]  [<ffffffff813d5c60>] ? gs_change+0x13/0x13
[168182.304291] Code: c5 00 00 4c 8b 64 24 08 48 89 43 18 48 8b 1c 24 c9 c3 0f 1f 44 00 00 55 48 89 e5 66 66 66 66 90 48 c7 47 18 00 00 00 00 f0 ff 07 <7f> 05 e8 05 00 00 00 c9 c3 0f 1f 00 55 48 89 e5 48 83 ec 10 48 
[168182.304508] Call Trace:
[168182.304543]  [<ffffffffa03c5d7f>] dbuf_prefetch+0x6f/0x2b0 [zfs]
[168182.304587]  [<ffffffffa03d8eb1>] dmu_zfetch_dofetch+0xf1/0x160 [zfs]
[168182.304633]  [<ffffffffa03d99f6>] dmu_zfetch+0x796/0xd80 [zfs]
[168182.304674]  [<ffffffffa03c4501>] dbuf_read+0x691/0x710 [zfs]
[168182.304702]  [<ffffffff81038d59>] ? default_spin_lock_flags+0x9/0x10
[168182.304748]  [<ffffffffa03dce7e>] dnode_hold_impl+0x10e/0x550 [zfs]
[168182.304777]  [<ffffffff81159add>] ? iput+0x16d/0x230
[168182.304826]  [<ffffffffa0426eb7>] ? zap_cursor_retrieve+0x117/0x2a0 [zfs]
[168182.304873]  [<ffffffffa03dd2d9>] dnode_hold+0x19/0x20 [zfs]
[168182.304916]  [<ffffffffa03cb168>] dmu_object_info+0x28/0x60 [zfs]
[168182.304966]  [<ffffffffa042cf7e>] zfs_unlinked_drain+0x7e/0x110 [zfs]
[168182.304996]  [<ffffffff811326c6>] ? __slab_free+0x1e6/0x350
[168182.305023]  [<ffffffff81080458>] ? update_curr+0x168/0x170
[168182.305051]  [<ffffffff81080960>] ? dequeue_task_fair+0x170/0x1e0
[168182.305083]  [<ffffffffa030f47b>] ? kmem_free_debug+0x4b/0x150 [spl]
[168182.305112]  [<ffffffff81132ab0>] ? kfree+0xf0/0x120
[168182.305143]  [<ffffffffa03122b7>] taskq_thread+0x237/0x5e0 [spl]
[168182.305172]  [<ffffffff8107b480>] ? try_to_wake_up+0x2d0/0x2d0
[168182.305204]  [<ffffffffa0312080>] ? task_alloc+0x380/0x380 [spl]
[168182.305233]  [<ffffffff8106b35e>] kthread+0x9e/0xb0
[168182.305259]  [<ffffffff813d5c64>] kernel_thread_helper+0x4/0x10
[168182.305288]  [<ffffffff8106b2c0>] ? kthread_freezable_should_stop+0x70/0x70
[168182.305318]  [<ffffffff813d5c60>] ? gs_change+0x13/0x13

behlendorf · 2012-03-19T18:27:29Z

The soft lockups are due to issue #457.

chrisrd · 2012-03-29T04:07:37Z

FYI, I've been running with behlendorf/zfs@2a349bd for the last 2 weeks with moderate to severe load and haven't experienced any problems apart from the #457 soft lockups.

The load has been mostly parallel rsyncs and tar-to-tar copies into ZFS (i.e. data writes with lots of metadata including dir-based xattr read/write), with some of the data pulled from separate machines on the same local network and some from ext4 on md/lvm on the same machine, and occasionally running zpool scrub and mdadm/checkarray (separately or both together) at the same time.

I.e. for my load, pull request #618 is looking good!

...hmmm, except that #618 has changed the reclaimable pages calculation in arc_evictable_memory() compared to the version I'm running.

I'll switch to that code base and continue testing, and report over in #618.

behlendorf · 2012-03-29T16:29:23Z

Great, please keep me posted and update to the latest patch. We found a few issues in the initial version which is why we refreshed it.

vanbcguy · 2012-06-13T12:56:28Z

Has this patch been rolled in to the nightly builds yet? I'm experiencing this problem right now...

behlendorf · 2012-06-13T15:59:05Z

Yes, these patches have been merged in to master

vanbcguy · 2012-06-13T17:04:28Z

Any idea when? I was running a nightly from probably Thursday last week (0.60.63 from looking at my apt logs), though I just updated to this morning's version. There had been several days of very heavy I/O before the problem was triggered.

behlendorf · 2012-06-13T17:37:23Z

These changes have been in the ubuntu daily release since 0.6.0.62.

vanbcguy · 2012-06-14T15:27:07Z

OK - after reviewing my logs I still had a 0.6.0.58 release of SPL running at the time even though my ZFS module was up to 0.6.0.63. Everything is at 0.6.0.64 now with no repeats thus far, will update if anything reoccurs.

behlendorf · 2012-08-07T19:10:05Z

Since I haven't heard anything for a few months now I'm going to assume that everythings OK with the updated code so I'm closing the issue. We can reopen it if that's not the case, or just go ahead a file a new bug with the latest symptoms.

This splat_vprint is using tq_arg->name after tq_arg is freed. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#557

chrisrd mentioned this issue Mar 15, 2012

listxattr EFAULT #503

Closed

behlendorf closed this as completed Aug 7, 2012

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023

Allow allocation from removing device when space is low (openzfs#557)

128c851

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CPU stall / hung - only able to clear via hard power cycle #557

CPU stall / hung - only able to clear via hard power cycle #557

machtech commented Feb 3, 2012

chrisrd commented Feb 10, 2012

machtech commented Feb 10, 2012

chrisrd commented Feb 11, 2012

chrisrd commented Feb 11, 2012

machtech commented Feb 11, 2012

chrisrd commented Feb 12, 2012

thulle commented Feb 16, 2012

myers commented Feb 23, 2012

behlendorf commented Mar 2, 2012

behlendorf commented Mar 15, 2012

chrisrd commented Mar 15, 2012

behlendorf commented Mar 15, 2012

chrisrd commented Mar 16, 2012

chrisrd commented Mar 19, 2012

behlendorf commented Mar 19, 2012

chrisrd commented Mar 29, 2012

behlendorf commented Mar 29, 2012

vanbcguy commented Jun 13, 2012

behlendorf commented Jun 13, 2012

vanbcguy commented Jun 13, 2012

behlendorf commented Jun 13, 2012

vanbcguy commented Jun 14, 2012

behlendorf commented Aug 7, 2012

CPU stall / hung - only able to clear via hard power cycle #557

CPU stall / hung - only able to clear via hard power cycle #557

Comments

machtech commented Feb 3, 2012

chrisrd commented Feb 10, 2012

machtech commented Feb 10, 2012

chrisrd commented Feb 11, 2012

chrisrd commented Feb 11, 2012

machtech commented Feb 11, 2012

chrisrd commented Feb 12, 2012

thulle commented Feb 16, 2012

myers commented Feb 23, 2012

behlendorf commented Mar 2, 2012

behlendorf commented Mar 15, 2012

chrisrd commented Mar 15, 2012

behlendorf commented Mar 15, 2012

chrisrd commented Mar 16, 2012

chrisrd commented Mar 19, 2012

behlendorf commented Mar 19, 2012

chrisrd commented Mar 29, 2012

behlendorf commented Mar 29, 2012

vanbcguy commented Jun 13, 2012

behlendorf commented Jun 13, 2012

vanbcguy commented Jun 13, 2012

behlendorf commented Jun 13, 2012

vanbcguy commented Jun 14, 2012

behlendorf commented Aug 7, 2012