Fedora 20 CPU soft lockup under load #2679

Closed
greg-hydrogen opened this issue Sep 8, 2014 · 15 comments

@greg-hydrogen

Hello Everyone,

I experienced a lockup during multiple VM suspends from an ESX 5.5 host. I am currently running kernel 3.16.0 with a ZFS git build from 16 August that includes the Linux AIO patch (it wasn't mainlined at the time). I was able to capture the soft lockup as detailed below:

[500514.662210] ISO 9660 Extensions: RRIP_1991A
[865464.526832] BUG: soft lockup - CPU#3 stuck for 22s! [z_wr_iss/0:2332]
[865464.526921] Modules linked in: nls_utf8 isofs loop bridge stp llc btrfs kvm_amd kvm nfsd ppdev crct10dif_pclmul crc32_pclmul auth_rpcgss crc32c_intel ghash_clmulni_intel xor r8169 raid6_pq nfs_acl tg3 lockd microcode edac_core serio_raw ptp sp5100_tco pcspkr mii fam15h_power pps_core k10temp edac_mce_amd i2c_piix4 shpchp parport_pc joydev parport asus_atk0110 sunrpc wmi acpi_cpufreq radeon ata_generic pata_acpi i2c_algo_bit drm_kms_helper ttm drm pata_atiixp i2c_core zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) [last unloaded: iptable_raw]
[865464.526954] CPU: 3 PID: 2332 Comm: z_wr_iss/0 Tainted: P OE 3.16.0 #1
[865464.526956] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865464.526958] task: ffff8800c3fcaee0 ti: ffff8800c3fe0000 task.ti: ffff8800c3fe0000
[865464.526959] RIP: 0010:[] [] _raw_spin_lock+0x1e/0x30
[865464.526964] RSP: 0018:ffff8800c3fe3d48 EFLAGS: 00000202
[865464.526966] RAX: 0000000000000002 RBX: ffffffffa0105559 RCX: ffff88013eec4c20
[865464.526967] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013eec4f04
[865464.526968] RBP: ffff8800c3fe3d48 R08: 0000000000000000 R09: 0000000000000001
[865464.526970] R10: 0000000000000004 R11: 0000000000000001 R12: ffff88030ad61b80
[865464.526971] R13: ffff8803fc872000 R14: ffff880037a5cb40 R15: 0000000100000000
[865464.526973] FS: 00007fa366cca840(0000) GS:ffff88041ecc0000(0000) knlGS:0000000000000000
[865464.526974] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865464.526976] CR2: 00007fa366cf7000 CR3: 0000000001c11000 CR4: 00000000000407e0
[865464.526977] Stack:
[865464.526978] ffff8800c3fe3d60 ffffffff8170740c ffff880154d63b10 ffff8800c3fe3d70
[865464.526981] ffffffff8170744b ffff8800c3fe3dc8 ffffffffa0154f7c ffff8800c3fcaee0
[865464.526983] 0000000000000000 ffff8800c3fcaee0 ffff88013eec4f00 0000000000010000
[865464.526985] Call Trace:
[865464.526990] [] __mutex_unlock_slowpath+0x1c/0x40
[865464.526993] [] mutex_unlock+0x1b/0x20
[865464.527031] [] zio_ready+0x1bc/0x400 [zfs]
[865464.527059] [] zio_execute+0xa6/0x140 [zfs]
[865464.527070] [] taskq_thread+0x22d/0x4b0 [spl]
[865464.527074] [] ? wake_up_state+0x20/0x20
[865464.527082] [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
[865464.527085] [] kthread+0xe1/0x100
[865464.527087] [] ? insert_kthread_work+0x40/0x40
[865464.527090] [] ret_from_fork+0x7c/0xb0
[865464.527092] [] ? insert_kthread_work+0x40/0x40
[865464.527094] Code: b6 17 38 d0 75 f7 eb f0 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 b8 00 01 00 00 f0 66 0f c1 07 0f b6 d4 38 c2 75 04 5d c3 f3 90 <0f> b6 07 38 d0 75 f7 5d c3 66 0f 1f 84 00 00 00 00 00 66 66 66
[865492.534998] BUG: soft lockup - CPU#3 stuck for 22s! [z_wr_iss/0:2332]
[865492.535086] Modules linked in: nls_utf8 isofs loop bridge stp llc btrfs kvm_amd kvm nfsd ppdev crct10dif_pclmul crc32_pclmul auth_rpcgss crc32c_intel ghash_clmulni_intel xor r8169 raid6_pq nfs_acl tg3 lockd microcode edac_core serio_raw ptp sp5100_tco pcspkr mii fam15h_power pps_core k10temp edac_mce_amd i2c_piix4 shpchp parport_pc joydev parport asus_atk0110 sunrpc wmi acpi_cpufreq radeon ata_generic pata_acpi i2c_algo_bit drm_kms_helper ttm drm pata_atiixp i2c_core zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) [last unloaded: iptable_raw]
[865492.535116] CPU: 3 PID: 2332 Comm: z_wr_iss/0 Tainted: P OE 3.16.0 #1
[865492.535117] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865492.535119] task: ffff8800c3fcaee0 ti: ffff8800c3fe0000 task.ti: ffff8800c3fe0000
[865492.535121] RIP: 0010:[] [] _raw_spin_lock+0x1e/0x30
[865492.535125] RSP: 0018:ffff8800c3fe3d48 EFLAGS: 00000202
[865492.535127] RAX: 0000000000000002 RBX: ffffffffa0105559 RCX: ffff88013eec4c20
[865492.535128] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013eec4f04
[865492.535129] RBP: ffff8800c3fe3d48 R08: 0000000000000000 R09: 0000000000000001
[865492.535130] R10: 0000000000000004 R11: 0000000000000001 R12: ffff88030ad61b80
[865492.535132] R13: ffff8803fc872000 R14: ffff880037a5cb40 R15: 0000000100000000
[865492.535133] FS: 00007fa366cca840(0000) GS:ffff88041ecc0000(0000) knlGS:0000000000000000
[865492.535135] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865492.535136] CR2: 00007fa366cf7000 CR3: 0000000001c11000 CR4: 00000000000407e0
[865492.535137] Stack:
[865492.535138] ffff8800c3fe3d60 ffffffff8170740c ffff880154d63b10 ffff8800c3fe3d70
[865492.535140] ffffffff8170744b ffff8800c3fe3dc8 ffffffffa0154f7c ffff8800c3fcaee0
[865492.535142] 0000000000000000 ffff8800c3fcaee0 ffff88013eec4f00 0000000000010000
[865492.535145] Call Trace:
[865492.535149] [] __mutex_unlock_slowpath+0x1c/0x40
[865492.535152] [] mutex_unlock+0x1b/0x20
[865492.535191] [] zio_ready+0x1bc/0x400 [zfs]
[865492.535216] [] zio_execute+0xa6/0x140 [zfs]
[865492.535226] [] taskq_thread+0x22d/0x4b0 [spl]
[865492.535231] [] ? wake_up_state+0x20/0x20
[865492.535237] [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
[865492.535240] [] kthread+0xe1/0x100
[865492.535243] [] ? insert_kthread_work+0x40/0x40
[865492.535245] [] ret_from_fork+0x7c/0xb0
[865492.535247] [] ? insert_kthread_work+0x40/0x40
[865492.535248] Code: b6 17 38 d0 75 f7 eb f0 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 b8 00 01 00 00 f0 66 0f c1 07 0f b6 d4 38 c2 75 04 5d c3 f3 90 <0f> b6 07 38 d0 75 f7 5d c3 66 0f 1f 84 00 00 00 00 00 66 66 66
[865498.595764] INFO: rcu_sched self-detected stall on CPU { 3} (t=60000 jiffies g=7250499 c=7250498 q=12970)
[865498.595896] sending NMI to all CPUs:
[865498.595901] NMI backtrace for cpu 3
[865498.595903] CPU: 3 PID: 2332 Comm: z_wr_iss/0 Tainted: P OE 3.16.0 #1
[865498.595905] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865498.595908] task: ffff8800c3fcaee0 ti: ffff8800c3fe0000 task.ti: ffff8800c3fe0000
[865498.595910] RIP: 0010:[] [] __const_udelay+0x12/0x30
[865498.595914] RSP: 0018:ffff88041ecc3de8 EFLAGS: 00000006
[865498.595916] RAX: 0000000001062560 RBX: 0000000000002710 RCX: 0000000000000006
[865498.595917] RDX: 0000000000329399 RSI: 0000000000000002 RDI: 0000000000418958
[865498.595919] RBP: ffff88041ecc3e00 R08: 000000000000000a R09: 0000000000000415
[865498.595920] R10: 0000000000000000 R11: ffff88041ecc3b06 R12: ffffffff81c51b40
[865498.595922] R13: ffffffff81d03f38 R14: ffffffff81c51b40 R15: 0000000000000003
[865498.595924] FS: 00007fa366cca840(0000) GS:ffff88041ecc0000(0000) knlGS:0000000000000000
[865498.595925] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865498.595927] CR2: 00007fa366cf7000 CR3: 0000000001c11000 CR4: 00000000000407e0
[865498.595928] Stack:
[865498.595929] ffff88041ecc3e00 ffffffff81049132 ffff88041ecce8c0 ffff88041ecc3e58
[865498.595932] ffffffff810f1803 ffff88041ecc3e58 ffffffff810c4331 000312efd075e43e
[865498.595934] 00000000000032aa ffff8800c3fcaee0 0000000000000000 0000000000000003
[865498.595937] Call Trace:
[865498.595939]

[865498.595943] [] ? arch_trigger_all_cpu_backtrace+0xa2/0xd0
[865498.595947] [] rcu_check_callbacks+0x403/0x670
[865498.595950] [] ? account_process_tick+0x121/0x1b0
[865498.595953] [] update_process_times+0x47/0x70
[865498.595957] [] tick_sched_handle.isra.16+0x25/0x60
[865498.595959] [] tick_sched_timer+0x41/0x60
[865498.595962] [] __run_hrtimer+0x73/0x1d0
[865498.595964] [] ? tick_sched_handle.isra.16+0x60/0x60
[865498.595967] [] hrtimer_interrupt+0x107/0x250
[865498.595971] [] local_apic_timer_interrupt+0x37/0x60
[865498.595974] [] smp_apic_timer_interrupt+0x3f/0x60
[865498.595976] [] apic_timer_interrupt+0x6d/0x80
[865498.595978]

[865498.596007] [] ? dva_get_dsize_sync+0x39/0x50 [zfs]
[865498.596010] [] ? _raw_spin_lock+0x1e/0x30
[865498.596013] [] __mutex_unlock_slowpath+0x1c/0x40
[865498.596016] [] mutex_unlock+0x1b/0x20
[865498.596044] [] zio_ready+0x1bc/0x400 [zfs]
[865498.596072] [] zio_execute+0xa6/0x140 [zfs]
[865498.596080] [] taskq_thread+0x22d/0x4b0 [spl]
[865498.596084] [] ? wake_up_state+0x20/0x20
[865498.596091] [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
[865498.596094] [] kthread+0xe1/0x100
[865498.596097] [] ? insert_kthread_work+0x40/0x40
[865498.596099] [] ret_from_fork+0x7c/0xb0
[865498.596102] [] ? insert_kthread_work+0x40/0x40
[865498.596103] Code: 89 e5 ff 15 49 55 93 00 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8d 04 bd 00 00 00 00 65 48 8b 14 25 a0 3c 01 00 <48> 89 e5 48 69 d2 fa 00 00 00 f7 e2 48 8d 7a 01 ff 15 10 55 93
[865498.596129] NMI backtrace for cpu 2
[865498.596132] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P OE 3.16.0 #1
[865498.596134] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865498.596136] task: ffff8804088ae720 ti: ffff880408914000 task.ti: ffff880408914000
[865498.596138] RIP: 0010:[] [] acpi_idle_do_entry+0x21/0x2b
[865498.596144] RSP: 0018:ffff880408917e20 EFLAGS: 00000093
[865498.596146] RAX: 0003132a985ab400 RBX: ffff8804084094a8 RCX: ffff88041ec91918
[865498.596147] RDX: 0000000000000815 RSI: 0000000000000002 RDI: ffff8804084094a8
[865498.596149] RBP: ffff880408917e20 R08: 00000000ffffffff R09: 0000000000001ccf
[865498.596151] R10: 0000000000001bbf R11: 0000000000000008 R12: ffff880408409400
[865498.596152] R13: 0000000000000002 R14: ffff880407d85200 R15: ffffffff81ca4028
[865498.596154] FS: 00007fa366cca840(0000) GS:ffff88041ec80000(0000) knlGS:0000000000000000
[865498.596156] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865498.596158] CR2: 00007fa366cf7000 CR3: 00000000c5323000 CR4: 00000000000407e0
[865498.596159] Stack:
[865498.596161] ffff880408917e58 ffffffff81406bdf ffffffff81ca3f60 ffff880407d85200
[865498.596164] ffffffff81ca3f60 000312efd085551c 0000000000000002 ffff880408917e90
[865498.596167] ffffffff81599c80 ffffffff81d03f30 ffff880407d85200 0000000000000002
[865498.596170] Call Trace:
[865498.596174] [] acpi_idle_enter_simple+0xa9/0xd1
[865498.596178] [] cpuidle_enter_state+0x40/0xc0
[865498.596181] [] cpuidle_enter+0x17/0x20
[865498.596184] [] cpu_startup_entry+0x32d/0x460
[865498.596187] [] ? tick_check_new_device+0xda/0xf0
[865498.596192] [] start_secondary+0x220/0x300
[865498.596194] Code: ff fa 66 66 90 66 66 90 5d c3 8a 47 08 55 48 89 e5 3c 01 75 07 e8 c4 cc c3 ff eb 17 3c 02 75 07 e8 ba ff ff ff eb 0c 8b 57 04 ec <48> 8b 15 90 2e bc 00 ed 5d c3 66 66 66 66 90 55 48 63 c2 48 8d
[865498.596230] NMI backtrace for cpu 5
[865498.596240] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P OE 3.16.0 #1
[865498.596246] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865498.596251] task: ffff880408920960 ti: ffff880408928000 task.ti: ffff880408928000
[865498.596256] RIP: 0010:[] [] native_safe_halt+0x6/0x10
[865498.596270] RSP: 0018:ffff88040892be00 EFLAGS: 00000246
[865498.596274] RAX: 0000000000000000 RBX: ffff88040840a074 RCX: 0000000000010a78
[865498.596278] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88040840a074
[865498.596282] RBP: ffff88040892be00 R08: 00000000ffffffff R09: 0000000000002000
[865498.596286] R10: 0000000000001ea1 R11: 0000000000000008 R12: ffff88040840a000
[865498.596290] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff81ca3fd0
[865498.596295] FS: 00007ffd6dafd840(0000) GS:ffff88041ed40000(0000) knlGS:0000000000000000
[865498.596299] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865498.596302] CR2: 00007ffd6db02000 CR3: 00000000c126e000 CR4: 00000000000407e0
[865498.596306] Stack:
[865498.596309] ffff88040892be10 ffffffff81406185 ffff88040892be20 ffffffff81406a93
[865498.596316] ffff88040892be30 ffffffff81406ab7 ffff88040892be58 ffffffff81406b18
[865498.596323] ffff880407d85800 ffffffff81ca3f60 000312efd08577b9 ffff88040892be90
[865498.596330] Call Trace:
[865498.596342] [] arch_safe_halt+0x9/0xd
[865498.596350] [] acpi_safe_halt+0x22/0x2b
[865498.596357] [] acpi_idle_do_entry+0x1b/0x2b
[865498.596364] [] acpi_idle_enter_c1+0x51/0x6f
[865498.596372] [] cpuidle_enter_state+0x40/0xc0
[865498.596378] [] cpuidle_enter+0x17/0x20
[865498.596386] [] cpu_startup_entry+0x32d/0x460
[865498.596392] [] ? tick_check_new_device+0xda/0xf0
[865498.596402] [] start_secondary+0x220/0x300
[865498.596407] Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
[865498.596473] NMI backtrace for cpu 1
[865498.596483] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P OE 3.16.0 #1
[865498.596488] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865498.596494] task: ffff8804088addc0 ti: ffff880408910000 task.ti: ffff880408910000
[865498.596498] RIP: 0010:[] [] acpi_idle_do_entry+0x21/0x2b
[865498.596511] RSP: 0018:ffff880408913e20 EFLAGS: 00000093
[865498.596516] RAX: 0003132a985bd200 RBX: ffff8804084090a8 RCX: ffff88041ec51918
[865498.596519] RDX: 0000000000000815 RSI: 0000000000000002 RDI: ffff8804084090a8
[865498.596523] RBP: ffff880408913e20 R08: 00000000ffffffff R09: 0000000000001eb2
[865498.596527] R10: 0000000000001b76 R11: 0000000000000008 R12: ffff880408409000
[865498.596530] R13: 0000000000000002 R14: ffff880407d85000 R15: ffffffff81ca4028
[865498.596535] FS: 00007fa366cca840(0000) GS:ffff88041ec40000(0000) knlGS:0000000000000000
[865498.596539] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865498.596543] CR2: 00007fa366bd002c CR3: 0000000001c11000 CR4: 00000000000407e0
[865498.596547] Stack:
[865498.596550] ffff880408913e58 ffffffff81406bdf ffffffff81ca3f60 ffff880407d85000
[865498.596557] ffffffff81ca3f60 000312efd08671dc 0000000000000002 ffff880408913e90
[865498.596564] ffffffff81599c80 ffffffff81d03f30 ffff880407d85000 0000000000000002
[865498.596570] Call Trace:
[865498.596580] [] acpi_idle_enter_simple+0xa9/0xd1
[865498.596589] [] cpuidle_enter_state+0x40/0xc0
[865498.596595] [] cpuidle_enter+0x17/0x20
[865498.596603] [] cpu_startup_entry+0x32d/0x460
[865498.596610] [] ? tick_check_new_device+0xda/0xf0
[865498.596620] [] start_secondary+0x220/0x300
[865498.596625] Code: ff fa 66 66 90 66 66 90 5d c3 8a 47 08 55 48 89 e5 3c 01 75 07 e8 c4 cc c3 ff eb 17 3c 02 75 07 e8 ba ff ff ff eb 0c 8b 57 04 ec <48> 8b 15 90 2e bc 00 ed 5d c3 66 66 66 66 90 55 48 63 c2 48 8d
[865498.596690] NMI backtrace for cpu 0
[865498.596698] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P OE 3.16.0 #1
[865498.596701] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[865498.596705] task: ffffffff81c18460 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[865498.596708] RIP: 0010:[] [] acpi_idle_do_entry+0x21/0x2b
[865498.596716] RSP: 0018:ffffffff81c03e28 EFLAGS: 00000093
[865498.596719] RAX: 0003132a985bca00 RBX: ffff8804084088a8 RCX: ffff88041ec11918
[865498.596722] RDX: 0000000000000815 RSI: 0000000000000002 RDI: ffff8804084088a8
[865498.596725] RBP: ffffffff81c03e28 R08: 00000000ffffffff R09: 0000000000001eab
[865498.596729] R10: 0000000000001b27 R11: 0000000000000008 R12: ffff880408408800
[865498.596732] R13: 0000000000000002 R14: ffff880407d84e00 R15: ffffffff81ca4028
[865498.596736] FS: 00007f0940beb840(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
[865498.596739] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[865498.596742] CR2: 00007faf678e6718 CR3: 00000003d28bc000 CR4: 00000000000407f0
[865498.596745] Stack:
[865498.596747] ffffffff81c03e60 ffffffff81406bdf ffffffff81ca3f60 ffff880407d84e00
[865498.596752] ffffffff81ca3f60 000312efd08669bc 0000000000000002 ffffffff81c03e98
[865498.596764] ffffffff81599c80 ffffffff81d03f30 ffff880407d84e00 0000000000000002
[865498.596774] Call Trace:
[865498.596782] [] acpi_idle_enter_simple+0xa9/0xd1
[865498.596788] [] cpuidle_enter_state+0x40/0xc0
[865498.596793] [] cpuidle_enter+0x17/0x20
[865498.596798] [] cpu_startup_entry+0x32d/0x460
[865498.596807] [] rest_init+0x77/0x80
[865498.596815] [] start_kernel+0x465/0x486
[865498.596820] [] ? set_init_arg+0x53/0x53
[865498.596827] [] ? early_idt_handlers+0x120/0x120
[865498.596833] [] x86_64_start_reservations+0x2a/0x2c
[865498.596839] [] x86_64_start_kernel+0x14d/0x170
[865498.596842] Code: ff fa 66 66 90 66 66 90 5d c3 8a 47 08 55 48 89 e5 3c 01 75 07 e8 c4 cc c3 ff eb 17 3c 02 75 07 e8 ba ff ff ff eb 0c 8b 57 04 ec <48> 8b 15 90 2e bc 00 ed 5d c3 66 66 66 66 90 55 48 63 c2 48 8d

It seems to repeat more or less. If there is anything else I can provide that would help, please let me know.

@prakashsurya
Member

Although I haven't looked closely, this immediately reminds me of #2523.

@behlendorf
Contributor

Yes, this looks very much like a duplicate of #2523. @greg-hydrogen, have you by chance been able to consistently reproduce this? I've had a terrible time reproducing it so that we can get it resolved.

behlendorf added this to the 0.6.4 milestone Sep 8, 2014
behlendorf added the Bug label Sep 8, 2014
@greg-hydrogen
Author

@behlendorf - I haven't actually tried, but when I get back from vacation (Monday) I will try again. I did see something like this before, so we may be lucky in that I can reproduce it. Before I try, do you want me to add the debug patch?

@behlendorf
Contributor

@greg-hydrogen If you could add the debugging patches, that would be very helpful. That way we'll learn something if you're able to hit it again.

@greg-hydrogen
Author

OK, will do. I will let you know if anything turns up.

@greg-hydrogen
Author

I had another lockup as well, but I haven't had a chance to add the debug patch. I will do that today, but for completeness here is the stack output:

Sep 16 10:42:10 zfs-storage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Sep 16 10:42:10 zfs-storage kernel: IP: [] __mutex_unlock_slowpath+0x29/0x40
Sep 16 10:42:10 zfs-storage kernel: PGD 3d65c3067 PUD 3d6625067 PMD 0
Sep 16 10:42:10 zfs-storage kernel: Oops: 0000 [#1] SMP
Sep 16 10:42:10 zfs-storage kernel: Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable
Sep 16 10:42:10 zfs-storage kernel: ttm drm pata_atiixp i2c_core zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE)
Sep 16 10:42:10 zfs-storage kernel: CPU: 4 PID: 2366 Comm: z_fr_iss_2/0 Tainted: P OE 3.16.0 #1
Sep 16 10:42:10 zfs-storage kernel: Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
Sep 16 10:42:10 zfs-storage kernel: task: ffff880036a95460 ti: ffff8803e5064000 task.ti: ffff8803e5064000
Sep 16 10:42:10 zfs-storage kernel: RIP: 0010:[] [] __mutex_unlock_slowpath+0x29/0x40
Sep 16 10:42:10 zfs-storage kernel: RSP: 0018:ffff8803e5067d50 EFLAGS: 00010213
Sep 16 10:42:10 zfs-storage kernel: RAX: 0000000000000000 RBX: ffff880270a3c690 RCX: dead000000200200
Sep 16 10:42:10 zfs-storage kernel: RDX: ffff880270a3c698 RSI: 0000000000000246 RDI: ffff880270a3c694
Sep 16 10:42:10 zfs-storage kernel: RBP: ffff8803e5067d58 R08: ffff8803d9753a18 R09: 0000000000000001
Sep 16 10:42:10 zfs-storage kernel: R10: 0000000000015ab9 R11: ffffffff81ed7280 R12: 0000000000000000
Sep 16 10:42:10 zfs-storage kernel: R13: ffff8803e6a4f800 R14: ffff880036a95460 R15: ffff880270a3c690
Sep 16 10:42:10 zfs-storage kernel: FS: 00007fe78fa92700(0000) GS:ffff88041ed00000(0000) knlGS:0000000000000000
Sep 16 10:42:10 zfs-storage kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 16 10:42:10 zfs-storage kernel: CR2: 0000000000000010 CR3: 00000003d6b46000 CR4: 00000000000407e0
Sep 16 10:42:10 zfs-storage kernel: Stack:
Sep 16 10:42:10 zfs-storage kernel: ffff880270a3c380 ffff8803e5067d68 ffffffff8170744b ffff8803e5067dc8
Sep 16 10:42:10 zfs-storage kernel: ffffffffa01557d9 ffffffffa0154f1f ffff8803e5067db8 ffff880270a3c690
Sep 16 10:42:10 zfs-storage kernel: 0000000000004000 ffff880270a3c380 0000000000200000 ffff880270a3c380
Sep 16 10:42:10 zfs-storage kernel: Call Trace:
Sep 16 10:42:10 zfs-storage kernel: [] mutex_unlock+0x1b/0x20
Sep 16 10:42:10 zfs-storage kernel: [] zio_done+0x619/0xcf0 [zfs]
Sep 16 10:42:10 zfs-storage kernel: [] ? zio_ready+0x15f/0x400 [zfs]
Sep 16 10:42:10 zfs-storage kernel: [] zio_execute+0xa6/0x140 [zfs]
Sep 16 10:42:10 zfs-storage kernel: [] taskq_thread+0x22d/0x4b0 [spl]
Sep 16 10:42:10 zfs-storage kernel: [] ? wake_up_state+0x20/0x20
Sep 16 10:42:10 zfs-storage kernel: [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
Sep 16 10:42:10 zfs-storage kernel: [] kthread+0xe1/0x100
Sep 16 10:42:10 zfs-storage kernel: [] ? insert_kthread_work+0x40/0x40
Sep 16 10:42:10 zfs-storage kernel: [] ret_from_fork+0x7c/0xb0
Sep 16 10:42:10 zfs-storage kernel: [] ? insert_kthread_work+0x40/0x40
Sep 16 10:42:10 zfs-storage kernel: Code: 40 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb c7 07 01 00 00 00 48 8d 7f 04 e8 e4 16 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 3e
Sep 16 10:42:10 zfs-storage kernel: RIP [] __mutex_unlock_slowpath+0x29/0x40
Sep 16 10:42:10 zfs-storage kernel: RSP
Sep 16 10:42:10 zfs-storage kernel: CR2: 0000000000000010
Sep 16 10:42:10 zfs-storage kernel: ---[ end trace 9ca2325f6212816c ]---

@greg-hydrogen
Author

OK, the debug patch has been installed. @behlendorf, is there anything I need to do when this locks up to grab all the debug info?

@greg-hydrogen
Author

OK, I had another failure. Unfortunately the picture I took didn't come out, but I did manage to get a sysrq w/l/t, so hopefully this is enough info to go by:

http://pastebin.com/QHLqQdTP

@behlendorf
Contributor

@greg-hydrogen Thanks for the additional information. I'm still working on determining the root cause here, but I definitely plan to treat this as a blocker for 0.6.4.

@greg-hydrogen
Author

Hello everyone, I just had another crash. Here is the dump.

I have left the machine in its current state so I can dump further information if needed. I will need to reboot tomorrow night, though.

dmesg
[470392.238335] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[470392.238461] IP: [] __mutex_unlock_slowpath+0x29/0x40
[470392.238568] PGD 3d3979067 PUD 40436a067 PMD 0
[470392.238644] Oops: 0000 [#1] SMP
[470392.238699] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs kvm_amd kvm nfsd ppdev crct10dif_pclmul crc32_pclmul xor crc32c_intel auth_rpcgss tg3 raid6_pq ghash_clmulni_intel nfs_acl lockd pcspkr edac_core sp5100_tco r8169 ptp microcode pps_core k10temp serio_raw fam15h_power edac_mce_amd mii i2c_piix4 parport_pc joydev parport shpchp asus_atk0110 sunrpc wmi acpi_cpufreq radeon ata_generic pata_acpi i2c_algo_bit drm_kms_helper
[470392.240012] ttm drm pata_atiixp i2c_core zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE)
[470392.240175] CPU: 2 PID: 2338 Comm: z_null_iss/0 Tainted: P OE 3.16.0 #1
[470392.240280] Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
[470392.240416] task: ffff8803ec49e720 ti: ffff8803e7718000 task.ti: ffff8803e7718000
[470392.240523] RIP: 0010:[] [] __mutex_unlock_slowpath+0x29/0x40
[470392.240659] RSP: 0018:ffff8803e771bd50 EFLAGS: 00010207
[470392.240733] RAX: 0000000000000000 RBX: ffff8800aa6f5338 RCX: dead000000200200
[470392.240837] RDX: ffff8800aa6f5340 RSI: 0000000000000246 RDI: ffff8800aa6f533c
[470392.240936] RBP: ffff8803e771bd58 R08: ffff8803d6fe7bd8 R09: 0000000000000001
[470392.241049] R10: 0000000000015ab9 R11: ffffffff81ed7280 R12: 0000000000000000
[470392.241148] R13: ffff8803dfdd8800 R14: ffff8803ec49e720 R15: ffff8800aa6f5338
[470392.241248] FS: 00007ff24cf73740(0000) GS:ffff88041ec80000(0000) knlGS:0000000000000000
[470392.241361] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[470392.241442] CR2: 0000000000000010 CR3: 00000003d4415000 CR4: 00000000000407e0
[470392.241546] Stack:
[470392.241577] ffff8800aa6f5028 ffff8803e771bd68 ffffffff8170744b ffff8803e771bdc8
[470392.241697] ffffffffa01557d9 ffffffffa0154f1f ffff880403f2f000 ffff8800aa6f5338
[470392.241818] ffff8803e771bdf0 ffffffff8170505a 0000000000200000 ffff8800aa6f5028
[470392.241937] Call Trace:
[470392.241992] [] mutex_unlock+0x1b/0x20
[470392.242147] [] zio_done+0x619/0xcf0 [zfs]
[470392.242288] [] ? zio_ready+0x15f/0x400 [zfs]
[470392.242375] [] ? __schedule+0x2ba/0x720
[470392.242514] [] zio_execute+0xa6/0x140 [zfs]
[470392.242616] [] taskq_thread+0x22d/0x4b0 [spl]
[470392.242706] [] ? wake_up_state+0x20/0x20
[470392.242802] [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
[470392.242897] [] kthread+0xe1/0x100
[470392.242971] [] ? insert_kthread_work+0x40/0x40
[470392.243062] [] ret_from_fork+0x7c/0xb0
[470392.243141] [] ? insert_kthread_work+0x40/0x40
[470392.243227] Code: 40 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb c7 07 01 00 00 00 48 8d 7f 04 e8 e4 16 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 3e 94 9b ff 80 43 04 01 5b 5d c3 0f 1f 80 00 00
[470392.243781] RIP [] __mutex_unlock_slowpath+0x29/0x40
[470392.243881] RSP

journalctl
-- Unit session-135.scope has begun starting up.
Sep 27 06:33:51 zfs-storage kernel: ---[ end trace 2e38b4cbc4d84854 ]---
Sep 27 06:33:51 zfs-storage kernel: CR2: 0000000000000010
Sep 27 06:33:51 zfs-storage kernel: RSP
Sep 27 06:33:51 zfs-storage kernel: RIP [] __mutex_unlock_slowpath+0x29/0x40
Sep 27 06:33:51 zfs-storage kernel: Code: 40 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb c7 07 01 00 00 00 48 8d 7f 04 e8 e4 16 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 3e 9
Sep 27 06:33:51 zfs-storage kernel: [] ? insert_kthread_work+0x40/0x40
Sep 27 06:33:51 zfs-storage kernel: [] ret_from_fork+0x7c/0xb0
Sep 27 06:33:51 zfs-storage kernel: [] ? insert_kthread_work+0x40/0x40
Sep 27 06:33:51 zfs-storage kernel: [] kthread+0xe1/0x100
Sep 27 06:33:51 zfs-storage kernel: [] ? taskq_cancel_id+0x1f0/0x1f0 [spl]
Sep 27 06:33:51 zfs-storage kernel: [] ? wake_up_state+0x20/0x20
Sep 27 06:33:51 zfs-storage kernel: [] taskq_thread+0x22d/0x4b0 [spl]
Sep 27 06:33:51 zfs-storage kernel: [] zio_execute+0xa6/0x140 [zfs]
Sep 27 06:33:51 zfs-storage kernel: [] ? __schedule+0x2ba/0x720
Sep 27 06:33:51 zfs-storage kernel: [] ? zio_ready+0x15f/0x400 [zfs]
Sep 27 06:33:51 zfs-storage kernel: [] zio_done+0x619/0xcf0 [zfs]
Sep 27 06:33:51 zfs-storage kernel: [] mutex_unlock+0x1b/0x20
Sep 27 06:33:51 zfs-storage kernel: Call Trace:
Sep 27 06:33:51 zfs-storage kernel: ffff8803e771bdf0 ffffffff8170505a 0000000000200000 ffff8800aa6f5028
Sep 27 06:33:51 zfs-storage kernel: ffffffffa01557d9 ffffffffa0154f1f ffff880403f2f000 ffff8800aa6f5338
Sep 27 06:33:51 zfs-storage kernel: ffff8800aa6f5028 ffff8803e771bd68 ffffffff8170744b ffff8803e771bdc8
Sep 27 06:33:51 zfs-storage kernel: Stack:
Sep 27 06:33:51 zfs-storage kernel: CR2: 0000000000000010 CR3: 00000003d4415000 CR4: 00000000000407e0
Sep 27 06:33:51 zfs-storage kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 27 06:33:51 zfs-storage kernel: FS: 00007ff24cf73740(0000) GS:ffff88041ec80000(0000) knlGS:0000000000000000
Sep 27 06:33:51 zfs-storage kernel: R13: ffff8803dfdd8800 R14: ffff8803ec49e720 R15: ffff8800aa6f5338
Sep 27 06:33:51 zfs-storage kernel: R10: 0000000000015ab9 R11: ffffffff81ed7280 R12: 0000000000000000
Sep 27 06:33:51 zfs-storage kernel: RBP: ffff8803e771bd58 R08: ffff8803d6fe7bd8 R09: 0000000000000001
Sep 27 06:33:51 zfs-storage kernel: RDX: ffff8800aa6f5340 RSI: 0000000000000246 RDI: ffff8800aa6f533c
Sep 27 06:33:51 zfs-storage kernel: RAX: 0000000000000000 RBX: ffff8800aa6f5338 RCX: dead000000200200
Sep 27 06:33:51 zfs-storage kernel: RSP: 0018:ffff8803e771bd50 EFLAGS: 00010207
Sep 27 06:33:51 zfs-storage kernel: RIP: 0010:[] [] __mutex_unlock_slowpath+0x29/0x40
Sep 27 06:33:51 zfs-storage kernel: task: ffff8803ec49e720 ti: ffff8803e7718000 task.ti: ffff8803e7718000
Sep 27 06:33:51 zfs-storage kernel: Hardware name: System manufacturer System Product Name/M5A78L-M LX PLUS, BIOS 1501 04/27/2013
Sep 27 06:33:51 zfs-storage kernel: CPU: 2 PID: 2338 Comm: z_null_iss/0 Tainted: P OE 3.16.0 #1
Sep 27 06:33:51 zfs-storage kernel: ttm drm pata_atiixp i2c_core zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE)
Sep 27 06:33:51 zfs-storage kernel: Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_f
Sep 27 06:33:51 zfs-storage kernel: Oops: 0000 [#1] SMP
Sep 27 06:33:51 zfs-storage kernel: PGD 3d3979067 PUD 40436a067 PMD 0
Sep 27 06:33:51 zfs-storage kernel: IP: [] __mutex_unlock_slowpath+0x29/0x40
Sep 27 06:33:51 zfs-storage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Sep 27 06:01:01 zfs-storage run-parts[29105]: (/etc/cron.hourly) finished 0anacron
Sep 27 06:01:01 zfs-storage run-parts[29099]: (/etc/cron.hourly) starting 0anacron
Sep 27 06:01:01 zfs-storage CROND[29096]: (root) CMD (run-parts /etc/cron.hourly)
Sep 27 06:01:01 zfs-storage systemd[1]: Started Session 134 of user root.

@behlendorf
Contributor

@greg-hydrogen Thanks, any additional debugging is always helpful. In the last case I think it was pretty clear that this is somehow a use-after-free bug, so I'm going to pursue that angle. Feel free to put your machine back in service.

@liviosoares

Hi,

We're seeing several issues with lockups/oopses whose stacks look similar to the ones in this issue (above). We have a Ceph cluster of 6 machines, all running ZFS with around 10 spindles per zpool (configured with xattr=sa). The machines are running RHEL. I have tried experimenting with kernel versions 2.6.32 and 3.17, using zfs/spl from git.

After some load, we observe either these lockups (which lead to a deadlock of all FS activity, even on non-ZFS file systems, requiring a hard reboot) or the machine rebooting itself.

Below are some sample stacks. Let me know if you'd like me to try something or provide more information:

Nov  6 09:20:17 os-compute-16 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Nov  6 09:20:17 os-compute-16 kernel: IP: [<ffffffff8165bedd>] __mutex_unlock_slowpath+0x2d/0x50
Nov  6 09:20:17 os-compute-16 kernel: PGD ae68d7067 PUD 5ab9b3067 PMD 0 
Nov  6 09:20:17 os-compute-16 kernel: Oops: 0000 [#1] SMP 
Nov  6 09:20:17 os-compute-16 kernel: Modules linked in: veth xt_nat xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge ipv6 ip6_tables iptable_filter ip_tables ebtable_nat ebtables dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c autofs4 cpufreq_ondemand 8021q garp stp llc vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt iTCO_vendor_support gpio_ich microcode serio_raw rtc_efi pcspkr joydev i2c_i801 cdc_ether usbnet mii lpc_ich shpchp ipmi_devintf ipmi_si ipmi_msghandler zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) ioatdma i7core_edac edac_core sg ses enclosure acpi_cpufreq igb dca hwmon mlx4_en ptp pps_core vxlan udp_tunnel mlx4_core ext4 jbd2 mbcache sd_mod crc_t10dif crct10dif_common pata_acpi ata_generic ata_piix megaraid_sas ttm drm_kms_helper sysimgblt sysfillrect syscopyarea dm_mirror dm_region_hash dm_log dm_mod
Nov  6 09:20:17 os-compute-16 kernel: CPU: 0 PID: 2347 Comm: z_fr_iss_5/0 Tainted: P           O   3.17.2-1.el6.elrepo.x86_64 #1
Nov  6 09:20:17 os-compute-16 kernel: Hardware name: IBM System x3630 M3 -[7377AC1]-/69Y1101, BIOS -[HSE124AUS-1.13]- 07/26/2013
Nov  6 09:20:17 os-compute-16 kernel: task: ffff88183fa9c050 ti: ffff881826280000 task.ti: ffff881826280000
Nov  6 09:20:17 os-compute-16 kernel: RIP: 0010:[<ffffffff8165bedd>]  [<ffffffff8165bedd>] __mutex_unlock_slowpath+0x2d/0x50
Nov  6 09:20:17 os-compute-16 kernel: RSP: 0018:ffff881826283cf8  EFLAGS: 00010203
Nov  6 09:20:17 os-compute-16 kernel: RAX: 0000000000000000 RBX: ffff880463ca8c88 RCX: ffff880463ca8cc0
Nov  6 09:20:17 os-compute-16 kernel: RDX: ffff880463ca8c90 RSI: 0000000000000286 RDI: ffff880463ca8c8c
Nov  6 09:20:17 os-compute-16 kernel: RBP: ffff881826283d08 R08: ffff880c7fc14600 R09: 0000000000000001
Nov  6 09:20:17 os-compute-16 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Nov  6 09:20:17 os-compute-16 kernel: R13: 0000000000000000 R14: ffff880463ca8c88 R15: 0000000000000000
Nov  6 09:20:17 os-compute-16 kernel: FS:  0000000000000000(0000) GS:ffff880c7fc00000(0000) knlGS:0000000000000000
Nov  6 09:20:17 os-compute-16 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Nov  6 09:20:17 os-compute-16 kernel: CR2: 0000000000000010 CR3: 000000096abf2000 CR4: 00000000000007f0
Nov  6 09:20:17 os-compute-16 kernel: Stack:
Nov  6 09:20:17 os-compute-16 kernel: ffff881826283d18 ffff880463ca8978 ffff881826283d18 ffffffff8165bf1b
Nov  6 09:20:17 os-compute-16 kernel: ffff881826283d98 ffffffffa03ed29d ffff880c3f7cd000 ffff880463ca8978
Nov  6 09:20:17 os-compute-16 kernel: ffff881826283d98 ffffffffa03ec9a7 ffff880c7fc14600 0000000000000000
Nov  6 09:20:17 os-compute-16 kernel: Call Trace:
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8165bf1b>] mutex_unlock+0x1b/0x20
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffffa03ed29d>] zio_done+0x5ed/0xe50 [zfs]
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffffa03ec9a7>] ? zio_ready+0x177/0x480 [zfs]
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffffa03e8f75>] zio_execute+0xb5/0x140 [zfs]
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffffa02642ed>] taskq_thread+0x22d/0x4f0 [spl]
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8109ce60>] ? try_to_wake_up+0x220/0x220
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffffa02640c0>] ? task_expire+0x120/0x120 [spl]
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8108edce>] kthread+0xce/0xf0
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8108ed00>] ? kthread_freezable_should_stop+0x70/0x70
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8165ecbc>] ret_from_fork+0x7c/0xb0
Nov  6 09:20:17 os-compute-16 kernel: [<ffffffff8108ed00>] ? kthread_freezable_should_stop+0x70/0x70
Nov  6 09:20:17 os-compute-16 kernel: Code: 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb c7 07 01 00 00 00 48 8d 7f 04 e8 b0 28 00 00 48 8b 43 08 48 8d 53 08 48 39 d0 74 09 <48> 8b 78 10 e8 ba 0f a4 ff 66 83 43 04 01 48 83 c4 08 5b c9 c3 
Nov  6 09:20:17 os-compute-16 kernel: RIP  [<ffffffff8165bedd>] __mutex_unlock_slowpath+0x2d/0x50
Nov  6 09:20:17 os-compute-16 kernel: RSP <ffff881826283cf8>
Nov  6 09:20:17 os-compute-16 kernel: CR2: 0000000000000010
Nov  6 09:20:17 os-compute-16 kernel: ---[ end trace c3b8a607e3f8974c ]---

Another case:

Nov  6 14:19:06 os-compute-16 kernel: NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [z_wr_iss/2:2294]
Nov  6 14:19:06 os-compute-16 kernel: Modules linked in: nfsv3 nfs_acl nfs fscache lockd sunrpc veth xt_nat xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge ipv6 ip6_tables iptable_filter ip_tables ebtable_nat ebtables dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c autofs4 cpufreq_ondemand 8021q garp stp llc vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt iTCO_vendor_support gpio_ich microcode serio_raw rtc_efi pcspkr joydev i2c_i801 cdc_ether usbnet mii lpc_ich shpchp ipmi_devintf ipmi_si ipmi_msghandler zfs(PO) zcommon(PO) znvpair(PO) zavl(PO) zunicode(PO) spl(O) ioatdma i7core_edac edac_core sg ses enclosure acpi_cpufreq igb dca hwmon mlx4_en ptp pps_core vxlan udp_tunnel mlx4_core ext4 jbd2 mbcache sd_mod crc_t10dif crct10dif_common pata_acpi ata_generic ata_piix megaraid_sas ttm drm_kms_helper sysimgblt sysfillrect syscopyarea dm_mirror dm_region_hash dm_log dm_mod
Nov  6 14:19:06 os-compute-16 kernel: CPU: 6 PID: 2294 Comm: z_wr_iss/2 Tainted: P      D    O L 3.17.2-1.el6.elrepo.x86_64 #1
Nov  6 14:19:06 os-compute-16 kernel: Hardware name: IBM System x3630 M3 -[7377AC1]-/69Y1101, BIOS -[HSE124AUS-1.13]- 07/26/2013
Nov  6 14:19:06 os-compute-16 kernel: task: ffff88183edfc110 ti: ffff88184071c000 task.ti: ffff88184071c000
Nov  6 14:19:06 os-compute-16 kernel: RIP: 0010:[<ffffffff8165e7a0>]  [<ffffffff8165e7a0>] _raw_spin_lock+0x20/0x30
Nov  6 14:19:06 os-compute-16 kernel: RSP: 0018:ffff88184071fd08  EFLAGS: 00000293
Nov  6 14:19:06 os-compute-16 kernel: RAX: 0000000000000005 RBX: ffffffff8165bed0 RCX: 0000000000000008
Nov  6 14:19:06 os-compute-16 kernel: RDX: 0000000000000000 RSI: 0000000000015100 RDI: ffff8815bbb1c45c
Nov  6 14:19:06 os-compute-16 kernel: RBP: ffff88184071fd08 R08: 0000000000000000 R09: ffff880c7f802340
Nov  6 14:19:06 os-compute-16 kernel: R10: dead000000100100 R11: 0000000000000000 R12: 0000000000000000
Nov  6 14:19:06 os-compute-16 kernel: R13: ffff88187fc14678 R14: ffff88184071c010 R15: ffff880c83670fb8
Nov  6 14:19:06 os-compute-16 kernel: FS:  0000000000000000(0000) GS:ffff88187fc00000(0000) knlGS:0000000000000000
Nov  6 14:19:06 os-compute-16 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Nov  6 14:19:06 os-compute-16 kernel: CR2: 000000c27de712f8 CR3: 0000000001c14000 CR4: 00000000000007e0
Nov  6 14:19:06 os-compute-16 kernel: Stack:
Nov  6 14:19:06 os-compute-16 kernel: ffff88184071fd28 ffffffff8165bed0 ffff88184071fd38 ffff8815bbb1c148
Nov  6 14:19:06 os-compute-16 kernel: ffff88184071fd38 ffffffff8165bf1b ffff88184071fd98 ffffffffa03ec9f5
Nov  6 14:19:06 os-compute-16 kernel: ffff8815bbb1c408 ffff8815bbb1c458 ffff88183edfc110 0000000000000003
Nov  6 14:19:06 os-compute-16 kernel: Call Trace:
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8165bed0>] __mutex_unlock_slowpath+0x20/0x50
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8165bf1b>] mutex_unlock+0x1b/0x20
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffffa03ec9f5>] zio_ready+0x1c5/0x480 [zfs]
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffffa03e8f75>] zio_execute+0xb5/0x140 [zfs]
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffffa02642ed>] taskq_thread+0x22d/0x4f0 [spl]
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8109ce60>] ? try_to_wake_up+0x220/0x220
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffffa02640c0>] ? task_expire+0x120/0x120 [spl]
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8108edce>] kthread+0xce/0xf0
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8108ed00>] ? kthread_freezable_should_stop+0x70/0x70
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8165ecbc>] ret_from_fork+0x7c/0xb0
Nov  6 14:19:06 os-compute-16 kernel: [<ffffffff8108ed00>] ? kthread_freezable_should_stop+0x70/0x70

@behlendorf
Contributor

This is yet another duplicate of #2523, which is targeted to be addressed in the next tag.

behlendorf added the Bug - Major label and removed the Bug label Nov 6, 2014
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 19, 2014
It is known that mutexes in Linux are not safe when using them to
synchronize the freeing of the object in which the mutex is embedded:

http://lwn.net/Articles/575477/

The known places in ZFS which are suspected to suffer from the race
condition are zio->io_lock and dbuf->db_mtx.

* zio uses zio->io_lock and zio->io_cv to synchronize freeing
  between zio_wait() and zio_done().
* dbuf uses dbuf->db_mtx to protect reference counting.

This patch fixes this kind of race by forcing serialization on
mutex_exit() with a spin lock, making the mutex safe by sacrificing
a bit of performance and memory overhead.

This issue most commonly manifests itself as a deadlock in the zio
pipeline caused by a process spinning on the damaged mutex.  Similar
deadlocks have been reported for the dbuf->db_mtx mutex.  And it can
also cause a NULL dereference or bad paging request under the right
circumstances.

This issue and many like it are linked off the openzfs/zfs#2523
issue.  Specifically, this fix resolves at least the following
outstanding issues:

openzfs/zfs#401
openzfs/zfs#2523
openzfs/zfs#2679
openzfs/zfs#2684
openzfs/zfs#2704
openzfs/zfs#2708
openzfs/zfs#2517
openzfs/zfs#2827
openzfs/zfs#2850
openzfs/zfs#2891
openzfs/zfs#2897
openzfs/zfs#2247
openzfs/zfs#2939

Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #421
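
To make the race easier to picture, here is a minimal userspace sketch of the pattern the commit message describes, assuming a pthread-based stand-in. The kmutex_t, mutex_enter()/mutex_exit(), and zio_t names mirror the ZFS/SPL ones, but the bodies below are illustrative assumptions rather than the actual SPL implementation; they only show the idea of wrapping the underlying unlock in a spin lock.

#include <pthread.h>
#include <stdlib.h>

/*
 * Illustrative sketch only -- NOT the real SPL code.  The names mirror
 * ZFS/SPL, but the implementation is a userspace (pthread) assumption.
 */
typedef struct kmutex {
	pthread_spinlock_t m_lock;   /* serializes the tail end of every unlock */
	pthread_mutex_t    m_mutex;  /* the mutex proper */
} kmutex_t;

static void
kmutex_init(kmutex_t *mp)
{
	pthread_spin_init(&mp->m_lock, PTHREAD_PROCESS_PRIVATE);
	pthread_mutex_init(&mp->m_mutex, NULL);
}

static void
mutex_enter(kmutex_t *mp)
{
	pthread_mutex_lock(&mp->m_mutex);
}

/*
 * The idea behind the fix: the underlying unlock runs entirely under
 * m_lock, so the thread that acquires the mutex next cannot finish its
 * own mutex_exit() (and therefore cannot free the enclosing object)
 * while the previous owner is still touching the mutex memory.
 */
static void
mutex_exit(kmutex_t *mp)
{
	pthread_spin_lock(&mp->m_lock);
	pthread_mutex_unlock(&mp->m_mutex);
	pthread_spin_unlock(&mp->m_lock);
}

/* A stand-in for the zio: a completion flag, a cv, and the embedded lock. */
typedef struct zio {
	kmutex_t       io_lock;
	pthread_cond_t io_cv;
	int            io_done;
} zio_t;

/* zio_done() side: mark completion under the lock, wake the waiter, unlock. */
static void *
done_side(void *arg)
{
	zio_t *zio = arg;

	mutex_enter(&zio->io_lock);
	zio->io_done = 1;
	pthread_cond_broadcast(&zio->io_cv);
	mutex_exit(&zio->io_lock);   /* may still be touching io_lock internals */
	return (NULL);
}

int
main(void)
{
	pthread_t tid;
	zio_t *zio = calloc(1, sizeof (*zio));

	kmutex_init(&zio->io_lock);
	pthread_cond_init(&zio->io_cv, NULL);
	pthread_create(&tid, NULL, done_side, zio);

	/* zio_wait() side: wait for completion, then free the zio. */
	mutex_enter(&zio->io_lock);
	while (!zio->io_done)
		pthread_cond_wait(&zio->io_cv, &zio->io_lock.m_mutex);
	mutex_exit(&zio->io_lock);   /* cannot finish while done_side is still inside its unlock */

	pthread_join(tid, NULL);     /* no analogue in the kernel path */
	free(zio);                   /* the waiter frees the object containing the mutex */
	return (0);
}

Without the m_lock serialization in mutex_exit(), the waiting side can return from its own unlock, free the zio, and leave the signaling side dereferencing freed mutex memory, which is consistent with the NULL dereferences and soft lockups in __mutex_unlock_slowpath() reported above.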
@behlendorf
Contributor

This issue, which is a duplicate of #2523, was resolved by the following commit. Full details can be found in the commit message and the related LWN article.

openzfs/spl@a3c1eb7 mutex: force serialization on mutex_exit() to fix races

dajhorn added a commit to zfsonlinux/pkg-spl that referenced this issue Dec 20, 2014
Commit: openzfs/zfs@a3c1eb7
From: Chunwei Chen <[email protected]>
Date: Fri, 19 Dec 2014 11:31:59 +0800
Subject: mutex: force serialization on mutex_exit() to fix races

Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Backported-by: Darik Horn <[email protected]>
Closes #421

Conflicts:
        include/sys/mutex.h
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 23, 2014
@ryao
Contributor

ryao commented Feb 10, 2015

@behlendorf This is not a duplicate of #2523. Instead, it is a duplicate of #3091, provided that this deadlock occurred on a multi-socket system.

ryao pushed a commit to ryao/spl that referenced this issue Feb 19, 2015