Build failure on JZ4780 generic target #2

ChrisPWilson · 2014-08-12T10:07:54Z

Building the generic target results in the error:

DTC arch/mips/jz4780/dts/generic.dtb
ERROR (duplicate_label): Duplicate label 'extal' on /clocks/oscillator and /clocks/extal
ERROR: Input tree has errors, aborting (use -f to force output)
make[3]: *** [arch/mips/jz4780/dts/generic.dtb] Error 2
make[2]: *** [arch/mips/jz4780/dts] Error 2
make[1]: *** [arch/mips/jz4780] Error 2
make[1]: *** Waiting for unfinished jobs....

I'm using the IA32 compiler here:

https://sourcery.mentor.com/GNUToolchain/release2791?cmpid=7108

paulburton · 2014-08-12T10:18:46Z

As I mentioned by email, my inclination is to fix this by just removing the generic "board" which I'm not sure gives us very much. Any objections?

ZubairLK · 2014-08-12T10:22:35Z

pretty redundant atm.

generic board can always be added later on if several similar jz4780 based boards prop up..

ChrisPWilson · 2014-08-12T10:27:29Z

I have no objections, other than the fact that a "generic" target might be useful as a reference.
Re. "similar jz4780 boards," we do have an npm801 tablet, support for which would be nice :)

paulburton · 2014-08-12T10:30:43Z

Even then a generic "board" wouldn't be so useful, since essentially all commonality between boards is just due to the SoC & thus already abstracted. Patch submitted.

There are some npm801 tablets in the Leeds office, I'll see if I can nab one at some point :)

paulburton · 2014-08-19T08:30:43Z

Fixed by commit c16c790 "MIPS: remove generic jz4780 "board"".

With the introduction of fair queued rwlock, recursive read_lock() may hang the offending process if there is a write_lock() somewhere in between. With recursive read_lock checking enabled, the following error was reported: ============================================= [ INFO: possible recursive locking detected ] 3.16.0-rc1 MIPS#2 Tainted: G E --------------------------------------------- load_policy/708 is trying to acquire lock: (policy_rwlock){.+.+..}, at: [<ffffffff8125b32a>] security_genfs_sid+0x3a/0x170 but task is already holding lock: (policy_rwlock){.+.+..}, at: [<ffffffff8125b48c>] security_fs_use+0x2c/0x110 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(policy_rwlock); lock(policy_rwlock); This patch fixes the occurrence of recursive read_lock() of policy_rwlock by adding a helper function __security_genfs_sid() which requires caller to take the lock before calling it. The security_fs_use() was then modified to call the new helper function. Signed-off-by: Waiman Long <[email protected]> Acked-by: Stephen Smalley <[email protected]> Signed-off-by: Paul Moore <[email protected]>

We were experiencing occasional "BUG: scheduling while atomic" splats in our testing. Enabling DEBUG_SPINLOCK and DEBUG_LOCKDEP in the kernel exposed a lockdep in the enic driver. enic 0000:0b:00.0 eth2: Link UP ====================================================== [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] 3.12.0-rc1.x86_64-dbg+ MIPS#2 Tainted: GF W ------------------------------------------------------ NetworkManager/4209 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire: (&(&enic->devcmd_lock)->rlock){+.+...}, at: [<ffffffffa026b7e4>] enic_dev_packet_filter+0x44/0x90 [enic] The fix was to replace spin_lock with spin_lock_bh for the enic devcmd_lock, so that soft irqs would be disabled while the lock is held. Signed-off-by: Sujith Sankar <[email protected]> Signed-off-by: Tony Camuso <[email protected]> Signed-off-by: Govindarajulu Varadarajan <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Linus Lüssing says: ==================== bridge: multicast snooping exports MIPS#2 Some people pointed out to me that it might be helpful to add stubs for the newly added multicast exports. That way e.g. batman-adv should continue to be compile and useable without having to have a kernel compiled with bridge code in the future. This is what the first patch is supposed to do. The second patch adds a third multicast export for the bridge which e.g. batman-adv is supposed to use, too, soon: Just like the bridge disables its multicast snooping activities if no querier is present, batman-adv needs to do the same if bridges are involved. These three exports should be the final ones needed to marry the bridge multicast snooping with the batman-adv multicast optimizations recently added for the 3.15 kernel, allowing to use these optimzations in common setups having a bridge on top of e.g. bat0, too. So far these bridged setups would fall back to simple flooding through the batman-adv mesh network for any multicast packet entering bat0. More information about the batman-adv multicast optimizations currently implemented can be found here: http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations The integration on the batman-adv side could afterwards look like this, for instance (now including the third export): http://git.open-mesh.org/batman-adv.git/commitdiff/61e4f6af4b7a21ed4040f2e711d50c778e5b6d93?hp=6ae4281474675fbca5bedcf768972a32db586eb6 ====================

What the patch does: 1. Call pinmux_disable_setting ahead of pinmux_enable_setting each time pinctrl_select_state is called 2. Remove the HW disable operation in pinmux_disable_setting function. 3. Remove the disable ops in struct pinmux_ops 4. Remove all the disable ops users in current code base. Notes: 1. Great thanks for the suggestion from Linus, Tony Lindgren and Stephen Warren and Everyone that shared comments on this patch. 2. The patch also includes comment fixes from Stephen Warren. The reason why we do this: 1. To avoid duplicated calling of the enable_setting operation without disabling operation inbetween which will let the pin descriptor desc->mux_usecount increase monotonously. 2. The HW pin disable operation is not useful for any of the existing platforms. And this can be used to avoid the HW glitch after using the item #1 modification. In the following case, the issue can be reproduced: 1. There is a driver that need to switch pin state dynamically, e.g. between "sleep" and "default" state 2. The pin setting configuration in a DTS node may be like this: component a { pinctrl-names = "default", "sleep"; pinctrl-0 = <&a_grp_setting &c_grp_setting>; pinctrl-1 = <&b_grp_setting &c_grp_setting>; } The "c_grp_setting" config node is totally identical, maybe like following one: c_grp_setting: c_grp_setting { pinctrl-single,pins = <GPIO48 AF6>; } 3. When switching the pin state in the following official pinctrl sequence: pin = pinctrl_get(); state = pinctrl_lookup_state(wanted_state); pinctrl_select_state(state); pinctrl_put(); Test Result: 1. The switch is completed as expected, that is: the device's pin configuration is changed according to the description in the "wanted_state" group setting 2. The "desc->mux_usecount" of the corresponding pins in "c_group" is increased without being decreased, because the "desc" is for each physical pin while the setting is for each setting node in the DTS. Thus, if the "c_grp_setting" in pinctrl-0 is not disabled ahead of enabling "c_grp_setting" in pinctrl-1, the desc->mux_usecount will keep increasing without any chance to be decreased. According to the comments in the original code, only the setting, in old state but not in new state, will be "disabled" (calling pinmux_disable_setting), which is correct logic but not intact. We still need consider case that the setting is in both old state and new state. We can do this in the following two ways: 1. Avoid to "enable"(calling pinmux_enable_setting) the "same pin setting" repeatedly 2. "Disable"(calling pinmux_disable_setting) the "same pin setting", actually two setting instances, ahead of enabling them. Analysis: 1. The solution MIPS#2 is better because it can avoid too much iteration. 2. If we disable all of the settings in the old state and one of the setting(s) exist in the new state, the pins mux function change may happen when some SoC vendors defined the "pinctrl-single,function-off" in their DTS file. old_setting => disabled_setting => new_setting. 3. In the pinmux framework, when a pin state is switched, the setting in the old state should be marked as "disabled". Conclusion: 1. To Remove the HW disabling operation to above the glitch mentioned above. 2. Handle the issue mentioned above by disabling all of the settings in old state and then enable the all of the settings in new state. Signed-off-by: Fan Wu <[email protected]> Acked-by: Stephen Warren <[email protected]> Acked-by: Patrice Chotard <[email protected]> Acked-by: Heiko Stuebner <[email protected]> Acked-by: Maxime Coquelin <[email protected]> Signed-off-by: Linus Walleij <[email protected]>

We got a report of the following warning in Fedora: BUG: sleeping function called from invalid context at mm/slub.c:969 in_atomic(): 1, irqs_disabled(): 0, pid: 533, name: bash 3 locks held by bash/533: #0: (&sp->so_delegreturn_mutex){+.+...}, at: [<ffffffffa033da62>] nfs4_proc_lock+0x262/0x910 [nfsv4] #1: (&nfsi->rwsem){.+.+.+}, at: [<ffffffffa033da6a>] nfs4_proc_lock+0x26a/0x910 [nfsv4] MIPS#2: (&sb->s_type->i_lock_key#23){+.+...}, at: [<ffffffff812998dc>] flock_lock_file_wait+0x8c/0x3a0 CPU: 0 PID: 533 Comm: bash Not tainted 3.15.0-0.rc1.git1.1.fc21.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 00000000d664ff3c ffff880078b69a70 ffffffff817e82e0 0000000000000000 ffff880078b69a98 ffffffff810cf1a4 0000000000000050 0000000000000050 ffff88007cc01a00 ffff880078b69ad8 ffffffff8121449e Call Trace: [<ffffffff817e82e0>] dump_stack+0x4d/0x66 [<ffffffff810cf1a4>] __might_sleep+0x184/0x240 [<ffffffff8121449e>] kmem_cache_alloc_trace+0x4e/0x330 [<ffffffffa0331124>] ? nfs4_release_lockowner+0x74/0x110 [nfsv4] [<ffffffffa0331124>] nfs4_release_lockowner+0x74/0x110 [nfsv4] [<ffffffffa0352340>] nfs4_put_lock_state+0x90/0xb0 [nfsv4] [<ffffffffa0352375>] nfs4_fl_release_lock+0x15/0x20 [nfsv4] [<ffffffff81297515>] locks_free_lock+0x45/0x90 [<ffffffff8129996c>] flock_lock_file_wait+0x11c/0x3a0 [<ffffffffa033da6a>] ? nfs4_proc_lock+0x26a/0x910 [nfsv4] [<ffffffffa033301e>] do_vfs_lock+0x1e/0x30 [nfsv4] [<ffffffffa033da79>] nfs4_proc_lock+0x279/0x910 [nfsv4] [<ffffffff810dbb26>] ? local_clock+0x16/0x30 [<ffffffff810f5a3f>] ? lock_release_holdtime.part.28+0xf/0x200 [<ffffffffa02f820c>] do_unlk+0x8c/0xc0 [nfs] [<ffffffffa02f85c5>] nfs_flock+0xa5/0xf0 [nfs] [<ffffffff8129a6f6>] locks_remove_file+0xb6/0x1e0 [<ffffffff812159d8>] ? kfree+0xd8/0x2d0 [<ffffffff8123bc63>] __fput+0xd3/0x210 [<ffffffff8123bdee>] ____fput+0xe/0x10 [<ffffffff810bfb6d>] task_work_run+0xcd/0xf0 [<ffffffff81019cd1>] do_notify_resume+0x61/0x90 [<ffffffff817fbea2>] int_signal+0x12/0x17 The problem is that NFSv4 is trying to do an allocation from fl_release_private (in order to send a RELEASE_LOCKOWNER call). That function can be called while holding the inode->i_lock, and it's currently set up to do __GFP_WAIT allocations. v4.1 code has a similar problem. This patch adds a work_struct to the nfs4_lock_state and has the code queue the free_lock_state operation to nfsiod. Reported-by: Josh Stone <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>

While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ MIPS#2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687b ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Shirish Pargaonkar <[email protected]> Reported-by: Sasha Levin <[email protected]> Reported-by: Jet Chen <[email protected]> Cc: [email protected] Tested-by: Shirish Pargaonkar <[email protected]> Signed-off-by: Jens Axboe <[email protected]>

We allocate the cpufreq table after calling rcu_read_lock(), which disables preemption. This causes scheduling while atomic warnings. Use GFP_ATOMIC instead of GFP_KERNEL and update for kcalloc while we're here. BUG: sleeping function called from invalid context at mm/slub.c:1246 in_atomic(): 0, irqs_disabled(): 0, pid: 80, name: modprobe 5 locks held by modprobe/80: #0: (&dev->mutex){......}, at: [<c050d484>] __driver_attach+0x48/0x98 #1: (&dev->mutex){......}, at: [<c050d494>] __driver_attach+0x58/0x98 MIPS#2: (subsys mutex#5){+.+.+.}, at: [<c050c114>] subsys_interface_register+0x38/0xc8 MIPS#3: (cpufreq_rwsem){.+.+.+}, at: [<c05a9c8c>] __cpufreq_add_dev.isra.22+0x84/0x92c MIPS#4: (rcu_read_lock){......}, at: [<c05ab24c>] dev_pm_opp_init_cpufreq_table+0x18/0x10c Preemption disabled at:[< (null)>] (null) CPU: 2 PID: 80 Comm: modprobe Not tainted 3.16.0-rc3-next-20140701-00035-g286857f216aa-dirty torvalds#217 [<c0214da8>] (unwind_backtrace) from [<c02123f8>] (show_stack+0x10/0x14) [<c02123f8>] (show_stack) from [<c070141c>] (dump_stack+0x70/0xbc) [<c070141c>] (dump_stack) from [<c02f4cb0>] (__kmalloc+0x124/0x250) [<c02f4cb0>] (__kmalloc) from [<c05ab270>] (dev_pm_opp_init_cpufreq_table+0x3c/0x10c) [<c05ab270>] (dev_pm_opp_init_cpufreq_table) from [<bf000508>] (cpufreq_init+0x48/0x378 [cpufreq_generic]) [<bf000508>] (cpufreq_init [cpufreq_generic]) from [<c05a9e08>] (__cpufreq_add_dev.isra.22+0x200/0x92c) [<c05a9e08>] (__cpufreq_add_dev.isra.22) from [<c050c160>] (subsys_interface_register+0x84/0xc8) [<c050c160>] (subsys_interface_register) from [<c05a9494>] (cpufreq_register_driver+0x108/0x2d8) [<c05a9494>] (cpufreq_register_driver) from [<bf000888>] (generic_cpufreq_probe+0x50/0x74 [cpufreq_generic]) [<bf000888>] (generic_cpufreq_probe [cpufreq_generic]) from [<c050e994>] (platform_drv_probe+0x18/0x48) [<c050e994>] (platform_drv_probe) from [<c050d1f4>] (driver_probe_device+0x128/0x370) [<c050d1f4>] (driver_probe_device) from [<c050d4d0>] (__driver_attach+0x94/0x98) [<c050d4d0>] (__driver_attach) from [<c050b778>] (bus_for_each_dev+0x54/0x88) [<c050b778>] (bus_for_each_dev) from [<c050c894>] (bus_add_driver+0xe8/0x204) [<c050c894>] (bus_add_driver) from [<c050dd48>] (driver_register+0x78/0xf4) [<c050dd48>] (driver_register) from [<c0208870>] (do_one_initcall+0xac/0x1d8) [<c0208870>] (do_one_initcall) from [<c028b6b4>] (load_module+0x190c/0x21e8) [<c028b6b4>] (load_module) from [<c028c034>] (SyS_init_module+0xa4/0x110) [<c028c034>] (SyS_init_module) from [<c020f0c0>] (ret_fast_syscall+0x0/0x48) Fixes: a0dd7b7 (PM / OPP: Move cpufreq specific OPP functions out of generic OPP library) Signed-off-by: Stephen Boyd <[email protected]> Acked-by: Viresh Kumar <[email protected]> Cc: 3.16+ <[email protected]> # 3.16+ Signed-off-by: Rafael J. Wysocki <[email protected]>

This patch integrates creation of sysfs groups and attributes into NILFS file system driver. It was found the issue with nilfs_sysfs_{create/delete}_snapshot_group functions by Michael L Semon <[email protected]> in the first version of the patch: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:579 in_atomic(): 1, irqs_disabled(): 0, pid: 32676, name: umount.nilfs2 2 locks held by umount.nilfs2/32676: #0: (&type->s_umount_key#21){++++..}, at: [<790c18e2>] deactivate_super+0x37/0x58 #1: (&(&nilfs->ns_cptree_lock)->rlock){+.+...}, at: [<791bf659>] nilfs_put_root+0x23/0x5a Preemption disabled at:[<791bf659>] nilfs_put_root+0x23/0x5a CPU: 0 PID: 32676 Comm: umount.nilfs2 Not tainted 3.14.0+ MIPS#2 Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002 Call Trace: dump_stack+0x4b/0x75 __might_sleep+0x111/0x16f mutex_lock_nested+0x1e/0x3ad kernfs_remove+0x12/0x26 sysfs_remove_dir+0x3d/0x62 kobject_del+0x13/0x38 nilfs_sysfs_delete_snapshot_group+0xb/0xd nilfs_put_root+0x2a/0x5a nilfs_detach_log_writer+0x1ab/0x2c1 nilfs_put_super+0x13/0x68 generic_shutdown_super+0x60/0xd1 kill_block_super+0x1d/0x60 deactivate_locked_super+0x22/0x3f deactivate_super+0x3e/0x58 mntput_no_expire+0xe2/0x141 SyS_oldumount+0x70/0xa5 syscall_call+0x7/0xb The reason of the issue was placement of nilfs_sysfs_{create/delete}_snapshot_group() call under nilfs->ns_cptree_lock protection. But this protection is unnecessary and wrong solution. The second version of the patch fixes this issue. [[email protected]: nilfs_sysfs_create_mounted_snapshots_group can be static] Reported-by: Michael L. Semon <[email protected]> Signed-off-by: Vyacheslav Dubeyko <[email protected]> Cc: Vyacheslav Dubeyko <[email protected]> Cc: Ryusuke Konishi <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

… queue" This reverts commit a923207. It introduced a dead lock, and did not fix anything. it made netif_tx_lock() be called in IRQ context, but in softirq context, the same lock is locked without disabling IRQ. In fact, the commit a923207 did not fix anything, since netif_stop_queue did not free the any resource [ 10.154920] ================================= [ 10.156026] [ INFO: inconsistent lock state ] [ 10.156026] 3.16.0-rc5+ MIPS#13 Not tainted [ 10.156026] --------------------------------- [ 10.156026] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 10.156026] swapper/1/0 [HC0[0]:SC1[5]:HE1:SE0] takes: [ 10.156026] (_xmit_ETHER){?.-...}, at: [<80948b6a>] sch_direct_xmit+0x7a/0x250 [ 10.156026] {IN-HARDIRQ-W} state was registered at: [ 10.156026] [<804811f0>] __lock_acquire+0x800/0x17a0 [ 10.156026] [<804828ba>] lock_acquire+0x6a/0xf0 [ 10.156026] [<809ed477>] _raw_spin_lock+0x27/0x40 [ 10.156026] [<8088d508>] gether_disconnect+0x68/0x280 [ 10.156026] [<8088e777>] eem_set_alt+0x37/0xc0 [ 10.156026] [<808847ce>] composite_setup+0x30e/0x1240 [ 10.156026] [<8088b8ae>] pch_udc_isr+0xa6e/0xf50 [ 10.156026] [<8048abe8>] handle_irq_event_percpu+0x38/0x1e0 [ 10.156026] [<8048adc1>] handle_irq_event+0x31/0x50 [ 10.156026] [<8048d94b>] handle_fasteoi_irq+0x6b/0x140 [ 10.156026] [<804040a5>] handle_irq+0x65/0x80 [ 10.156026] [<80403cfc>] do_IRQ+0x3c/0xc0 [ 10.156026] [<809ee6ae>] common_interrupt+0x2e/0x34 [ 10.156026] [<804668c5>] finish_task_switch+0x65/0xd0 [ 10.156026] [<809e89df>] __schedule+0x20f/0x7d0 [ 10.156026] [<809e94aa>] schedule_preempt_disabled+0x2a/0x70 [ 10.156026] [<8047bf03>] cpu_startup_entry+0x143/0x410 [ 10.156026] [<809e2e61>] rest_init+0xa1/0xb0 [ 10.156026] [<80ce2a3b>] start_kernel+0x336/0x33b [ 10.156026] [<80ce22ab>] i386_start_kernel+0x79/0x7d [ 10.156026] irq event stamp: 52070 [ 10.156026] hardirqs last enabled at (52070): [<809375de>] neigh_resolve_output+0xee/0x2a0 [ 10.156026] hardirqs last disabled at (52069): [<809375a8>] neigh_resolve_output+0xb8/0x2a0 [ 10.156026] softirqs last enabled at (52020): [<8044401f>] _local_bh_enable+0x1f/0x50 [ 10.156026] softirqs last disabled at (52021): [<80404036>] do_softirq_own_stack+0x26/0x30 [ 10.156026] [ 10.156026] other info that might help us debug this: [ 10.156026] Possible unsafe locking scenario: [ 10.156026] [ 10.156026] CPU0 [ 10.156026] ---- [ 10.156026] lock(_xmit_ETHER); [ 10.156026] <Interrupt> [ 10.156026] lock(_xmit_ETHER); [ 10.156026] [ 10.156026] *** DEADLOCK *** [ 10.156026] [ 10.156026] 4 locks held by swapper/1/0: [ 10.156026] #0: (((&idev->mc_ifc_timer))){+.-...}, at: [<8044b100>] call_timer_fn+0x0/0x190 [ 10.156026] #1: (rcu_read_lock){......}, at: [<a0577c40>] mld_sendpack+0x0/0x590 [ipv6] [ 10.156026] MIPS#2: (rcu_read_lock_bh){......}, at: [<a055680c>] ip6_finish_output2+0x4c/0x7f0 [ipv6] [ 10.156026] MIPS#3: (rcu_read_lock_bh){......}, at: [<8092e510>] __dev_queue_xmit+0x0/0x5f0 [ 10.156026] [ 10.156026] stack backtrace: [ 10.156026] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-rc5+ MIPS#13 [ 10.156026] 811dbb10 00000000 9e919d10 809e6785 9e8b8000 9e919d3c 809e561e 80b95511 [ 10.156026] 80b9545a 80b9543d 80b95450 80b95441 80b957e4 9e8b84e0 00000002 8047f7b0 [ 10.156026] 9e919d5c 8048043b 00000002 00000000 9e8b8000 00000001 00000004 9e8b8000 [ 10.156026] Call Trace: [ 10.156026] [<809e6785>] dump_stack+0x48/0x69 [ 10.156026] [<809e561e>] print_usage_bug+0x18f/0x19c [ 10.156026] [<8047f7b0>] ? print_shortest_lock_dependencies+0x170/0x170 [ 10.156026] [<8048043b>] mark_lock+0x53b/0x5f0 [ 10.156026] [<804810cf>] __lock_acquire+0x6df/0x17a0 [ 10.156026] [<804828ba>] lock_acquire+0x6a/0xf0 [ 10.156026] [<80948b6a>] ? sch_direct_xmit+0x7a/0x250 [ 10.156026] [<809ed477>] _raw_spin_lock+0x27/0x40 [ 10.156026] [<80948b6a>] ? sch_direct_xmit+0x7a/0x250 [ 10.156026] [<80948b6a>] sch_direct_xmit+0x7a/0x250 [ 10.156026] [<8092e6bf>] __dev_queue_xmit+0x1af/0x5f0 [ 10.156026] [<80947fc0>] ? ether_setup+0x80/0x80 [ 10.156026] [<8092eb0f>] dev_queue_xmit+0xf/0x20 [ 10.156026] [<8093764c>] neigh_resolve_output+0x15c/0x2a0 [ 10.156026] [<a0556927>] ip6_finish_output2+0x167/0x7f0 [ipv6] [ 10.156026] [<a0559b05>] ip6_finish_output+0x85/0x1c0 [ipv6] [ 10.156026] [<a0559cb7>] ip6_output+0x77/0x240 [ipv6] [ 10.156026] [<a0578163>] mld_sendpack+0x523/0x590 [ipv6] [ 10.156026] [<80480501>] ? mark_held_locks+0x11/0x90 [ 10.156026] [<a057947d>] mld_ifc_timer_expire+0x15d/0x280 [ipv6] [ 10.156026] [<8044b168>] call_timer_fn+0x68/0x190 [ 10.156026] [<a0579320>] ? igmp6_group_added+0x150/0x150 [ipv6] [ 10.156026] [<8044b3fa>] run_timer_softirq+0x16a/0x240 [ 10.156026] [<a0579320>] ? igmp6_group_added+0x150/0x150 [ipv6] [ 10.156026] [<80444984>] __do_softirq+0xd4/0x2f0 [ 10.156026] [<804448b0>] ? tasklet_action+0x100/0x100 [ 10.156026] [<80404036>] do_softirq_own_stack+0x26/0x30 [ 10.156026] <IRQ> [<80444d05>] irq_exit+0x65/0x70 [ 10.156026] [<8042d758>] smp_apic_timer_interrupt+0x38/0x50 [ 10.156026] [<809ee91f>] apic_timer_interrupt+0x2f/0x34 [ 10.156026] [<8048007b>] ? mark_lock+0x17b/0x5f0 [ 10.156026] [<8040a912>] ? default_idle+0x22/0xf0 [ 10.156026] [<8040b13e>] arch_cpu_idle+0xe/0x10 [ 10.156026] [<8047bfc6>] cpu_startup_entry+0x206/0x410 [ 10.156026] [<8042bfbd>] start_secondary+0x19d/0x1e0 Acked-by: Tony Lindgren <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Cc: Jeff Westfahl <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> Signed-off-by: Li RongQing <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>

This reverts commit 8b37e1b. It's broken as it changes led_blink_set() in a way that it can now sleep (while synchronously waiting for workqueue to be cancelled). That's a problem, because it's possible that this function gets called from atomic context (tpt_trig_timer() takes a readlock and thus disables preemption). This has been brought up 3 weeks ago already [1] but no proper fix has materialized, and I keep seeing the problem since 3.17-rc1. [1] https://lkml.org/lkml/2014/8/16/128 BUG: sleeping function called from invalid context at kernel/workqueue.c:2650 in_atomic(): 1, irqs_disabled(): 0, pid: 2335, name: wpa_supplicant 5 locks held by wpa_supplicant/2335: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c7c92>] rtnl_lock+0x12/0x20 #1: (&wdev->mtx){+.+.+.}, at: [<ffffffffc06e649c>] cfg80211_mgd_wext_siwessid+0x5c/0x180 [cfg80211] MIPS#2: (&local->mtx){+.+.+.}, at: [<ffffffffc0817dea>] ieee80211_prep_connection+0x17a/0x9a0 [mac80211] MIPS#3: (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffc08081ed>] ieee80211_vif_use_channel+0x5d/0x2a0 [mac80211] MIPS#4: (&trig->leddev_list_lock){.+.+..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] CPU: 0 PID: 2335 Comm: wpa_supplicant Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffff8800360b5a50 ffff8800751f76d8 ffffffff8159e97f ffff8800360b5a30 ffff8800751f76e8 ffffffff810739a5 ffff8800751f77b0 ffffffff8106862f ffffffff810685d0 0aa2209200000000 ffff880000000004 ffff8800361c59d0 Call Trace: [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff810739a5>] __might_sleep+0xe5/0x120 [<ffffffff8106862f>] flush_work+0x5f/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36c0>] ? cfg80211_wext_giwessid+0x50/0x50 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584fa0>] ? ioctl_standard_iw_point+0x3e0/0x3e0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff815a67fb>] ? sysret_check+0x1b/0x56 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f wlan0: send auth to 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: authenticated wlan0: associate with 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: RX AssocResp from 00:0b:6b:3c:8c:e4 (capab=0x431 status=0 aid=2) wlan0: associated IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready cfg80211: Calling CRDA for country: NA wlan0: Limiting TX power to 27 (27 - 0) dBm as advertised by 00:0b:6b:3c:8c:e4 ================================= [ INFO: inconsistent lock state ] 3.17.0-rc3 #1 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ((&(&led_cdev->blink_work)->work)){+.?...}, at: [<ffffffff810685d0>] flush_work+0x0/0x270 {SOFTIRQ-ON-W} state was registered at: [<ffffffff81094dbe>] __lock_acquire+0x30e/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f irq event stamp: 493416 hardirqs last enabled at (493416): [<ffffffff81068a5f>] __cancel_work_timer+0x6f/0x100 hardirqs last disabled at (493415): [<ffffffff81067e9f>] try_to_grab_pending+0x1f/0x160 softirqs last enabled at (493408): [<ffffffff81053ced>] _local_bh_enable+0x1d/0x50 softirqs last disabled at (493409): [<ffffffff81054c75>] irq_exit+0xa5/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((&(&led_cdev->blink_work)->work)); <Interrupt> lock((&(&led_cdev->blink_work)->work)); *** DEADLOCK *** 2 locks held by swapper/0/0: #0: (((&tpt_trig->timer))){+.-...}, at: [<ffffffff810b4c50>] call_timer_fn+0x0/0x180 #1: (&trig->leddev_list_lock){.+.?..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffffffff8246eb30 ffff88007c203b00 ffffffff8159e97f ffffffff81a194c0 ffff88007c203b50 ffffffff81599c29 0000000000000001 ffffffff00000001 ffff880000000000 0000000000000006 ffffffff81a194c0 ffffffff81093ad0 Call Trace: <IRQ> [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff81599c29>] print_usage_bug+0x1f4/0x205 [<ffffffff81093ad0>] ? check_usage_backwards+0x140/0x140 [<ffffffff810944d3>] mark_lock+0x223/0x2b0 [<ffffffff81094d60>] __lock_acquire+0x2b0/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff8109469d>] ? trace_hardirqs_on_caller+0xad/0x1c0 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffff810b4cc5>] call_timer_fn+0x75/0x180 [<ffffffff810b4c50>] ? process_timeout+0x10/0x10 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff810b50ac>] run_timer_softirq+0x1fc/0x2f0 [<ffffffff81054805>] __do_softirq+0x115/0x2e0 [<ffffffff81054c75>] irq_exit+0xa5/0xb0 [<ffffffff810049b3>] do_IRQ+0x53/0xf0 [<ffffffff815a74af>] common_interrupt+0x6f/0x6f <EOI> [<ffffffff8147b56e>] ? cpuidle_enter_state+0x6e/0x180 [<ffffffff8147b732>] cpuidle_enter+0x12/0x20 [<ffffffff8108bba0>] cpu_startup_entry+0x330/0x360 [<ffffffff8158fb51>] rest_init+0xc1/0xd0 [<ffffffff8158fa90>] ? csum_partial_copy_generic+0x170/0x170 [<ffffffff81af3ff2>] start_kernel+0x44f/0x45a [<ffffffff81af399c>] ? set_init_arg+0x53/0x53 [<ffffffff81af35ad>] x86_64_start_reservations+0x2a/0x2c [<ffffffff81af36a0>] x86_64_start_kernel+0xf1/0xf4 Cc: Vincent Donnefort <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Bryan Wu <[email protected]>

While debugging a cpufreq-related hardware failure on a system I saw the following lockdep warning: ========================= [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G E ------------------------- insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there! (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 3 locks held by insmod/2247: #0: (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120 #1: (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80 MIPS#2: (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 stack backtrace: CPU: 0 PID: 2247 Comm: insmod Tainted: G E 3.17.0-rc4+ #1 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013 0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358 ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400 ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246 Call Trace: [<ffffffff8171b358>] dump_stack+0x4d/0x66 [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180 [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff8121412b>] kfree+0xab/0x2b0 [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10 [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120 [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340 [<ffffffff8121415a>] ? kfree+0xda/0x2b0 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq] [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811f7472>] ? __vunmap+0xd2/0x120 [<ffffffff81127155>] load_module+0x1315/0x1b70 [<ffffffff811222a0>] ? store_uevent+0x70/0x70 [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0 [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b cpufreq: __cpufreq_add_dev: ->get() failed insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device The warning occurs in the __cpufreq_add_dev() code which does down_write(&policy->rwsem); ... if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); goto err_get_freq; } If cpufreq_driver->get(policy->cpu) returns an error we execute the code at err_get_freq, which does not up the policy->rwsem. This causes the lockdep warning. Trivial patch to up the policy->rwsem in the error path. After the patch has been applied, and an error occurs in the cpufreq_driver->get(policy->cpu) call we will now see cpufreq: __cpufreq_add_dev: ->get() failed cpufreq: __cpufreq_add_dev: ->get() failed modprobe: ERROR: could not insert 'pcc_cpufreq': No such device Fixes: 4e97b63 (cpufreq: Initialize governor for a new policy under policy->rwsem) Signed-off-by: Prarit Bhargava <[email protected]> Acked-by: Viresh Kumar <[email protected]> Cc: 3.14+ <[email protected]> # 3.14+ Signed-off-by: Rafael J. Wysocki <[email protected]>

Reported in fdo#82527 comment MIPS#2. Signed-off-by: Ben Skeggs <[email protected]>

…na/nfs-rdma NFS: RDMA Client Sparse Fix MIPS#2 This patch fixes another sparse fix found by Dan Carpenter's tool. Signed-off-by: Anna Schumaker <[email protected]> * tag 'nfs-rdma-for-4.0-3' of git://git.linux-nfs.org/projects/anna/nfs-rdma: xprtrdma: Store RDMA credits in unsigned variables

A number of tx queue wake-up events went missing due to the outlined scenario below. Start state is a pool of 16 tx URBs, active tx_urbs count = 15, with the netdev tx queue open. CPU #1 [softirq] CPU MIPS#2 [softirq] start_xmit() tx_acknowledge() ................ ................ atomic_inc(&tx_urbs); if (atomic_read(&tx_urbs) >= 16) { --> atomic_dec(&tx_urbs); netif_wake_queue(); return; <-- netif_stop_queue(); } At the end, the correct state expected is a 15 tx_urbs count value with the tx queue state _open_. Due to the race, we get the same tx_urbs value but with the tx queue state _stopped_. The wake-up event is completely lost. Thus avoid hand-rolled concurrency mechanisms and use a proper lock for contexts and tx queue protection. Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>

@scratch

The regfile provided to SA_SIGINFO signal handler as ucontext was off by one due to pt_regs gutter cleanups in 2013. Before handling signal, user pt_regs are copied onto user_regs_struct and copied back later. Both structs are binary compatible. This was all fine until commit 2fa9190 (ARC: pt_regs update MIPS#2) which removed the empty stack slot at top of pt_regs (corresponding to first pad) and made the corresponding fixup in struct user_regs_struct (the pad in there was moved out of @scratch - not removed altogether as it is part of ptrace ABI) struct user_regs_struct { + long pad; struct { - long pad; long bta, lp_start, lp_end,.... } scratch; ... } This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and signal code needs to user_regs_struct.scratch to reflect it as pt_regs, which is what this commit does. This problem was hidden for 2 years, because both save/restore, despite using wrong location, were using the same location. Only an interim inspection (reproducer below) exposed the issue. void handle_segv(int signo, siginfo_t *info, void *context) { ucontext_t *uc = context; struct user_regs_struct *regs = &(uc->uc_mcontext.regs); printf("regs %x %x\n", <=== prints 7 8 (vs. 8 9) regs->scratch.r8, regs->scratch.r9); } int main() { struct sigaction sa; sa.sa_sigaction = handle_segv; sa.sa_flags = SA_SIGINFO; sigemptyset(&sa.sa_mask); sigaction(SIGSEGV, &sa, NULL); asm volatile( "mov r7, 7 \n" "mov r8, 8 \n" "mov r9, 9 \n" "mov r10, 10 \n" :::"r7","r8","r9","r10"); *((unsigned int*)0x10) = 0; } Fixes: 2fa9190 "ARC: pt_regs update MIPS#2: Remove unused gutter at start of pt_regs" CC: <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>

…s enabled Nils Holland and Klaus Ethgen have reported unexpected OOM killer invocations with 32b kernel starting with 4.8 kernels kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0 kworker/u4:5 cpuset=/ mems_allowed=0 CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2 [...] Mem-Info: active_anon:58685 inactive_anon:90 isolated_anon:0 active_file:274324 inactive_file:281962 isolated_file:0 unevictable:0 dirty:649 writeback:0 unstable:0 slab_reclaimable:40662 slab_unreclaimable:17754 mapped:7382 shmem:202 pagetables:351 bounce:0 free:206736 free_pcp:332 free_cma:0 Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 813 3474 3474 Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB lowmem_reserve[]: 0 0 21292 21292 HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB the oom killer is clearly pre-mature because there there is still a lot of page cache in the zone Normal which should satisfy this lowmem request. Further debugging has shown that the reclaim cannot make any forward progress because the page cache is hidden in the active list which doesn't get rotated because inactive_list_is_low is not memcg aware. The code simply subtracts per-zone highmem counters from the respective memcg's lru sizes which doesn't make any sense. We can simply end up always seeing the resulting active and inactive counts 0 and return false. This issue is not limited to 32b kernels but in practice the effect on systems without CONFIG_HIGHMEM would be much harder to notice because we do not invoke the OOM killer for allocations requests targeting < ZONE_NORMAL. Fix the issue by tracking per zone lru page counts in mem_cgroup_per_node and subtract per-memcg highmem counts when memcg is enabled. Introduce helper lruvec_zone_lru_size which redirects to either zone counters or mem_cgroup_get_zone_lru_size when appropriate. We are losing empty LRU but non-zero lru size detection introduced by ca70723 ("mm: update_lru_size warn and reset bad lru_size") because of the inherent zone vs. node discrepancy. Fixes: f8d1a31 ("mm: consider whether to decivate based on eligible zones inactive ratio") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Reported-by: Nils Holland <[email protected]> Tested-by: Nils Holland <[email protected]> Reported-by: Klaus Ethgen <[email protected]> Acked-by: Minchan Kim <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Johannes Weiner <[email protected]> Reviewed-by: Vladimir Davydov <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

There may be a race condition if f_fs calls unregister_gadget_item in ffs_closed() when unregister_gadget is called by UDC store at the same time. this leads to a kernel NULL pointer dereference: [ 310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 310.645053] init: Service 'adbd' is being killed... [ 310.658938] pgd = c9528000 [ 310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000 [ 310.669702] Internal error: Oops: 817 [#1] PREEMPT SMP ARM [ 310.675211] Modules linked in: [ 310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c #2 [ 310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000 [ 310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0 [ 310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c [ 310.708648] pc : [<c075efc0>] lr : [<c0bfb0bc>] psr: 600f0113 <snip..> [ 311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34) [ 311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c) [ 311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0) [ 311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0) [ 311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14) [ 311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0) [ 311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8) [ 311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4) [ 311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20) for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0 when usb IO thread fails, but switch adb from on to off also triggers write "none" > UDC. These 2 operations both call unregister_gadget, which will lead to the panic above. add a mutex before calling unregister_gadget for api used in f_fs. Signed-off-by: Winter Wang <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>

Reverted disabling mipsr32r2 cause it leads to SIGILL in Android SKIA. logcat error: 01-01 00:01:22.716 1083 1083 F libc : Fatal signal 4 (SIGILL), code 128, fault addr 0x0 in tid 1083 (ndroid.systemui) 01-01 00:01:22.832 99 99 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** 01-01 00:01:22.832 99 99 F DEBUG : Build fingerprint: 'Android/aosp_ci20/ci20:6.0.1/MOB30D/alistair05241026:userdebug/test-keys' 01-01 00:01:22.832 99 99 F DEBUG : Revision: '0' 01-01 00:01:22.832 99 99 F DEBUG : ABI: 'mips' 01-01 00:01:22.833 99 99 F DEBUG : pid: 1083, tid: 1083, name: ndroid.systemui >>> com.android.systemui <<< 01-01 00:01:22.833 99 99 F DEBUG : signal 4 (SIGILL), code 128 (SI_KERNEL), fault addr 0x0 01-01 00:01:22.864 99 99 F DEBUG : zr 00000000 at 00000001 v0 00000000 v1 00000001 01-01 00:01:22.864 99 99 E DEBUG : AM write failed: Broken pipe 01-01 00:01:22.864 99 99 F DEBUG : a0 70cb8160 a1 70cb8160 a2 721f890c a3 00000001 01-01 00:01:22.864 99 99 F DEBUG : t0 00000000 t1 7fa2fb30 t2 00000001 t3 00000000 01-01 00:01:22.865 99 99 F DEBUG : t4 00000001 t5 00000000 t6 00000001 t7 7fa2fce0 01-01 00:01:22.865 99 99 F DEBUG : s0 721f890c s1 00000001 s2 00000002 s3 721f8924 01-01 00:01:22.865 99 99 F DEBUG : s4 00000000 s5 00000000 s6 766f3000 s7 00000000 01-01 00:01:22.865 99 99 F DEBUG : t8 766f3000 t9 766f6810 k0 73a76500 k1 00000000 01-01 00:01:22.865 99 99 F DEBUG : gp 76a20040 sp 7fa2fac0 s8 7fa2fce0 ra 766f89b8 01-01 00:01:22.865 99 99 F DEBUG : hi 00000000 lo 55555556 bva 721f8924 epc 766f8674 01-01 00:01:22.893 99 99 F DEBUG : 01-01 00:01:22.893 99 99 F DEBUG : backtrace: 01-01 00:01:22.893 99 99 F DEBUG : #00 pc 00215674 /system/lib/libskia.so (SkOpSpan::sortableTop(SkOpContour*)+1052) 01-01 00:01:22.893 99 99 F DEBUG : #1 pc 00215d68 /system/lib/libskia.so (SkOpSegment::findSortableTop(SkOpContour*)+112) 01-01 00:01:22.893 99 99 F DEBUG : #2 pc 00215e10 /system/lib/libskia.so (SkOpContour::findSortableTop(SkOpContour*)+72) 01-01 00:01:22.893 99 99 F DEBUG : #3 pc 00215e90 /system/lib/libskia.so (FindSortableTop(SkOpContourHead*)+88) 01-01 00:01:22.893 99 99 F DEBUG : #4 pc 001e1194 /system/lib/libskia.so (OpDebug(SkPath const&, SkPath const&, SkPathOp, SkPath*, bool)+868) 01-01 00:01:22.893 99 99 F DEBUG : #5 pc 001e1940 /system/lib/libskia.so (Op(SkPath const&, SkPath const&, SkPathOp, SkPath*)+44) 01-01 00:01:22.893 99 99 F DEBUG : #6 pc 33b8884c /data/dalvik-cache/mips/system@[email protected] (offset 0x215b000) Ingenic introduced new cache driver in video acceleration support patch and disabled selection if MIPS_CPU_SCACHE which was used earlier. Cache driver selection has been provided in order to easily select between new and old cache driver in case that problems are encountered in the future. Enabled VPU support in ci20_android_defconfig. Removed dummy functions and switch to the proper ones in drivers/staging/imgtec/ci20/ Change-Id: I399ea09bcd5454339260b9dfd942a87a19a3cfa2 Signed-off-by: Dragan Cecavac <[email protected]>

This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b9781 ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d5 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit 166ae5a ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 0001760 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d 0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell <[email protected]> Reviewed-by: Liu Bo <[email protected]> Reviewed-by: Chris Mason <[email protected]> Signed-off-by: Chris Mason <[email protected]>

The rcu_barrier() takes the cpu_hotplug mutex which itself is not reclaim-safe, and so rcu_barrier() is illegal from inside the shrinker. [ 309.661373] ========================================================= [ 309.661376] [ INFO: possible irq lock inversion dependency detected ] [ 309.661380] 4.11.0-rc1-CI-CI_DRM_2333+ #1 Tainted: G W [ 309.661383] --------------------------------------------------------- [ 309.661386] gem_exec_gttfil/6435 just changed the state of lock: [ 309.661389] (rcu_preempt_state.barrier_mutex){+.+.-.}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160 [ 309.661399] but this lock took another, RECLAIM_FS-unsafe lock in the past: [ 309.661402] (cpu_hotplug.lock){+.+.+.} [ 309.661404] and interrupts could create inverse lock ordering between them. [ 309.661410] other info that might help us debug this: [ 309.661414] Possible interrupt unsafe locking scenario: [ 309.661417] CPU0 CPU1 [ 309.661419] ---- ---- [ 309.661421] lock(cpu_hotplug.lock); [ 309.661425] local_irq_disable(); [ 309.661432] lock(rcu_preempt_state.barrier_mutex); [ 309.661441] lock(cpu_hotplug.lock); [ 309.661446] <Interrupt> [ 309.661448] lock(rcu_preempt_state.barrier_mutex); [ 309.661453] *** DEADLOCK *** [ 309.661460] 4 locks held by gem_exec_gttfil/6435: [ 309.661464] #0: (sb_writers#10){.+.+.+}, at: [<ffffffff8120d83d>] vfs_write+0x17d/0x1f0 [ 309.661475] #1: (debugfs_srcu){......}, at: [<ffffffff81320491>] debugfs_use_file_start+0x41/0xa0 [ 309.661486] #2: (&attr->mutex){+.+.+.}, at: [<ffffffff8123a3e7>] simple_attr_write+0x37/0xe0 [ 309.661495] #3: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0091b4a>] i915_drop_caches_set+0x3a/0x150 [i915] [ 309.661540] the shortest dependencies between 2nd lock and 1st lock: [ 309.661547] -> (cpu_hotplug.lock){+.+.+.} ops: 829 { [ 309.661553] HARDIRQ-ON-W at: [ 309.661560] __lock_acquire+0x5e5/0x1b50 [ 309.661565] lock_acquire+0xc9/0x220 [ 309.661572] __mutex_lock+0x6e/0x990 [ 309.661576] mutex_lock_nested+0x16/0x20 [ 309.661583] get_online_cpus+0x61/0x80 [ 309.661590] kmem_cache_create+0x25/0x1d0 [ 309.661596] debug_objects_mem_init+0x30/0x249 [ 309.661602] start_kernel+0x341/0x3fe [ 309.661607] x86_64_start_reservations+0x2a/0x2c [ 309.661612] x86_64_start_kernel+0x173/0x186 [ 309.661619] verify_cpu+0x0/0xfc [ 309.661622] SOFTIRQ-ON-W at: [ 309.661627] __lock_acquire+0x611/0x1b50 [ 309.661632] lock_acquire+0xc9/0x220 [ 309.661636] __mutex_lock+0x6e/0x990 [ 309.661641] mutex_lock_nested+0x16/0x20 [ 309.661646] get_online_cpus+0x61/0x80 [ 309.661650] kmem_cache_create+0x25/0x1d0 [ 309.661655] debug_objects_mem_init+0x30/0x249 [ 309.661660] start_kernel+0x341/0x3fe [ 309.661664] x86_64_start_reservations+0x2a/0x2c [ 309.661669] x86_64_start_kernel+0x173/0x186 [ 309.661674] verify_cpu+0x0/0xfc [ 309.661677] RECLAIM_FS-ON-W at: [ 309.661682] mark_held_locks+0x6f/0xa0 [ 309.661687] lockdep_trace_alloc+0xb3/0x100 [ 309.661693] kmem_cache_alloc_trace+0x31/0x2e0 [ 309.661699] __smpboot_create_thread.part.1+0x27/0xe0 [ 309.661704] smpboot_create_threads+0x61/0x90 [ 309.661709] cpuhp_invoke_callback+0x9c/0x8a0 [ 309.661713] cpuhp_up_callbacks+0x31/0xb0 [ 309.661718] _cpu_up+0x7a/0xc0 [ 309.661723] do_cpu_up+0x5f/0x80 [ 309.661727] cpu_up+0xe/0x10 [ 309.661734] smp_init+0x71/0xb3 [ 309.661738] kernel_init_freeable+0x94/0x19e [ 309.661743] kernel_init+0x9/0xf0 [ 309.661748] ret_from_fork+0x2e/0x40 [ 309.661752] INITIAL USE at: [ 309.661757] __lock_acquire+0x234/0x1b50 [ 309.661761] lock_acquire+0xc9/0x220 [ 309.661766] __mutex_lock+0x6e/0x990 [ 309.661771] mutex_lock_nested+0x16/0x20 [ 309.661775] get_online_cpus+0x61/0x80 [ 309.661780] __cpuhp_setup_state+0x44/0x170 [ 309.661785] page_alloc_init+0x23/0x3a [ 309.661790] start_kernel+0x124/0x3fe [ 309.661794] x86_64_start_reservations+0x2a/0x2c [ 309.661799] x86_64_start_kernel+0x173/0x186 [ 309.661804] verify_cpu+0x0/0xfc [ 309.661807] } [ 309.661813] ... key at: [<ffffffff81e37690>] cpu_hotplug+0xb0/0x100 [ 309.661817] ... acquired at: [ 309.661821] lock_acquire+0xc9/0x220 [ 309.661825] __mutex_lock+0x6e/0x990 [ 309.661829] mutex_lock_nested+0x16/0x20 [ 309.661833] get_online_cpus+0x61/0x80 [ 309.661837] _rcu_barrier+0x9f/0x160 [ 309.661841] rcu_barrier+0x10/0x20 [ 309.661847] netdev_run_todo+0x5f/0x310 [ 309.661852] rtnl_unlock+0x9/0x10 [ 309.661856] default_device_exit_batch+0x133/0x150 [ 309.661862] ops_exit_list.isra.0+0x4d/0x60 [ 309.661866] cleanup_net+0x1d8/0x2c0 [ 309.661872] process_one_work+0x1f4/0x6d0 [ 309.661876] worker_thread+0x49/0x4a0 [ 309.661881] kthread+0x107/0x140 [ 309.661884] ret_from_fork+0x2e/0x40 [ 309.661890] -> (rcu_preempt_state.barrier_mutex){+.+.-.} ops: 179 { [ 309.661896] HARDIRQ-ON-W at: [ 309.661901] __lock_acquire+0x5e5/0x1b50 [ 309.661905] lock_acquire+0xc9/0x220 [ 309.661910] __mutex_lock+0x6e/0x990 [ 309.661914] mutex_lock_nested+0x16/0x20 [ 309.661919] _rcu_barrier+0x31/0x160 [ 309.661923] rcu_barrier+0x10/0x20 [ 309.661928] netdev_run_todo+0x5f/0x310 [ 309.661932] rtnl_unlock+0x9/0x10 [ 309.661936] default_device_exit_batch+0x133/0x150 [ 309.661941] ops_exit_list.isra.0+0x4d/0x60 [ 309.661946] cleanup_net+0x1d8/0x2c0 [ 309.661951] process_one_work+0x1f4/0x6d0 [ 309.661955] worker_thread+0x49/0x4a0 [ 309.661960] kthread+0x107/0x140 [ 309.661964] ret_from_fork+0x2e/0x40 [ 309.661968] SOFTIRQ-ON-W at: [ 309.661972] __lock_acquire+0x611/0x1b50 [ 309.661977] lock_acquire+0xc9/0x220 [ 309.661981] __mutex_lock+0x6e/0x990 [ 309.661986] mutex_lock_nested+0x16/0x20 [ 309.661990] _rcu_barrier+0x31/0x160 [ 309.661995] rcu_barrier+0x10/0x20 [ 309.661999] netdev_run_todo+0x5f/0x310 [ 309.662003] rtnl_unlock+0x9/0x10 [ 309.662008] default_device_exit_batch+0x133/0x150 [ 309.662013] ops_exit_list.isra.0+0x4d/0x60 [ 309.662017] cleanup_net+0x1d8/0x2c0 [ 309.662022] process_one_work+0x1f4/0x6d0 [ 309.662027] worker_thread+0x49/0x4a0 [ 309.662031] kthread+0x107/0x140 [ 309.662035] ret_from_fork+0x2e/0x40 [ 309.662039] IN-RECLAIM_FS-W at: [ 309.662043] __lock_acquire+0x638/0x1b50 [ 309.662048] lock_acquire+0xc9/0x220 [ 309.662053] __mutex_lock+0x6e/0x990 [ 309.662058] mutex_lock_nested+0x16/0x20 [ 309.662062] _rcu_barrier+0x31/0x160 [ 309.662067] rcu_barrier+0x10/0x20 [ 309.662089] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662109] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662114] simple_attr_write+0xc7/0xe0 [ 309.662119] full_proxy_write+0x4f/0x70 [ 309.662124] __vfs_write+0x23/0x120 [ 309.662128] vfs_write+0xc6/0x1f0 [ 309.662133] SyS_write+0x44/0xb0 [ 309.662138] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662142] INITIAL USE at: [ 309.662147] __lock_acquire+0x234/0x1b50 [ 309.662151] lock_acquire+0xc9/0x220 [ 309.662156] __mutex_lock+0x6e/0x990 [ 309.662160] mutex_lock_nested+0x16/0x20 [ 309.662165] _rcu_barrier+0x31/0x160 [ 309.662169] rcu_barrier+0x10/0x20 [ 309.662174] netdev_run_todo+0x5f/0x310 [ 309.662178] rtnl_unlock+0x9/0x10 [ 309.662183] default_device_exit_batch+0x133/0x150 [ 309.662188] ops_exit_list.isra.0+0x4d/0x60 [ 309.662192] cleanup_net+0x1d8/0x2c0 [ 309.662197] process_one_work+0x1f4/0x6d0 [ 309.662202] worker_thread+0x49/0x4a0 [ 309.662206] kthread+0x107/0x140 [ 309.662210] ret_from_fork+0x2e/0x40 [ 309.662214] } [ 309.662220] ... key at: [<ffffffff81e4e1c8>] rcu_preempt_state+0x508/0x780 [ 309.662225] ... acquired at: [ 309.662229] check_usage_forwards+0x12b/0x130 [ 309.662233] mark_lock+0x360/0x6f0 [ 309.662237] __lock_acquire+0x638/0x1b50 [ 309.662241] lock_acquire+0xc9/0x220 [ 309.662245] __mutex_lock+0x6e/0x990 [ 309.662249] mutex_lock_nested+0x16/0x20 [ 309.662253] _rcu_barrier+0x31/0x160 [ 309.662257] rcu_barrier+0x10/0x20 [ 309.662279] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662298] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662303] simple_attr_write+0xc7/0xe0 [ 309.662307] full_proxy_write+0x4f/0x70 [ 309.662311] __vfs_write+0x23/0x120 [ 309.662315] vfs_write+0xc6/0x1f0 [ 309.662319] SyS_write+0x44/0xb0 [ 309.662323] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662329] stack backtrace: [ 309.662335] CPU: 1 PID: 6435 Comm: gem_exec_gttfil Tainted: G W 4.11.0-rc1-CI-CI_DRM_2333+ #1 [ 309.662342] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011 [ 309.662348] Call Trace: [ 309.662354] dump_stack+0x67/0x92 [ 309.662359] print_irq_inversion_bug.part.19+0x1a4/0x1b0 [ 309.662365] check_usage_forwards+0x12b/0x130 [ 309.662369] mark_lock+0x360/0x6f0 [ 309.662374] ? print_shortest_lock_dependencies+0x1a0/0x1a0 [ 309.662379] __lock_acquire+0x638/0x1b50 [ 309.662383] ? __mutex_unlock_slowpath+0x3e/0x2e0 [ 309.662388] ? trace_hardirqs_on+0xd/0x10 [ 309.662392] ? _rcu_barrier+0x31/0x160 [ 309.662396] lock_acquire+0xc9/0x220 [ 309.662400] ? _rcu_barrier+0x31/0x160 [ 309.662404] ? _rcu_barrier+0x31/0x160 [ 309.662409] __mutex_lock+0x6e/0x990 [ 309.662412] ? _rcu_barrier+0x31/0x160 [ 309.662416] ? _rcu_barrier+0x31/0x160 [ 309.662421] ? synchronize_rcu_expedited+0x35/0xb0 [ 309.662426] ? _raw_spin_unlock_irqrestore+0x52/0x60 [ 309.662434] mutex_lock_nested+0x16/0x20 [ 309.662438] _rcu_barrier+0x31/0x160 [ 309.662442] rcu_barrier+0x10/0x20 [ 309.662464] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662484] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662489] simple_attr_write+0xc7/0xe0 [ 309.662494] full_proxy_write+0x4f/0x70 [ 309.662498] __vfs_write+0x23/0x120 [ 309.662503] ? rcu_read_lock_sched_held+0x75/0x80 [ 309.662507] ? rcu_sync_lockdep_assert+0x2a/0x50 [ 309.662512] ? __sb_start_write+0x102/0x210 [ 309.662516] ? vfs_write+0x17d/0x1f0 [ 309.662520] vfs_write+0xc6/0x1f0 [ 309.662524] ? trace_hardirqs_on_caller+0xe7/0x200 [ 309.662529] SyS_write+0x44/0xb0 [ 309.662533] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662537] RIP: 0033:0x7f507eac24a0 [ 309.662541] RSP: 002b:00007fffda8720e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 309.662548] RAX: ffffffffffffffda RBX: ffffffff81482bd3 RCX: 00007f507eac24a0 [ 309.662552] RDX: 0000000000000005 RSI: 00007fffda8720f0 RDI: 0000000000000005 [ 309.662557] RBP: ffffc9000048bf88 R08: 0000000000000000 R09: 000000000000002c [ 309.662561] R10: 0000000000000014 R11: 0000000000000246 R12: 00007fffda872230 [ 309.662566] R13: 00007fffda872228 R14: 0000000000000201 R15: 00007fffda8720f0 [ 309.662572] ? __this_cpu_preempt_check+0x13/0x20 Fixes: 0eafec6 ("drm/i915: Enable lockless lookup of request tracking via RCU") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100192 Signed-off-by: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: <[email protected]> # v4.9+ Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Daniel Vetter <[email protected]> (cherry picked from commit bd784b7) Signed-off-by: Jani Nikula <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]

commit 57b0e31 upstream. We need to check the return value of match_token() for Opt_err before doing anything with it. [ Not only did the old "-1" value for Opt_err cause problems for the __test_and_set_bit(), as fixed in commit 94c13f6 ("security: don't use a negative Opt_err token index"), but accessing "args[0].from" is invalid for the Opt_err case, as pointed out by Eric later. - Linus ] Reported-by: [email protected] Fixes: 00d60fd ("KEYS: Provide keyctls to drive the new key type ops for asymmetric keys [ver #2]") Signed-off-by: Eric Biggers <[email protected]> Cc: [email protected] # 4.20 Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…er shutdown commit 60a161b upstream. Suppose adapter (open) recovery is between opened QDIO queues and before (the end of) initial posting of status read buffers (SRBs). This time window can be seconds long due to FSF_PROT_HOST_CONNECTION_INITIALIZING causing by design looping with exponential increase sleeps in the function performing exchange config data during recovery [zfcp_erp_adapter_strat_fsf_xconf()]. Recovery triggered by local link up. Suppose an event occurs for which the FCP channel would send an unsolicited notification to zfcp by means of a previously posted SRB. We saw it with local cable pull (link down) in multi-initiator zoning with multiple NPIV-enabled subchannels of the same shared FCP channel. As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial status read buffers from within the adapter's ERP thread, the channel does send an unsolicited notification. Since v2.6.27 commit d26ab06 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules adapter->stat_work to re-fill the just consumed SRB from a work item. Now the ERP thread and the work item post SRBs in parallel. Both contexts call the helper function zfcp_status_read_refill(). The tracking of missing (to be posted / re-filled) SRBs is not thread-safe due to separate atomic_read() and atomic_dec(), in order to depend on posting success. Hence, both contexts can see atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts one too many SRB. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue (trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in zfcp_qdio_handler_error(). An obvious and seemingly clean fix would be to schedule stat_work from the ERP thread and wait for it to finish. This would serialize all SRB re-fills. However, we already have another work item wait on the ERP thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the open port recoveries during zfcp auto port scan, but in fact it waits for any pending recovery including an adapter recovery. This approach leads to a deadlock. [see also v3.19 commit 18f87a6 ("zfcp: auto port scan resiliency"); v2.6.37 commit d3e1088 ("[SCSI] zfcp: No ERP escalation on gpn_ft eval"); v2.6.28 commit fca55b6 ("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP") fixing v2.6.27 commit c57a39a ("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port"); v2.6.27 commit cc8c282 ("[SCSI] zfcp: Automatically attach remote ports")] Instead make the accounting of missing SRBs atomic for parallel execution in both the ERP thread and adapter->stat_work. Signed-off-by: Steffen Maier <[email protected]> Fixes: d26ab06 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall") Cc: <[email protected]> #2.6.27+ Reviewed-by: Jens Remus <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8b66fee upstream. syzbot reports following splat: BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x457ec9 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4 R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2759 [inline] __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383 __kmalloc_reserve net/core/skbuff.c:137 [inline] __alloc_skb+0x309/0xa20 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:998 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline] netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 The uninitialised access happened in tipc_nl_compat_link_reset_stats: nla_put_string(skb, TIPC_NLA_LINK_NAME, name) This is because name string is not validated before it's used. Reported-by: [email protected] Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit edf5ff0 upstream. syzbot reports following splat: BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline] tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 The uninitialised access happened in nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name) This is because lc->name string is not validated before it's used. Reported-by: [email protected] Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit d8797b1 ] __install_bp_hardening_cb() is called via stop_machine() as part of the cpu_enable callback. To force each CPU to take its turn when allocating slots, they take a spinlock. With the RT patches applied, the spinlock becomes a mutex, and we get warnings about sleeping while in stop_machine(): | [ 0.319176] CPU features: detected: RAS Extension Support | [ 0.319950] BUG: scheduling while atomic: migration/3/36/0x00000002 | [ 0.319955] Modules linked in: | [ 0.319958] Preemption disabled at: | [ 0.319969] [<ffff000008181ae4>] cpu_stopper_thread+0x7c/0x108 | [ 0.319973] CPU: 3 PID: 36 Comm: migration/3 Not tainted 4.19.1-rt3-00250-g330fc2c2a880 #2 | [ 0.319975] Hardware name: linux,dummy-virt (DT) | [ 0.319976] Call trace: | [ 0.319981] dump_backtrace+0x0/0x148 | [ 0.319983] show_stack+0x14/0x20 | [ 0.319987] dump_stack+0x80/0xa4 | [ 0.319989] __schedule_bug+0x94/0xb0 | [ 0.319991] __schedule+0x510/0x560 | [ 0.319992] schedule+0x38/0xe8 | [ 0.319994] rt_spin_lock_slowlock_locked+0xf0/0x278 | [ 0.319996] rt_spin_lock_slowlock+0x5c/0x90 | [ 0.319998] rt_spin_lock+0x54/0x58 | [ 0.320000] enable_smccc_arch_workaround_1+0xdc/0x260 | [ 0.320001] __enable_cpu_capability+0x10/0x20 | [ 0.320003] multi_cpu_stop+0x84/0x108 | [ 0.320004] cpu_stopper_thread+0x84/0x108 | [ 0.320008] smpboot_thread_fn+0x1e8/0x2b0 | [ 0.320009] kthread+0x124/0x128 | [ 0.320010] ret_from_fork+0x10/0x18 Switch this to a raw spinlock, as we know this is only called with IRQs masked. Signed-off-by: James Morse <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 5a86d68 ] When network namespace is destroyed, cleanup_net() is called. cleanup_net() holds pernet_ops_rwsem then calls each ->exit callback. So that clusterip_tg_destroy() is called by cleanup_net(). And clusterip_tg_destroy() calls unregister_netdevice_notifier(). But both cleanup_net() and clusterip_tg_destroy() hold same lock(pernet_ops_rwsem). hence deadlock occurrs. After this patch, only 1 notifier is registered when module is inserted. And all of configs are added to per-net list. test commands: %ip netns add vm1 %ip netns exec vm1 bash %ip link set lo up %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \ -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1 %exit %ip netns del vm1 splat looks like: [ 341.809674] ============================================ [ 341.809674] WARNING: possible recursive locking detected [ 341.809674] 4.19.0-rc5+ MIPS#16 Tainted: G W [ 341.809674] -------------------------------------------- [ 341.809674] kworker/u4:2/87 is trying to acquire lock: [ 341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460 [ 341.809674] [ 341.809674] but task is already holding lock: [ 341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900 [ 341.809674] [ 341.809674] other info that might help us debug this: [ 341.809674] Possible unsafe locking scenario: [ 341.809674] [ 341.809674] CPU0 [ 341.809674] ---- [ 341.809674] lock(pernet_ops_rwsem); [ 341.809674] lock(pernet_ops_rwsem); [ 341.809674] [ 341.809674] *** DEADLOCK *** [ 341.809674] [ 341.809674] May be due to missing lock nesting notation [ 341.809674] [ 341.809674] 3 locks held by kworker/u4:2/87: [ 341.809674] #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0 [ 341.809674] #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0 [ 341.809674] #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900 [ 341.809674] [ 341.809674] stack backtrace: [ 341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G W 4.19.0-rc5+ MIPS#16 [ 341.809674] Workqueue: netns cleanup_net [ 341.809674] Call Trace: [ ... ] [ 342.070196] down_write+0x93/0x160 [ 342.070196] ? unregister_netdevice_notifier+0x8c/0x460 [ 342.070196] ? down_read+0x1e0/0x1e0 [ 342.070196] ? sched_clock_cpu+0x126/0x170 [ 342.070196] ? find_held_lock+0x39/0x1c0 [ 342.070196] unregister_netdevice_notifier+0x8c/0x460 [ 342.070196] ? register_netdevice_notifier+0x790/0x790 [ 342.070196] ? __local_bh_enable_ip+0xe9/0x1b0 [ 342.070196] ? __local_bh_enable_ip+0xe9/0x1b0 [ 342.070196] ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP] [ 342.070196] ? trace_hardirqs_on+0x93/0x210 [ 342.070196] ? __bpf_trace_preemptirq_template+0x10/0x10 [ 342.070196] ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP] [ 342.123094] clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP] [ 342.123094] ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP] [ 342.123094] ? cleanup_match+0x17d/0x200 [ip_tables] [ 342.123094] ? xt_unregister_table+0x215/0x300 [x_tables] [ 342.123094] ? kfree+0xe2/0x2a0 [ 342.123094] cleanup_entry+0x1d5/0x2f0 [ip_tables] [ 342.123094] ? cleanup_match+0x200/0x200 [ip_tables] [ 342.123094] __ipt_unregister_table+0x9b/0x1a0 [ip_tables] [ 342.123094] iptable_filter_net_exit+0x43/0x80 [iptable_filter] [ 342.123094] ops_exit_list.isra.10+0x94/0x140 [ 342.123094] cleanup_net+0x45b/0x900 [ ... ] Fixes: 202f59a ("netfilter: ipt_CLUSTERIP: do not hold dev") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4f4b374 ] This is the much more correct fix for my earlier attempt at: https://lkml.org/lkml/2018/12/10/118 Short recap: - There's not actually a locking issue, it's just lockdep being a bit too eager to complain about a possible deadlock. - Contrary to what I claimed the real problem is recursion on kn->count. Greg pointed me at sysfs_break_active_protection(), used by the scsi subsystem to allow a sysfs file to unbind itself. That would be a real deadlock, which isn't what's happening here. Also, breaking the active protection means we'd need to manually handle all the lifetime fun. - With Rafael we discussed the task_work approach, which kinda works, but has two downsides: It's a functional change for a lockdep annotation issue, and it won't work for the bind file (which needs to get the errno from the driver load function back to userspace). - Greg also asked why this never showed up: To hit this you need to unregister a 2nd driver from the unload code of your first driver. I guess only gpus do that. The bug has always been there, but only with a recent patch series did we add more locks so that lockdep built a chain from unbinding the snd-hda driver to the acpi_video_unregister call. Full lockdep splat: [12301.898799] ============================================ [12301.898805] WARNING: possible recursive locking detected [12301.898811] 4.20.0-rc7+ MIPS#84 Not tainted [12301.898815] -------------------------------------------- [12301.898821] bash/5297 is trying to acquire lock: [12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80 [12301.898841] but task is already holding lock: [12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898856] other info that might help us debug this: [12301.898862] Possible unsafe locking scenario: [12301.898867] CPU0 [12301.898870] ---- [12301.898874] lock(kn->count#39); [12301.898879] lock(kn->count#39); [12301.898883] *** DEADLOCK *** [12301.898891] May be due to missing lock nesting notation [12301.898899] 5 locks held by bash/5297: [12301.898903] #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0 [12301.898915] #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190 [12301.898925] #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898936] #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240 [12301.898950] #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40 [12301.898960] stack backtrace: [12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ MIPS#84 [12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011 [12301.898982] Call Trace: [12301.898989] dump_stack+0x67/0x9b [12301.898997] __lock_acquire+0x6ad/0x1410 [12301.899003] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899010] ? find_held_lock+0x2d/0x90 [12301.899017] ? mutex_spin_on_owner+0xe4/0x150 [12301.899023] ? find_held_lock+0x2d/0x90 [12301.899030] ? lock_acquire+0x90/0x180 [12301.899036] lock_acquire+0x90/0x180 [12301.899042] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899049] __kernfs_remove+0x296/0x310 [12301.899055] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899060] ? kernfs_name_hash+0xd/0x80 [12301.899066] ? kernfs_find_ns+0x6c/0x100 [12301.899073] kernfs_remove_by_name_ns+0x3b/0x80 [12301.899080] bus_remove_driver+0x92/0xa0 [12301.899085] acpi_video_unregister+0x24/0x40 [12301.899127] i915_driver_unload+0x42/0x130 [i915] [12301.899160] i915_pci_remove+0x19/0x30 [i915] [12301.899169] pci_device_remove+0x36/0xb0 [12301.899176] device_release_driver_internal+0x185/0x240 [12301.899183] unbind_store+0xaf/0x180 [12301.899189] kernfs_fop_write+0x104/0x190 [12301.899195] __vfs_write+0x31/0x180 [12301.899203] ? rcu_read_lock_sched_held+0x6f/0x80 [12301.899209] ? rcu_sync_lockdep_assert+0x29/0x50 [12301.899216] ? __sb_start_write+0x13c/0x1a0 [12301.899221] ? vfs_write+0x17f/0x1b0 [12301.899227] vfs_write+0xb9/0x1b0 [12301.899233] ksys_write+0x50/0xc0 [12301.899239] do_syscall_64+0x4b/0x180 [12301.899247] entry_SYSCALL_64_after_hwframe+0x49/0xbe [12301.899253] RIP: 0033:0x7f452ac7f7a4 [12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83 [12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4 [12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001 [12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730 [12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d [12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d Looking around I've noticed that usb and i2c already handle similar recursion problems, where a sysfs file can unbind the same type of sysfs somewhere else in the hierarchy. Relevant commits are: commit 356c05d Author: Alan Stern <[email protected]> Date: Mon May 14 13:30:03 2012 -0400 sysfs: get rid of some lockdep false positives commit e9b526f Author: Alexander Sverdlin <[email protected]> Date: Fri May 17 14:56:35 2013 +0200 i2c: suppress lockdep warning on delete_device Implement the same trick for driver bind/unbind. v2: Put the macro into bus.c (Greg). Reviewed-by: Rafael J. Wysocki <[email protected]> Cc: Ramalingam C <[email protected]> Cc: Arend van Spriel <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Bartosz Golaszewski <[email protected]> Cc: Heikki Krogerus <[email protected]> Cc: Vivek Gautam <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 532e1e5 ] mount.ocfs2 ignore the inconsistent error that journal is clean but local alloc is unrecovered. After mount, local alloc not empty, then reserver cluster didn't alloc a new local alloc window, reserveration map is empty(ocfs2_reservation_map.m_bitmap_len = 0), that triggered the following panic. This issue was reported at https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html and was advised to fixed during mount. But this is a very unusual inconsistent state, usually journal dirty flag should be cleared at the last stage of umount until every other things go right. We may need do further debug to check that. Any way to avoid possible futher corruption, mount should be abort and fsck should be run. (mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered! found = 6518, set = 6518, taken = 8192, off = 15912372 ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode. o2dlm: Joining domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode. o2hb: Region 89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777 o2dlm: Node 7 joins domain 64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes o2dlm: Node 7 joins domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes ------------[ cut here ]------------ kernel BUG at fs/ocfs2/reservations.c:507! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018 task: ffff8803fb04e200 ti: ffff8800ea4d8000 task.ti: ffff8800ea4d8000 RIP: 0010:[<ffffffffa05e96a8>] [<ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2] Call Trace: ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2] ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2] __ocfs2_claim_clusters+0x178/0x360 [ocfs2] ocfs2_claim_clusters+0x1f/0x30 [ocfs2] ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2] ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2] ocfs2_write_begin+0x13e/0x230 [ocfs2] generic_perform_write+0xbf/0x1c0 __generic_file_write_iter+0x19c/0x1d0 ocfs2_file_write_iter+0x589/0x1360 [ocfs2] __vfs_write+0xb8/0x110 vfs_write+0xa9/0x1b0 SyS_write+0x46/0xb0 system_call_fastpath+0x18/0xd7 Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85 RIP __ocfs2_resv_find_window+0x498/0x760 [ocfs2] RSP <ffff8800ea4db668> ---[ end trace 566f07529f2edf3c ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Yiwen Jiang <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Jun Piao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Changwei Ge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 427c5ce ] As of v4.20, the swim3 driver crashes when loaded on a PowerBook G3 (Wallstreet). MacIO PCI driver attached to Gatwick chipset MacIO PCI driver attached to Heathrow chipset swim3 0.00015000:floppy: [fd0] SWIM3 floppy controller in media bay 0.00013020:ch-a: ttyS0 at MMIO 0xf3013020 (irq = 16, base_baud = 230400) is a Z85c30 ESCC - Serial port 0.00013000:ch-b: ttyS1 at MMIO 0xf3013000 (irq = 17, base_baud = 230400) is a Z85c30 ESCC - Infrared port macio: fixed media-bay irq on gatwick macio: fixed left floppy irqs swim3 1.00015000:floppy: [fd1] Couldn't request interrupt Unable to handle kernel paging request for data at address 0x00000024 Faulting instruction address: 0xc02652f8 Oops: Kernel access of bad area, sig: 11 [#1] BE SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0 #2 NIP: c02652f8 LR: c026915c CTR: c0276d1c REGS: df43ba10 TRAP: 0300 Not tainted (4.20.0) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28228288 XER: 00000100 DAR: 00000024 DSISR: 40000000 GPR00: c026915c df43bac0 df439060 c0731524 df494700 00000000 c06e1c08 00000001 GPR08: 00000001 00000000 df5ff220 00001032 28228282 00000000 c0004ca4 00000000 GPR16: 00000000 00000000 00000000 c073144c dfffe064 c0731524 00000120 c0586108 GPR24: c073132c c073143c c073143c 00000000 c0731524 df67cd70 df494700 00000001 NIP [c02652f8] blk_mq_free_rqs+0x28/0xf8 LR [c026915c] blk_mq_sched_tags_teardown+0x58/0x84 Call Trace: [df43bac0] [c0045f50] flush_workqueue_prep_pwqs+0x178/0x1c4 (unreliable) [df43bae0] [c026915c] blk_mq_sched_tags_teardown+0x58/0x84 [df43bb00] [c02697f0] blk_mq_exit_sched+0x9c/0xb8 [df43bb20] [c0252794] elevator_exit+0x84/0xa4 [df43bb40] [c0256538] blk_exit_queue+0x30/0x50 [df43bb50] [c0256640] blk_cleanup_queue+0xe8/0x184 [df43bb70] [c034732c] swim3_attach+0x330/0x5f0 [df43bbb0] [c034fb24] macio_device_probe+0x58/0xec [df43bbd0] [c032ba88] really_probe+0x1e4/0x2f4 [df43bc00] [c032bd28] driver_probe_device+0x64/0x204 [df43bc20] [c0329ac4] bus_for_each_drv+0x60/0xac [df43bc50] [c032b824] __device_attach+0xe8/0x160 [df43bc80] [c032ab38] bus_probe_device+0xa0/0xbc [df43bca0] [c0327338] device_add+0x3d8/0x630 [df43bcf0] [c0350848] macio_add_one_device+0x444/0x48c [df43bd50] [c03509f8] macio_pci_add_devices+0x168/0x1bc [df43bd90] [c03500ec] macio_pci_probe+0xc0/0x10c [df43bda0] [c02ad884] pci_device_probe+0xd4/0x184 [df43bdd0] [c032ba88] really_probe+0x1e4/0x2f4 [df43be00] [c032bd28] driver_probe_device+0x64/0x204 [df43be20] [c032bfcc] __driver_attach+0x104/0x108 [df43be40] [c0329a00] bus_for_each_dev+0x64/0xb4 [df43be70] [c032add8] bus_add_driver+0x154/0x238 [df43be90] [c032ca24] driver_register+0x84/0x148 [df43bea0] [c0004aa0] do_one_initcall+0x40/0x188 [df43bf00] [c0690100] kernel_init_freeable+0x138/0x1d4 [df43bf30] [c0004cbc] kernel_init+0x18/0x10c [df43bf40] [c00121e4] ret_from_kernel_thread+0x14/0x1c Instruction dump: 5484d97e 4bfff4f4 9421ffe0 7c0802a6 bf410008 7c9e2378 90010024 8124005c 2f890000 419e0078 81230004 7c7c1b78 <81290024> 2f890000 419e0064 81440000 ---[ end trace 12025ab921a9784c ]--- Reverting commit 8ccb8cb ("swim3: convert to blk-mq") resolves the problem. That commit added a struct blk_mq_tag_set to struct floppy_state and initialized it with a blk_mq_init_sq_queue() call. Unfortunately, there is a memset() in swim3_add_device() that subsequently clears the floppy_state struct. That means fs->tag_set->ops is a NULL pointer, and it gets dereferenced by blk_mq_free_rqs() which gets called in the request_irq() error path. Move the memset() to fix this bug. BTW, the request_irq() failure for the left mediabay floppy (fd1) is not a regression. I don't know why it happens. The right media bay floppy (fd0) works fine however. Reported-and-tested-by: Stan Johnson <[email protected]> Fixes: 8ccb8cb ("swim3: convert to blk-mq") Cc: [email protected] Signed-off-by: Finn Thain <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 6347244 ] Since __sanitizer_cov_trace_const_cmp4 is marked as notrace, the function called from __sanitizer_cov_trace_const_cmp4 shouldn't be traceable either. ftrace_graph_caller() gets called every time func write_comp_data() gets called if it isn't marked 'notrace'. This is the backtrace from gdb: #0 ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179 #1 0xffffff8010201920 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151 #2 0xffffff8010439714 in write_comp_data (type=5, arg1=0, arg2=0, ip=18446743524224276596) at ../kernel/kcov.c:116 #3 0xffffff8010439894 in __sanitizer_cov_trace_const_cmp4 (arg1=<optimized out>, arg2=<optimized out>) at ../kernel/kcov.c:188 #4 0xffffff8010201874 in prepare_ftrace_return (self_addr=18446743524226602768, parent=0xffffff801014b918, frame_pointer=18446743524223531344) at ./include/generated/atomic-instrumented.h:27 #5 0xffffff801020194c in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182 Rework so that write_comp_data() that are called from __sanitizer_cov_trace_*_cmp*() are marked as 'notrace'. Commit 903e8ff ("kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace") missed to mark write_comp_data() as 'notrace'. When that patch was created gcc-7 was used. In lib/Kconfig.debug config KCOV_ENABLE_COMPARISONS depends on $(cc-option,-fsanitize-coverage=trace-cmp) That code path isn't hit with gcc-7. However, it were that with gcc-8. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anders Roxell <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Co-developed-by: Arnd Bergmann <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit bb61b84 upstream. Presently when an error is encountered during probe of the cxlflash adapter, a deadlock is seen with cpu thread stuck inside cxlflash_remove(). Below is the trace of the deadlock as logged by khungtaskd: cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16 INFO: task kworker/80:1:890 blocked for more than 120 seconds. Not tainted 5.0.0-rc4-capi2-kexec+ #2 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/80:1 D 0 890 2 0x00000808 Workqueue: events work_for_cpu_fn Call Trace: 0x4d72136320 (unreliable) __switch_to+0x2cc/0x460 __schedule+0x2bc/0xac0 schedule+0x40/0xb0 cxlflash_remove+0xec/0x640 [cxlflash] cxlflash_probe+0x370/0x8f0 [cxlflash] local_pci_probe+0x6c/0x140 work_for_cpu_fn+0x38/0x60 process_one_work+0x260/0x530 worker_thread+0x280/0x5d0 kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x80 INFO: task systemd-udevd:5160 blocked for more than 120 seconds. The deadlock occurs as cxlflash_remove() is called from cxlflash_probe() without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never successfully probed the 'cxlflash_cfg->state' never changes from STATE_PROBING hence the deadlock occurs. We fix this deadlock by setting the variable 'cxlflash_cfg->state' to STATE_PROBED in case an error occurs during cxlflash_probe() and just before calling cxlflash_remove(). Cc: [email protected] Fixes: c21e0bb("cxlflash: Base support for IBM CXL Flash Adapter") Signed-off-by: Vaibhav Jain <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b284909 upstream. With the following commit: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") ... the hotplug code attempted to detect when SMT was disabled by BIOS, in which case it reported SMT as permanently disabled. However, that code broke a virt hotplug scenario, where the guest is booted with only primary CPU threads, and a sibling is brought online later. The problem is that there doesn't seem to be a way to reliably distinguish between the HW "SMT disabled by BIOS" case and the virt "sibling not yet brought online" case. So the above-mentioned commit was a bit misguided, as it permanently disabled SMT for both cases, preventing future virt sibling hotplugs. Going back and reviewing the original problems which were attempted to be solved by that commit, when SMT was disabled in BIOS: 1) /sys/devices/system/cpu/smt/control showed "on" instead of "notsupported"; and 2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning. I'd propose that we instead consider #1 above to not actually be a problem. Because, at least in the virt case, it's possible that SMT wasn't disabled by BIOS and a sibling thread could be brought online later. So it makes sense to just always default the smt control to "on" to allow for that possibility (assuming cpuid indicates that the CPU supports SMT). The real problem is #2, which has a simple fix: change vmx_vm_init() to query the actual current SMT state -- i.e., whether any siblings are currently online -- instead of looking at the SMT "control" sysfs value. So fix it by: a) reverting the original "fix" and its followup fix: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") bc2d8d2 ("cpu/hotplug: Fix SMT supported evaluation") and b) changing vmx_vm_init() to query the actual current SMT state -- instead of the sysfs control value -- to determine whether the L1TF warning is needed. This also requires the 'sched_smt_present' variable to exported, instead of 'cpu_smt_control'. Fixes: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Igor Mammedov <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joe Mario <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman <[email protected]>

On ESP output, sk_wmem_alloc is incremented for the added padding if a socket is associated to the skb. When replying with TCP SYNACKs over IPsec, the associated sk is a casted request socket, only. Increasing sk_wmem_alloc on a request socket results in a write at an arbitrary struct offset. In the best case, this produces the following WARNING: WARNING: CPU: 1 PID: 0 at lib/refcount.c:102 esp_output_head+0x2e4/0x308 [esp4] refcount_t: addition on 0; use-after-free. CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc3 #2 Hardware name: Marvell Armada 380/385 (Device Tree) [...] [<bf0ff354>] (esp_output_head [esp4]) from [<bf1006a4>] (esp_output+0xb8/0x180 [esp4]) [<bf1006a4>] (esp_output [esp4]) from [<c05dee64>] (xfrm_output_resume+0x558/0x664) [<c05dee64>] (xfrm_output_resume) from [<c05d07b0>] (xfrm4_output+0x44/0xc4) [<c05d07b0>] (xfrm4_output) from [<c05956bc>] (tcp_v4_send_synack+0xa8/0xe8) [<c05956bc>] (tcp_v4_send_synack) from [<c0586ad8>] (tcp_conn_request+0x7f4/0x948) [<c0586ad8>] (tcp_conn_request) from [<c058c404>] (tcp_rcv_state_process+0x2a0/0xe64) [<c058c404>] (tcp_rcv_state_process) from [<c05958ac>] (tcp_v4_do_rcv+0xf0/0x1f4) [<c05958ac>] (tcp_v4_do_rcv) from [<c0598a4c>] (tcp_v4_rcv+0xdb8/0xe20) [<c0598a4c>] (tcp_v4_rcv) from [<c056eb74>] (ip_protocol_deliver_rcu+0x2c/0x2dc) [<c056eb74>] (ip_protocol_deliver_rcu) from [<c056ee6c>] (ip_local_deliver_finish+0x48/0x54) [<c056ee6c>] (ip_local_deliver_finish) from [<c056eecc>] (ip_local_deliver+0x54/0xec) [<c056eecc>] (ip_local_deliver) from [<c056efac>] (ip_rcv+0x48/0xb8) [<c056efac>] (ip_rcv) from [<c0519c2c>] (__netif_receive_skb_one_core+0x50/0x6c) [...] The issue triggers only when not using TCP syncookies, as for syncookies no socket is associated. Fixes: cac2661 ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30 ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: Martin Willi <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>

This is the much more correct fix for my earlier attempt at: https://lkml.org/lkml/2018/12/10/118 Short recap: - There's not actually a locking issue, it's just lockdep being a bit too eager to complain about a possible deadlock. - Contrary to what I claimed the real problem is recursion on kn->count. Greg pointed me at sysfs_break_active_protection(), used by the scsi subsystem to allow a sysfs file to unbind itself. That would be a real deadlock, which isn't what's happening here. Also, breaking the active protection means we'd need to manually handle all the lifetime fun. - With Rafael we discussed the task_work approach, which kinda works, but has two downsides: It's a functional change for a lockdep annotation issue, and it won't work for the bind file (which needs to get the errno from the driver load function back to userspace). - Greg also asked why this never showed up: To hit this you need to unregister a 2nd driver from the unload code of your first driver. I guess only gpus do that. The bug has always been there, but only with a recent patch series did we add more locks so that lockdep built a chain from unbinding the snd-hda driver to the acpi_video_unregister call. Full lockdep splat: [12301.898799] ============================================ [12301.898805] WARNING: possible recursive locking detected [12301.898811] 4.20.0-rc7+ MIPS#84 Not tainted [12301.898815] -------------------------------------------- [12301.898821] bash/5297 is trying to acquire lock: [12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80 [12301.898841] but task is already holding lock: [12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898856] other info that might help us debug this: [12301.898862] Possible unsafe locking scenario: [12301.898867] CPU0 [12301.898870] ---- [12301.898874] lock(kn->count#39); [12301.898879] lock(kn->count#39); [12301.898883] *** DEADLOCK *** [12301.898891] May be due to missing lock nesting notation [12301.898899] 5 locks held by bash/5297: [12301.898903] #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0 [12301.898915] #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190 [12301.898925] #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898936] #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240 [12301.898950] #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40 [12301.898960] stack backtrace: [12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ MIPS#84 [12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011 [12301.898982] Call Trace: [12301.898989] dump_stack+0x67/0x9b [12301.898997] __lock_acquire+0x6ad/0x1410 [12301.899003] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899010] ? find_held_lock+0x2d/0x90 [12301.899017] ? mutex_spin_on_owner+0xe4/0x150 [12301.899023] ? find_held_lock+0x2d/0x90 [12301.899030] ? lock_acquire+0x90/0x180 [12301.899036] lock_acquire+0x90/0x180 [12301.899042] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899049] __kernfs_remove+0x296/0x310 [12301.899055] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899060] ? kernfs_name_hash+0xd/0x80 [12301.899066] ? kernfs_find_ns+0x6c/0x100 [12301.899073] kernfs_remove_by_name_ns+0x3b/0x80 [12301.899080] bus_remove_driver+0x92/0xa0 [12301.899085] acpi_video_unregister+0x24/0x40 [12301.899127] i915_driver_unload+0x42/0x130 [i915] [12301.899160] i915_pci_remove+0x19/0x30 [i915] [12301.899169] pci_device_remove+0x36/0xb0 [12301.899176] device_release_driver_internal+0x185/0x240 [12301.899183] unbind_store+0xaf/0x180 [12301.899189] kernfs_fop_write+0x104/0x190 [12301.899195] __vfs_write+0x31/0x180 [12301.899203] ? rcu_read_lock_sched_held+0x6f/0x80 [12301.899209] ? rcu_sync_lockdep_assert+0x29/0x50 [12301.899216] ? __sb_start_write+0x13c/0x1a0 [12301.899221] ? vfs_write+0x17f/0x1b0 [12301.899227] vfs_write+0xb9/0x1b0 [12301.899233] ksys_write+0x50/0xc0 [12301.899239] do_syscall_64+0x4b/0x180 [12301.899247] entry_SYSCALL_64_after_hwframe+0x49/0xbe [12301.899253] RIP: 0033:0x7f452ac7f7a4 [12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83 [12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4 [12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001 [12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730 [12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d [12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d Locking around I've noticed that usb and i2c already handle similar recursion problems, where a sysfs file can unbind the same type of sysfs somewhere else in the hierarchy. Relevant commits are: commit 356c05d Author: Alan Stern <[email protected]> Date: Mon May 14 13:30:03 2012 -0400 sysfs: get rid of some lockdep false positives commit e9b526f Author: Alexander Sverdlin <[email protected]> Date: Fri May 17 14:56:35 2013 +0200 i2c: suppress lockdep warning on delete_device Implement the same trick for driver bind/unbind. v2: Put the macro into bus.c (Greg). Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ramalingam C <[email protected]> Cc: Arend van Spriel <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Bartosz Golaszewski <[email protected]> Cc: Heikki Krogerus <[email protected]> Cc: Vivek Gautam <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Note that Greg already merged this, so this should drop out as soon as we've moved forward past 4.21-rc1. -Daniel

The idea of taking the reset lock around writing the fence register was to serialise the mmio write we also perform during the reset where those registers get clobbered. However, the lock is overkill as write tearing between reset and fence_update() is harmless; the final value of the fence register is the same. A race between revoke_fences() and fence_update() is also harmless at this point as on the fault path where this is necessary, we acquire the reset lock to coordinate ourselves in the upper layer. The danger of acquiring the reset lock again in fence_update() is that we may recurse from the shrinker along the i915_gem_fault() path. <4> [125.739646] ============================================ <4> [125.739652] WARNING: possible recursive locking detected <4> [125.739659] 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 Tainted: G U <4> [125.739666] -------------------------------------------- <4> [125.739672] gem_mmap_gtt/1017 is trying to acquire lock: <4> [125.739679] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x0/0x310 [i915] <4> [125.739848] but task is already holding lock: <4> [125.739854] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915] <4> [125.739918] other info that might help us debug this: <4> [125.739925] Possible unsafe locking scenario: <4> [125.739930] CPU0 <4> [125.739934] ---- <4> [125.739937] lock(&dev_priv->gpu_error.reset_backoff_srcu); <4> [125.739944] lock(&dev_priv->gpu_error.reset_backoff_srcu); <4> [125.739950] *** DEADLOCK *** <4> [125.739958] May be due to missing lock nesting notation <4> [125.739966] 5 locks held by gem_mmap_gtt/1017: <4> [125.739972] #0: 00000000471f682c (&mm->mmap_sem){++++}, at: __do_page_fault+0x133/0x500 <4> [125.739987] #1: 0000000026542685 (&dev->struct_mutex){+.+.}, at: i915_gem_fault+0x1f6/0x860 [i915] <4> [125.740061] #2: 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915] <4> [125.740126] #3: 00000000c828eb4f (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.25+0x0/0x30 <4> [125.740140] #4: 000000002d360d65 (shrinker_rwsem){++++}, at: shrink_slab+0x1cb/0x2c0 <4> [125.740151] stack backtrace: <4> [125.740159] CPU: 1 PID: 1017 Comm: gem_mmap_gtt Tainted: G U 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 <4> [125.740170] Hardware name: Dell Inc. OptiPlex 745 /0GW726, BIOS 2.3.1 05/21/2007 <4> [125.740180] Call Trace: <4> [125.740189] dump_stack+0x67/0x9b <4> [125.740199] __lock_acquire+0xc75/0x1b00 <4> [125.740209] ? arch_tlb_finish_mmu+0x2a/0xa0 <4> [125.740216] ? tlb_finish_mmu+0x1a/0x30 <4> [125.740222] ? zap_page_range_single+0xe2/0x130 <4> [125.740230] ? lock_acquire+0xa6/0x1c0 <4> [125.740237] lock_acquire+0xa6/0x1c0 <4> [125.740296] ? i915_clear_error_registers+0x280/0x280 [i915] <4> [125.740357] i915_reset_trylock+0x44/0x310 [i915] <4> [125.740417] ? i915_clear_error_registers+0x280/0x280 [i915] <4> [125.740426] ? lockdep_hardirqs_on+0xe0/0x1b0 <4> [125.740434] ? _raw_spin_unlock_irqrestore+0x39/0x60 <4> [125.740499] fence_update+0x218/0x470 [i915] <4> [125.740571] i915_vma_unbind+0xa6/0x550 [i915] <4> [125.740640] i915_gem_object_unbind+0xfa/0x190 [i915] <4> [125.740711] i915_gem_shrink+0x2dc/0x590 [i915] <4> [125.740722] ? ___preempt_schedule+0x16/0x18 <4> [125.740792] ? i915_gem_shrinker_scan+0xc9/0x130 [i915] <4> [125.740861] i915_gem_shrinker_scan+0xc9/0x130 [i915] <4> [125.740870] do_shrink_slab+0x143/0x3f0 <4> [125.740878] shrink_slab+0x228/0x2c0 <4> [125.740886] shrink_node+0x167/0x450 <4> [125.740894] do_try_to_free_pages+0xc4/0x340 <4> [125.740902] try_to_free_pages+0xdc/0x2e0 <4> [125.740911] __alloc_pages_nodemask+0x662/0x1110 <4> [125.740921] ? reacquire_held_locks+0xb5/0x1b0 <4> [125.740928] ? reacquire_held_locks+0xb5/0x1b0 <4> [125.740986] ? i915_reset_trylock+0x192/0x310 [i915] <4> [125.741045] ? i915_memcpy_init_early+0x30/0x30 [i915] <4> [125.741054] pte_alloc_one+0x12/0x70 <4> [125.741060] __pte_alloc+0x11/0xf0 <4> [125.741067] apply_to_page_range+0x37e/0x440 <4> [125.741127] remap_io_mapping+0x6c/0x100 [i915] <4> [125.741196] i915_gem_fault+0x5a9/0x860 [i915] <4> [125.741204] ? ptlock_alloc+0x15/0x30 <4> [125.741212] __do_fault+0x2c/0xb0 <4> [125.741218] __handle_mm_fault+0x8ee/0xfa0 <4> [125.741227] handle_mm_fault+0x196/0x3a0 <4> [125.741235] __do_page_fault+0x246/0x500 <4> [125.741243] ? page_fault+0x8/0x30 <4> [125.741250] page_fault+0x1e/0x30 <4> [125.741256] RIP: 0033:0x55d0cc456e12 <4> [125.741264] Code: b0 df ff ff 89 c2 8b 85 70 df ff ff 01 c2 8b 85 70 df ff ff 48 98 48 8d 0c 85 00 00 00 00 48 8b 85 e0 df ff ff 48 01 c8 f7 d2 <89> 10 83 85 70 df ff ff 01 81 bd 70 df ff ff ff 03 00 00 7e be 48 <4> [125.741280] RSP: 002b:00007ffc1bab7ab0 EFLAGS: 00010206 <4> [125.741287] RAX: 00007fc787cb6000 RBX: 0000000000000000 RCX: 0000000000000000 <4> [125.741295] RDX: 00000000ffffffff RSI: 0000000000005401 RDI: 0000000000000002 <4> [125.741303] RBP: 00007ffc1bab9b70 R08: 00007ffc1bab7920 R09: 000000000000001b <4> [125.741310] R10: 7165722074736554 R11: 0000000000000246 R12: 000055d0cc454a80 <4> [125.741318] R13: 00007ffc1bab9f60 R14: 0000000000000000 R15: 0000000000000000 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109665 Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Reference counting in amdgpu_dm_connector for amdgpu_dm_connector::dc_sink and amdgpu_dm_connector::dc_em_sink as well as in dc_link::local_sink seems to be out of shape. Thus make reference counting consistent for these members and just plain increment the reference count when the variable gets assigned and decrement when the pointer is set to zero or replaced. Also simplify reference counting in selected function sopes to be sure the reference is released in any case. In some cases add NULL pointer check before dereferencing. At a hand full of places a comment is placed to stat that the reference increment happened already somewhere else. This actually fixes the following kernel bug on my system when enabling display core in amdgpu. There are some more similar bug reports around, so it probably helps at more places. kernel BUG at mm/slub.c:294! invalid opcode: 0000 [#1] SMP PTI CPU: 9 PID: 1180 Comm: Xorg Not tainted 5.0.0-rc1+ #2 Hardware name: Supermicro X10DAi/X10DAI, BIOS 3.0a 02/05/2018 RIP: 0010:__slab_free+0x1e2/0x3d0 Code: 8b 54 24 30 48 89 4c 24 28 e8 da fb ff ff 4c 8b 54 24 28 85 c0 0f 85 67 fe ff ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 49 3b 5c 24 28 75 ab 48 8b 44 24 30 49 89 4c 24 28 49 89 44 RSP: 0018:ffffb0978589fa90 EFLAGS: 00010246 RAX: ffff92f12806c400 RBX: 0000000080200019 RCX: ffff92f12806c400 RDX: ffff92f12806c400 RSI: ffffdd6421a01a00 RDI: ffff92ed2f406e80 RBP: ffffb0978589fb40 R08: 0000000000000001 R09: ffffffffc0ee4748 R10: ffff92f12806c400 R11: 0000000000000001 R12: ffffdd6421a01a00 R13: ffff92f12806c400 R14: ffff92ed2f406e80 R15: ffffdd6421a01a20 FS: 00007f4170be0ac0(0000) GS:ffff92ed2fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000562818aaa000 CR3: 000000045745a002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? drm_dbg+0x87/0x90 [drm] dc_stream_release+0x28/0x50 [amdgpu] amdgpu_dm_connector_mode_valid+0xb4/0x1f0 [amdgpu] drm_helper_probe_single_connector_modes+0x492/0x6b0 [drm_kms_helper] drm_mode_getconnector+0x457/0x490 [drm] ? drm_connector_property_set_ioctl+0x60/0x60 [drm] drm_ioctl_kernel+0xa9/0xf0 [drm] drm_ioctl+0x201/0x3a0 [drm] ? drm_connector_property_set_ioctl+0x60/0x60 [drm] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] do_vfs_ioctl+0xa4/0x630 ? __sys_recvmsg+0x83/0xa0 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5b/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f417110809b Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffdd8d1c268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000562818a8ebc0 RCX: 00007f417110809b RDX: 00007ffdd8d1c2a0 RSI: 00000000c05064a7 RDI: 0000000000000012 RBP: 00007ffdd8d1c2a0 R08: 0000562819012280 R09: 0000000000000007 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c05064a7 R13: 0000000000000012 R14: 0000000000000012 R15: 00007ffdd8d1c2a0 Modules linked in: nfsv4 dns_resolver nfs lockd grace fscache fuse vfat fat amdgpu intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul chash gpu_sched crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel amd_iommu_v2 iTCO_wdt iTCO_vendor_support ttm snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel drm_kms_helper snd_hda_codec intel_cstate snd_hda_core drm snd_hwdep snd_seq snd_seq_device intel_uncore snd_pcm intel_rapl_perf snd_timer snd soundcore ioatdma pcspkr intel_wmi_thunderbolt mxm_wmi i2c_i801 lpc_ich pcc_cpufreq auth_rpcgss sunrpc igb crc32c_intel i2c_algo_bit dca wmi hid_cherry analog gameport joydev This patch is based on agd5f/drm-next-5.1-wip. This patch does not require all of that, but agd5f/drm-next-5.1-wip contains at least one more dc_sink counting fix that I could spot. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Leo Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>

In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to 32,so UBSAN complain about it. UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425 check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781 try_6rd net/ipv6/sit.c:806 [inline] ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline] sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033 __netdev_start_xmit include/linux/netdevice.h:4300 [inline] netdev_start_xmit include/linux/netdevice.h:4309 [inline] xmit_one net/core/dev.c:3243 [inline] dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259 __dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829 neigh_output include/net/neighbour.h:501 [inline] ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120 ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154 NF_HOOK_COND include/linux/netfilter.h:278 [inline] ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171 dst_output include/net/dst.h:444 [inline] ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176 ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697 ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717 rawv6_push_pending_frames net/ipv6/raw.c:616 [inline] rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946 inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xc8/0x110 net/socket.c:631 ___sys_sendmsg+0x6cf/0x890 net/socket.c:2114 __sys_sendmsg+0xf0/0x1b0 net/socket.c:2152 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: linmiaohe <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel says: ==================== mlxsw: Various fixes Patch #1 fixes the recently introduced QSFP thermal zones to correctly work with split ports, where several ports are mapped to the same module. Patch #2 initializes the base MAC in the minimal driver. The driver is using the base MAC as its parent ID and without initializing it, it is reported as all zeroes to user space. ==================== Signed-off-by: David S. Miller <[email protected]>

[ Upstream commit a843dc4 ] In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to 32,so UBSAN complain about it. UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 MIPS#2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425 check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781 try_6rd net/ipv6/sit.c:806 [inline] ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline] sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033 __netdev_start_xmit include/linux/netdevice.h:4300 [inline] netdev_start_xmit include/linux/netdevice.h:4309 [inline] xmit_one net/core/dev.c:3243 [inline] dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259 __dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829 neigh_output include/net/neighbour.h:501 [inline] ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120 ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154 NF_HOOK_COND include/linux/netfilter.h:278 [inline] ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171 dst_output include/net/dst.h:444 [inline] ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176 ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697 ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717 rawv6_push_pending_frames net/ipv6/raw.c:616 [inline] rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946 inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xc8/0x110 net/socket.c:631 ___sys_sendmsg+0x6cf/0x890 net/socket.c:2114 __sys_sendmsg+0xf0/0x1b0 net/socket.c:2152 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: linmiaohe <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit cee51c3 ] There may be a race condition if f_fs calls unregister_gadget_item in ffs_closed() when unregister_gadget is called by UDC store at the same time. this leads to a kernel NULL pointer dereference: [ 310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 310.645053] init: Service 'adbd' is being killed... [ 310.658938] pgd = c9528000 [ 310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000 [ 310.669702] Internal error: Oops: 817 [MIPS#1] PREEMPT SMP ARM [ 310.675211] Modules linked in: [ 310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c MIPS#2 [ 310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000 [ 310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0 [ 310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c [ 310.708648] pc : [<c075efc0>] lr : [<c0bfb0bc>] psr: 600f0113 <snip..> [ 311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34) [ 311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c) [ 311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0) [ 311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0) [ 311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14) [ 311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0) [ 311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8) [ 311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4) [ 311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20) for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0 when usb IO thread fails, but switch adb from on to off also triggers write "none" > UDC. These 2 operations both call unregister_gadget, which will lead to the panic above. add a mutex before calling unregister_gadget for api used in f_fs. Signed-off-by: Winter Wang <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 92d1d07 ] Kmemleak throws endless warnings during boot due to in __alloc_alien_cache(), alc = kmalloc_node(memsize, gfp, node); init_arraycache(&alc->ac, entries, batch); kmemleak_no_scan(ac); Kmemleak does not track the array cache (alc->ac) but the alien cache (alc) instead, so let it track the latter by lifting kmemleak_no_scan() out of init_arraycache(). There is another place that calls init_arraycache(), but alloc_kmem_cache_cpus() uses the percpu allocation where will never be considered as a leak. kmemleak: Found object by alias at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 lookup_object+0x84/0xac find_and_get_object+0x84/0xe4 kmemleak_no_scan+0x74/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 kmemleak: Object 0xffff8007b9aa7e00 (size 256): kmemleak: comm "swapper/0", pid 1, jiffies 4294697137 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmemleak_alloc+0x84/0xb8 kmem_cache_alloc_node_trace+0x31c/0x3a0 __kmalloc_node+0x58/0x78 setup_kmem_cache_node+0x26c/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 kmemleak_no_scan+0x90/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 Link: http://lkml.kernel.org/r/[email protected] Fixes: 1fe00d5 ("slab: factor out initialization of array cache") Signed-off-by: Qian Cai <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit d982b33 ] ================================================================= ==20875==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1160 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) MIPS#1 0x55bd50005599 in zalloc util/util.h:23 MIPS#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327 MIPS#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216 MIPS#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69 MIPS#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358 MIPS#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388 MIPS#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583 MIPS#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722 MIPS#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 MIPS#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 MIPS#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 MIPS#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520 MIPS#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 19 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) MIPS#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f) Signed-off-by: Changbin Du <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 9002b21 upstream. Commit 32a5ad9 ("sysctl: handle overflow for file-max") hooked up min/max values for the file-max sysctl parameter via the .extra1 and .extra2 fields in the corresponding struct ctl_table entry. Unfortunately, the minimum value points at the global 'zero' variable, which is an int. This results in a KASAN splat when accessed as a long by proc_doulongvec_minmax on 64-bit architectures: | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0 | Read of size 8 at addr ffff2000133d1c20 by task systemd/1 | | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 MIPS#2 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x228 | show_stack+0x14/0x20 | dump_stack+0xe8/0x124 | print_address_description+0x60/0x258 | kasan_report+0x140/0x1a0 | __asan_report_load8_noabort+0x18/0x20 | __do_proc_doulongvec_minmax+0x5d8/0x6a0 | proc_doulongvec_minmax+0x4c/0x78 | proc_sys_call_handler.isra.19+0x144/0x1d8 | proc_sys_write+0x34/0x58 | __vfs_write+0x54/0xe8 | vfs_write+0x124/0x3c0 | ksys_write+0xbc/0x168 | __arm64_sys_write+0x68/0x98 | el0_svc_common+0x100/0x258 | el0_svc_handler+0x48/0xc0 | el0_svc+0x8/0xc | | The buggy address belongs to the variable: | zero+0x0/0x40 | | Memory state around the buggy address: | ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa | ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00 | ^ | ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00 | ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fix the splat by introducing a unsigned long 'zero_ul' and using that instead. Link: http://lkml.kernel.org/r/[email protected] Fixes: 32a5ad9 ("sysctl: handle overflow for file-max") Signed-off-by: Will Deacon <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Kees Cook <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Matteo Croce <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 47b1682 ] If xace hardware reports a bad version number, the error handling code in ace_setup() calls put_disk(), followed by queue cleanup. However, since the disk data structure has the queue pointer set, put_disk() also cleans and releases the queue. This results in blk_cleanup_queue() accessing an already released data structure, which in turn may result in a crash such as the following. [ 10.681671] BUG: Kernel NULL pointer dereference at 0x00000040 [ 10.681826] Faulting instruction address: 0xc0431480 [ 10.682072] Oops: Kernel access of bad area, sig: 11 [MIPS#1] [ 10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440 [ 10.682387] Modules linked in: [ 10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G W 5.0.0-rc6-next-20190218+ MIPS#2 [ 10.682733] NIP: c0431480 LR: c043147c CTR: c0422ad8 [ 10.682863] REGS: cf82fbe0 TRAP: 0300 Tainted: G W (5.0.0-rc6-next-20190218+) [ 10.683065] MSR: 00029000 <CE,EE,ME> CR: 22000222 XER: 00000000 [ 10.683236] DEAR: 00000040 ESR: 00000000 [ 10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 [ 10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 [ 10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 [ 10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 [ 10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114 [ 10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114 [ 10.684602] Call Trace: [ 10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable) [ 10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c [ 10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68 [ 10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c [ 10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508 [ 10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8 [ 10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c [ 10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464 [ 10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4 [ 10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc [ 10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0 [ 10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234 [ 10.686457] [cf82fe50] [c052c46] driver_register+0x88/0x15c [ 10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac [ 10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330 [ 10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478 [ 10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114 [ 10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c [ 10.687349] Instruction dump: [ 10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008 [ 10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008 [ 10.688056] ---[ end trace 13c9ff51d41b9d40 ]--- Fix the problem by setting the disk queue pointer to NULL before calling put_disk(). A more comprehensive fix might be to rearrange the code to check the hardware version before initializing data structures, but I don't know if this would have undesirable side effects, and it would increase the complexity of backporting the fix to older kernels. Fixes: 74489a9 ("Add support for Xilinx SystemACE CompactFlash interface") Acked-by: Michal Simek <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a843dc4 ] In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to 32,so UBSAN complain about it. UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 MIPS#2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425 check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781 try_6rd net/ipv6/sit.c:806 [inline] ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline] sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033 __netdev_start_xmit include/linux/netdevice.h:4300 [inline] netdev_start_xmit include/linux/netdevice.h:4309 [inline] xmit_one net/core/dev.c:3243 [inline] dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259 __dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829 neigh_output include/net/neighbour.h:501 [inline] ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120 ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154 NF_HOOK_COND include/linux/netfilter.h:278 [inline] ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171 dst_output include/net/dst.h:444 [inline] ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176 ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697 ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717 rawv6_push_pending_frames net/ipv6/raw.c:616 [inline] rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946 inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xc8/0x110 net/socket.c:631 ___sys_sendmsg+0x6cf/0x890 net/socket.c:2114 __sys_sendmsg+0xf0/0x1b0 net/socket.c:2152 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: linmiaohe <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit cee51c3 ] There may be a race condition if f_fs calls unregister_gadget_item in ffs_closed() when unregister_gadget is called by UDC store at the same time. this leads to a kernel NULL pointer dereference: [ 310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 310.645053] init: Service 'adbd' is being killed... [ 310.658938] pgd = c9528000 [ 310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000 [ 310.669702] Internal error: Oops: 817 [MIPS#1] PREEMPT SMP ARM [ 310.675211] Modules linked in: [ 310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c MIPS#2 [ 310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000 [ 310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0 [ 310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c [ 310.708648] pc : [<c075efc0>] lr : [<c0bfb0bc>] psr: 600f0113 <snip..> [ 311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34) [ 311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c) [ 311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0) [ 311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0) [ 311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14) [ 311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0) [ 311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8) [ 311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4) [ 311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20) for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0 when usb IO thread fails, but switch adb from on to off also triggers write "none" > UDC. These 2 operations both call unregister_gadget, which will lead to the panic above. add a mutex before calling unregister_gadget for api used in f_fs. Signed-off-by: Winter Wang <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 92d1d07 ] Kmemleak throws endless warnings during boot due to in __alloc_alien_cache(), alc = kmalloc_node(memsize, gfp, node); init_arraycache(&alc->ac, entries, batch); kmemleak_no_scan(ac); Kmemleak does not track the array cache (alc->ac) but the alien cache (alc) instead, so let it track the latter by lifting kmemleak_no_scan() out of init_arraycache(). There is another place that calls init_arraycache(), but alloc_kmem_cache_cpus() uses the percpu allocation where will never be considered as a leak. kmemleak: Found object by alias at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 lookup_object+0x84/0xac find_and_get_object+0x84/0xe4 kmemleak_no_scan+0x74/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 kmemleak: Object 0xffff8007b9aa7e00 (size 256): kmemleak: comm "swapper/0", pid 1, jiffies 4294697137 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmemleak_alloc+0x84/0xb8 kmem_cache_alloc_node_trace+0x31c/0x3a0 __kmalloc_node+0x58/0x78 setup_kmem_cache_node+0x26c/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 kmemleak_no_scan+0x90/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 Link: http://lkml.kernel.org/r/[email protected] Fixes: 1fe00d5 ("slab: factor out initialization of array cache") Signed-off-by: Qian Cai <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit d982b33 ] ================================================================= ==20875==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1160 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) MIPS#1 0x55bd50005599 in zalloc util/util.h:23 MIPS#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327 MIPS#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216 MIPS#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69 MIPS#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358 MIPS#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388 MIPS#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583 MIPS#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722 MIPS#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 MIPS#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 MIPS#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 MIPS#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520 MIPS#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 19 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) MIPS#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f) Signed-off-by: Changbin Du <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 9002b21 upstream. Commit 32a5ad9 ("sysctl: handle overflow for file-max") hooked up min/max values for the file-max sysctl parameter via the .extra1 and .extra2 fields in the corresponding struct ctl_table entry. Unfortunately, the minimum value points at the global 'zero' variable, which is an int. This results in a KASAN splat when accessed as a long by proc_doulongvec_minmax on 64-bit architectures: | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0 | Read of size 8 at addr ffff2000133d1c20 by task systemd/1 | | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 MIPS#2 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x228 | show_stack+0x14/0x20 | dump_stack+0xe8/0x124 | print_address_description+0x60/0x258 | kasan_report+0x140/0x1a0 | __asan_report_load8_noabort+0x18/0x20 | __do_proc_doulongvec_minmax+0x5d8/0x6a0 | proc_doulongvec_minmax+0x4c/0x78 | proc_sys_call_handler.isra.19+0x144/0x1d8 | proc_sys_write+0x34/0x58 | __vfs_write+0x54/0xe8 | vfs_write+0x124/0x3c0 | ksys_write+0xbc/0x168 | __arm64_sys_write+0x68/0x98 | el0_svc_common+0x100/0x258 | el0_svc_handler+0x48/0xc0 | el0_svc+0x8/0xc | | The buggy address belongs to the variable: | zero+0x0/0x40 | | Memory state around the buggy address: | ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa | ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00 | ^ | ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00 | ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fix the splat by introducing a unsigned long 'zero_ul' and using that instead. Link: http://lkml.kernel.org/r/[email protected] Fixes: 32a5ad9 ("sysctl: handle overflow for file-max") Signed-off-by: Will Deacon <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Kees Cook <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Matteo Croce <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 47b1682 ] If xace hardware reports a bad version number, the error handling code in ace_setup() calls put_disk(), followed by queue cleanup. However, since the disk data structure has the queue pointer set, put_disk() also cleans and releases the queue. This results in blk_cleanup_queue() accessing an already released data structure, which in turn may result in a crash such as the following. [ 10.681671] BUG: Kernel NULL pointer dereference at 0x00000040 [ 10.681826] Faulting instruction address: 0xc0431480 [ 10.682072] Oops: Kernel access of bad area, sig: 11 [MIPS#1] [ 10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440 [ 10.682387] Modules linked in: [ 10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G W 5.0.0-rc6-next-20190218+ MIPS#2 [ 10.682733] NIP: c0431480 LR: c043147c CTR: c0422ad8 [ 10.682863] REGS: cf82fbe0 TRAP: 0300 Tainted: G W (5.0.0-rc6-next-20190218+) [ 10.683065] MSR: 00029000 <CE,EE,ME> CR: 22000222 XER: 00000000 [ 10.683236] DEAR: 00000040 ESR: 00000000 [ 10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 [ 10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 [ 10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 [ 10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 [ 10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114 [ 10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114 [ 10.684602] Call Trace: [ 10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable) [ 10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c [ 10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68 [ 10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c [ 10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508 [ 10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8 [ 10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c [ 10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464 [ 10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4 [ 10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc [ 10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0 [ 10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234 [ 10.686457] [cf82fe50] [c052c46] driver_register+0x88/0x15c [ 10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac [ 10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330 [ 10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478 [ 10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114 [ 10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c [ 10.687349] Instruction dump: [ 10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008 [ 10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008 [ 10.688056] ---[ end trace 13c9ff51d41b9d40 ]--- Fix the problem by setting the disk queue pointer to NULL before calling put_disk(). A more comprehensive fix might be to rearrange the code to check the hardware version before initializing data structures, but I don't know if this would have undesirable side effects, and it would increase the complexity of backporting the fix to older kernels. Fixes: 74489a9 ("Add support for Xilinx SystemACE CompactFlash interface") Acked-by: Michal Simek <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

paulburton added the bug label Aug 12, 2014

paulburton added v3.16 labels Aug 12, 2014

paulburton closed this as completed Aug 19, 2014

ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014

drm/nv50/disp: fix dpms regression on certain boards

5838ae6

Reported in fdo#82527 comment MIPS#2. Signed-off-by: Ben Skeggs <[email protected]>

ZubairLK mentioned this issue Feb 24, 2015

Reliable crashes of nodejs #21

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build failure on JZ4780 generic target #2

Build failure on JZ4780 generic target #2

ChrisPWilson commented Aug 12, 2014

paulburton commented Aug 12, 2014

ZubairLK commented Aug 12, 2014

ChrisPWilson commented Aug 12, 2014

paulburton commented Aug 12, 2014

paulburton commented Aug 19, 2014

Build failure on JZ4780 generic target #2

Build failure on JZ4780 generic target #2

Comments

ChrisPWilson commented Aug 12, 2014

paulburton commented Aug 12, 2014

ZubairLK commented Aug 12, 2014

ChrisPWilson commented Aug 12, 2014

paulburton commented Aug 12, 2014

paulburton commented Aug 19, 2014