Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build failure on JZ4780 generic target #2

Closed
ChrisPWilson opened this issue Aug 12, 2014 · 5 comments
Closed

Build failure on JZ4780 generic target #2

ChrisPWilson opened this issue Aug 12, 2014 · 5 comments

Comments

@ChrisPWilson
Copy link

Building the generic target results in the error:

DTC arch/mips/jz4780/dts/generic.dtb
ERROR (duplicate_label): Duplicate label 'extal' on /clocks/oscillator and /clocks/extal
ERROR: Input tree has errors, aborting (use -f to force output)
make[3]: *** [arch/mips/jz4780/dts/generic.dtb] Error 2
make[2]: *** [arch/mips/jz4780/dts] Error 2
make[1]: *** [arch/mips/jz4780] Error 2
make[1]: *** Waiting for unfinished jobs....

I'm using the IA32 compiler here:

https://sourcery.mentor.com/GNUToolchain/release2791?cmpid=7108

@paulburton
Copy link

As I mentioned by email, my inclination is to fix this by just removing the generic "board" which I'm not sure gives us very much. Any objections?

@paulburton paulburton added the bug label Aug 12, 2014
@ZubairLK
Copy link

pretty redundant atm.

generic board can always be added later on if several similar jz4780 based boards prop up..

@ChrisPWilson
Copy link
Author

I have no objections, other than the fact that a "generic" target might be useful as a reference.
Re. "similar jz4780 boards," we do have an npm801 tablet, support for which would be nice :)

@paulburton
Copy link

Even then a generic "board" wouldn't be so useful, since essentially all commonality between boards is just due to the SoC & thus already abstracted. Patch submitted.

There are some npm801 tablets in the Leeds office, I'll see if I can nab one at some point :)

@paulburton
Copy link

Fixed by commit c16c790 "MIPS: remove generic jz4780 "board"".

ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
With the introduction of fair queued rwlock, recursive read_lock()
may hang the offending process if there is a write_lock() somewhere
in between.

With recursive read_lock checking enabled, the following error was
reported:

=============================================
[ INFO: possible recursive locking detected ]
3.16.0-rc1 MIPS#2 Tainted: G            E
---------------------------------------------
load_policy/708 is trying to acquire lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b32a>]
security_genfs_sid+0x3a/0x170

but task is already holding lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b48c>]
security_fs_use+0x2c/0x110

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(policy_rwlock);
  lock(policy_rwlock);

This patch fixes the occurrence of recursive read_lock() of
policy_rwlock by adding a helper function __security_genfs_sid()
which requires caller to take the lock before calling it. The
security_fs_use() was then modified to call the new helper function.

Signed-off-by: Waiman Long <[email protected]>
Acked-by:  Stephen Smalley <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
We were experiencing occasional "BUG: scheduling while atomic" splats
in our testing. Enabling DEBUG_SPINLOCK and DEBUG_LOCKDEP in the kernel
exposed a lockdep in the enic driver.

enic 0000:0b:00.0 eth2: Link UP

======================================================
[ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
3.12.0-rc1.x86_64-dbg+ MIPS#2 Tainted: GF       W
------------------------------------------------------
NetworkManager/4209 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
(&(&enic->devcmd_lock)->rlock){+.+...}, at: [<ffffffffa026b7e4>] enic_dev_packet_filter+0x44/0x90 [enic]

The fix was to replace spin_lock with spin_lock_bh for the enic
devcmd_lock, so that soft irqs would be disabled while the lock
is held.

Signed-off-by: Sujith Sankar <[email protected]>
Signed-off-by: Tony Camuso <[email protected]>
Signed-off-by: Govindarajulu Varadarajan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
Linus Lüssing says:

====================
bridge: multicast snooping exports MIPS#2

Some people pointed out to me that it might be helpful to add stubs for
the newly added multicast exports. That way e.g. batman-adv should continue
to be compile and useable without having to have a kernel compiled
with bridge code in the future. This is what the first patch is supposed
to do.

The second patch adds a third multicast export for the bridge which
e.g. batman-adv is supposed to use, too, soon: Just like the bridge
disables its multicast snooping activities if no querier is present,
batman-adv needs to do the same if bridges are involved.

These three exports should be the final ones needed to marry the bridge
multicast snooping with the batman-adv multicast optimizations recently
added for the 3.15 kernel, allowing to use these optimzations in common
setups having a bridge on top of e.g. bat0, too. So far these bridged
setups would fall back to simple flooding through the batman-adv mesh
network for any multicast packet entering bat0.

More information about the batman-adv multicast optimizations currently
implemented can be found here:

http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations

The integration on the batman-adv side could afterwards look like this,
for instance (now including the third export):

http://git.open-mesh.org/batman-adv.git/commitdiff/61e4f6af4b7a21ed4040f2e711d50c778e5b6d93?hp=6ae4281474675fbca5bedcf768972a32db586eb6
====================
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
What the patch does:
1. Call pinmux_disable_setting ahead of pinmux_enable_setting
  each time pinctrl_select_state is called
2. Remove the HW disable operation in pinmux_disable_setting function.
3. Remove the disable ops in struct pinmux_ops
4. Remove all the disable ops users in current code base.

Notes:
1. Great thanks for the suggestion from Linus, Tony Lindgren and
   Stephen Warren and Everyone that shared comments on this patch.
2. The patch also includes comment fixes from Stephen Warren.

The reason why we do this:
1. To avoid duplicated calling of the enable_setting operation
   without disabling operation inbetween which will let the pin
   descriptor desc->mux_usecount increase monotonously.
2. The HW pin disable operation is not useful for any of the
   existing platforms.
   And this can be used to avoid the HW glitch after using the
   item #1 modification.

In the following case, the issue can be reproduced:
1. There is a driver that need to switch pin state dynamically,
   e.g. between "sleep" and "default" state
2. The pin setting configuration in a DTS node may be like this:

  component a {
	pinctrl-names = "default", "sleep";
	pinctrl-0 = <&a_grp_setting &c_grp_setting>;
	pinctrl-1 = <&b_grp_setting &c_grp_setting>;
  }

  The "c_grp_setting" config node is totally identical, maybe like
  following one:

  c_grp_setting: c_grp_setting {
	pinctrl-single,pins = <GPIO48 AF6>;
  }

3. When switching the pin state in the following official pinctrl
   sequence:
	pin = pinctrl_get();
	state = pinctrl_lookup_state(wanted_state);
	pinctrl_select_state(state);
	pinctrl_put();

Test Result:
1. The switch is completed as expected, that is: the device's
   pin configuration is changed according to the description in the
   "wanted_state" group setting
2. The "desc->mux_usecount" of the corresponding pins in "c_group"
   is increased without being decreased, because the "desc" is for
   each physical pin while the setting is for each setting node
   in the DTS.
   Thus, if the "c_grp_setting" in pinctrl-0 is not disabled ahead
   of enabling "c_grp_setting" in pinctrl-1, the desc->mux_usecount
   will keep increasing without any chance to be decreased.

According to the comments in the original code, only the setting,
in old state but not in new state, will be "disabled" (calling
pinmux_disable_setting), which is correct logic but not intact. We
still need consider case that the setting is in both old state
and new state. We can do this in the following two ways:

1. Avoid to "enable"(calling pinmux_enable_setting) the "same pin
   setting" repeatedly
2. "Disable"(calling pinmux_disable_setting) the "same pin setting",
   actually two setting instances, ahead of enabling them.

Analysis:
1. The solution MIPS#2 is better because it can avoid too much
   iteration.
2. If we disable all of the settings in the old state and one of
   the setting(s) exist in the new state, the pins mux function
   change may happen when some SoC vendors defined the
   "pinctrl-single,function-off"
   in their DTS file.
   old_setting => disabled_setting => new_setting.
3. In the pinmux framework, when a pin state is switched, the
   setting in the old state should be marked as "disabled".

Conclusion:
1. To Remove the HW disabling operation to above the glitch mentioned
   above.
2. Handle the issue mentioned above by disabling all of the settings
   in old state and then enable the all of the settings in new state.

Signed-off-by: Fan Wu <[email protected]>
Acked-by: Stephen Warren <[email protected]>
Acked-by: Patrice Chotard <[email protected]>
Acked-by: Heiko Stuebner <[email protected]>
Acked-by: Maxime Coquelin <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
We got a report of the following warning in Fedora:

BUG: sleeping function called from invalid context at mm/slub.c:969
in_atomic(): 1, irqs_disabled(): 0, pid: 533, name: bash
3 locks held by bash/533:
 #0:  (&sp->so_delegreturn_mutex){+.+...}, at: [<ffffffffa033da62>] nfs4_proc_lock+0x262/0x910 [nfsv4]
 #1:  (&nfsi->rwsem){.+.+.+}, at: [<ffffffffa033da6a>] nfs4_proc_lock+0x26a/0x910 [nfsv4]
 MIPS#2:  (&sb->s_type->i_lock_key#23){+.+...}, at: [<ffffffff812998dc>] flock_lock_file_wait+0x8c/0x3a0
CPU: 0 PID: 533 Comm: bash Not tainted 3.15.0-0.rc1.git1.1.fc21.x86_64 #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 0000000000000000 00000000d664ff3c ffff880078b69a70 ffffffff817e82e0
 0000000000000000 ffff880078b69a98 ffffffff810cf1a4 0000000000000050
 0000000000000050 ffff88007cc01a00 ffff880078b69ad8 ffffffff8121449e
Call Trace:
 [<ffffffff817e82e0>] dump_stack+0x4d/0x66
 [<ffffffff810cf1a4>] __might_sleep+0x184/0x240
 [<ffffffff8121449e>] kmem_cache_alloc_trace+0x4e/0x330
 [<ffffffffa0331124>] ? nfs4_release_lockowner+0x74/0x110 [nfsv4]
 [<ffffffffa0331124>] nfs4_release_lockowner+0x74/0x110 [nfsv4]
 [<ffffffffa0352340>] nfs4_put_lock_state+0x90/0xb0 [nfsv4]
 [<ffffffffa0352375>] nfs4_fl_release_lock+0x15/0x20 [nfsv4]
 [<ffffffff81297515>] locks_free_lock+0x45/0x90
 [<ffffffff8129996c>] flock_lock_file_wait+0x11c/0x3a0
 [<ffffffffa033da6a>] ? nfs4_proc_lock+0x26a/0x910 [nfsv4]
 [<ffffffffa033301e>] do_vfs_lock+0x1e/0x30 [nfsv4]
 [<ffffffffa033da79>] nfs4_proc_lock+0x279/0x910 [nfsv4]
 [<ffffffff810dbb26>] ? local_clock+0x16/0x30
 [<ffffffff810f5a3f>] ? lock_release_holdtime.part.28+0xf/0x200
 [<ffffffffa02f820c>] do_unlk+0x8c/0xc0 [nfs]
 [<ffffffffa02f85c5>] nfs_flock+0xa5/0xf0 [nfs]
 [<ffffffff8129a6f6>] locks_remove_file+0xb6/0x1e0
 [<ffffffff812159d8>] ? kfree+0xd8/0x2d0
 [<ffffffff8123bc63>] __fput+0xd3/0x210
 [<ffffffff8123bdee>] ____fput+0xe/0x10
 [<ffffffff810bfb6d>] task_work_run+0xcd/0xf0
 [<ffffffff81019cd1>] do_notify_resume+0x61/0x90
 [<ffffffff817fbea2>] int_signal+0x12/0x17

The problem is that NFSv4 is trying to do an allocation from
fl_release_private (in order to send a RELEASE_LOCKOWNER call). That
function can be called while holding the inode->i_lock, and it's
currently set up to do __GFP_WAIT allocations. v4.1 code has a
similar problem.

This patch adds a work_struct to the nfs4_lock_state and has the code
queue the free_lock_state operation to nfsiod.

Reported-by: Josh Stone <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
While a queue is being destroyed, all the blkgs are destroyed and its
->root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ MIPS#2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
   [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
   [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
   [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
   [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
   [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
   [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff81427505>] blk_put_queue+0x15/0x20
   [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
   [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
   [<ffffffff817930e2>] device_release+0x32/0xa0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff817934d7>] put_device+0x17/0x20
   [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
   [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
   [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
   [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
   [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
   [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
   [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
   [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
   [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b

776687b ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Shirish Pargaonkar <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Reported-by: Jet Chen <[email protected]>
Cc: [email protected]
Tested-by: Shirish Pargaonkar <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
We allocate the cpufreq table after calling rcu_read_lock(),
which disables preemption. This causes scheduling while atomic
warnings. Use GFP_ATOMIC instead of GFP_KERNEL and update for
kcalloc while we're here.

BUG: sleeping function called from invalid context at mm/slub.c:1246
in_atomic(): 0, irqs_disabled(): 0, pid: 80, name: modprobe
5 locks held by modprobe/80:
 #0:  (&dev->mutex){......}, at: [<c050d484>] __driver_attach+0x48/0x98
 #1:  (&dev->mutex){......}, at: [<c050d494>] __driver_attach+0x58/0x98
 MIPS#2:  (subsys mutex#5){+.+.+.}, at: [<c050c114>] subsys_interface_register+0x38/0xc8
 MIPS#3:  (cpufreq_rwsem){.+.+.+}, at: [<c05a9c8c>] __cpufreq_add_dev.isra.22+0x84/0x92c
 MIPS#4:  (rcu_read_lock){......}, at: [<c05ab24c>] dev_pm_opp_init_cpufreq_table+0x18/0x10c
Preemption disabled at:[<  (null)>]   (null)

CPU: 2 PID: 80 Comm: modprobe Not tainted 3.16.0-rc3-next-20140701-00035-g286857f216aa-dirty torvalds#217
[<c0214da8>] (unwind_backtrace) from [<c02123f8>] (show_stack+0x10/0x14)
[<c02123f8>] (show_stack) from [<c070141c>] (dump_stack+0x70/0xbc)
[<c070141c>] (dump_stack) from [<c02f4cb0>] (__kmalloc+0x124/0x250)
[<c02f4cb0>] (__kmalloc) from [<c05ab270>] (dev_pm_opp_init_cpufreq_table+0x3c/0x10c)
[<c05ab270>] (dev_pm_opp_init_cpufreq_table) from [<bf000508>] (cpufreq_init+0x48/0x378 [cpufreq_generic])
[<bf000508>] (cpufreq_init [cpufreq_generic]) from [<c05a9e08>] (__cpufreq_add_dev.isra.22+0x200/0x92c)
[<c05a9e08>] (__cpufreq_add_dev.isra.22) from [<c050c160>] (subsys_interface_register+0x84/0xc8)
[<c050c160>] (subsys_interface_register) from [<c05a9494>] (cpufreq_register_driver+0x108/0x2d8)
[<c05a9494>] (cpufreq_register_driver) from [<bf000888>] (generic_cpufreq_probe+0x50/0x74 [cpufreq_generic])
[<bf000888>] (generic_cpufreq_probe [cpufreq_generic]) from [<c050e994>] (platform_drv_probe+0x18/0x48)
[<c050e994>] (platform_drv_probe) from [<c050d1f4>] (driver_probe_device+0x128/0x370)
[<c050d1f4>] (driver_probe_device) from [<c050d4d0>] (__driver_attach+0x94/0x98)
[<c050d4d0>] (__driver_attach) from [<c050b778>] (bus_for_each_dev+0x54/0x88)
[<c050b778>] (bus_for_each_dev) from [<c050c894>] (bus_add_driver+0xe8/0x204)
[<c050c894>] (bus_add_driver) from [<c050dd48>] (driver_register+0x78/0xf4)
[<c050dd48>] (driver_register) from [<c0208870>] (do_one_initcall+0xac/0x1d8)
[<c0208870>] (do_one_initcall) from [<c028b6b4>] (load_module+0x190c/0x21e8)
[<c028b6b4>] (load_module) from [<c028c034>] (SyS_init_module+0xa4/0x110)
[<c028c034>] (SyS_init_module) from [<c020f0c0>] (ret_fast_syscall+0x0/0x48)

Fixes: a0dd7b7 (PM / OPP: Move cpufreq specific OPP functions out of generic OPP library)
Signed-off-by: Stephen Boyd <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: 3.16+ <[email protected]> # 3.16+
Signed-off-by: Rafael J. Wysocki <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
This patch integrates creation of sysfs groups and
attributes into NILFS file system driver.

It was found the issue with nilfs_sysfs_{create/delete}_snapshot_group
functions by Michael L Semon <[email protected]> in the first
version of the patch:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:579
  in_atomic(): 1, irqs_disabled(): 0, pid: 32676, name: umount.nilfs2
  2 locks held by umount.nilfs2/32676:
   #0:  (&type->s_umount_key#21){++++..}, at: [<790c18e2>] deactivate_super+0x37/0x58
   #1:  (&(&nilfs->ns_cptree_lock)->rlock){+.+...}, at: [<791bf659>] nilfs_put_root+0x23/0x5a
  Preemption disabled at:[<791bf659>] nilfs_put_root+0x23/0x5a

  CPU: 0 PID: 32676 Comm: umount.nilfs2 Not tainted 3.14.0+ MIPS#2
  Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
  Call Trace:
    dump_stack+0x4b/0x75
    __might_sleep+0x111/0x16f
    mutex_lock_nested+0x1e/0x3ad
    kernfs_remove+0x12/0x26
    sysfs_remove_dir+0x3d/0x62
    kobject_del+0x13/0x38
    nilfs_sysfs_delete_snapshot_group+0xb/0xd
    nilfs_put_root+0x2a/0x5a
    nilfs_detach_log_writer+0x1ab/0x2c1
    nilfs_put_super+0x13/0x68
    generic_shutdown_super+0x60/0xd1
    kill_block_super+0x1d/0x60
    deactivate_locked_super+0x22/0x3f
    deactivate_super+0x3e/0x58
    mntput_no_expire+0xe2/0x141
    SyS_oldumount+0x70/0xa5
    syscall_call+0x7/0xb

The reason of the issue was placement of
nilfs_sysfs_{create/delete}_snapshot_group() call under
nilfs->ns_cptree_lock protection.  But this protection is unnecessary and
wrong solution.  The second version of the patch fixes this issue.

[[email protected]: nilfs_sysfs_create_mounted_snapshots_group can be static]
Reported-by: Michael L. Semon <[email protected]>
Signed-off-by: Vyacheslav Dubeyko <[email protected]>
Cc: Vyacheslav Dubeyko <[email protected]>
Cc: Ryusuke Konishi <[email protected]>
Tested-by: Michael L. Semon <[email protected]>
Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
… queue"

This reverts commit a923207.

It introduced a dead lock, and did not fix anything.

it made netif_tx_lock() be called in IRQ context, but in softirq context,
the same lock is locked without disabling IRQ. In fact, the commit a923207
did not fix anything, since netif_stop_queue did not free the any resource

[   10.154920] =================================
[   10.156026] [ INFO: inconsistent lock state ]
[   10.156026] 3.16.0-rc5+ MIPS#13 Not tainted
[   10.156026] ---------------------------------
[   10.156026] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   10.156026] swapper/1/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
[   10.156026]  (_xmit_ETHER){?.-...}, at: [<80948b6a>] sch_direct_xmit+0x7a/0x250
[   10.156026] {IN-HARDIRQ-W} state was registered at:
[   10.156026]   [<804811f0>] __lock_acquire+0x800/0x17a0
[   10.156026]   [<804828ba>] lock_acquire+0x6a/0xf0
[   10.156026]   [<809ed477>] _raw_spin_lock+0x27/0x40
[   10.156026]   [<8088d508>] gether_disconnect+0x68/0x280
[   10.156026]   [<8088e777>] eem_set_alt+0x37/0xc0
[   10.156026]   [<808847ce>] composite_setup+0x30e/0x1240
[   10.156026]   [<8088b8ae>] pch_udc_isr+0xa6e/0xf50
[   10.156026]   [<8048abe8>] handle_irq_event_percpu+0x38/0x1e0
[   10.156026]   [<8048adc1>] handle_irq_event+0x31/0x50
[   10.156026]   [<8048d94b>] handle_fasteoi_irq+0x6b/0x140
[   10.156026]   [<804040a5>] handle_irq+0x65/0x80
[   10.156026]   [<80403cfc>] do_IRQ+0x3c/0xc0
[   10.156026]   [<809ee6ae>] common_interrupt+0x2e/0x34
[   10.156026]   [<804668c5>] finish_task_switch+0x65/0xd0
[   10.156026]   [<809e89df>] __schedule+0x20f/0x7d0
[   10.156026]   [<809e94aa>] schedule_preempt_disabled+0x2a/0x70
[   10.156026]   [<8047bf03>] cpu_startup_entry+0x143/0x410
[   10.156026]   [<809e2e61>] rest_init+0xa1/0xb0
[   10.156026]   [<80ce2a3b>] start_kernel+0x336/0x33b
[   10.156026]   [<80ce22ab>] i386_start_kernel+0x79/0x7d
[   10.156026] irq event stamp: 52070
[   10.156026] hardirqs last  enabled at (52070): [<809375de>] neigh_resolve_output+0xee/0x2a0
[   10.156026] hardirqs last disabled at (52069): [<809375a8>] neigh_resolve_output+0xb8/0x2a0
[   10.156026] softirqs last  enabled at (52020): [<8044401f>] _local_bh_enable+0x1f/0x50
[   10.156026] softirqs last disabled at (52021): [<80404036>] do_softirq_own_stack+0x26/0x30
[   10.156026]
[   10.156026] other info that might help us debug this:
[   10.156026]  Possible unsafe locking scenario:
[   10.156026]
[   10.156026]        CPU0
[   10.156026]        ----
[   10.156026]   lock(_xmit_ETHER);
[   10.156026]   <Interrupt>
[   10.156026]     lock(_xmit_ETHER);
[   10.156026]
[   10.156026]  *** DEADLOCK ***
[   10.156026]
[   10.156026] 4 locks held by swapper/1/0:
[   10.156026]  #0:  (((&idev->mc_ifc_timer))){+.-...}, at: [<8044b100>] call_timer_fn+0x0/0x190
[   10.156026]  #1:  (rcu_read_lock){......}, at: [<a0577c40>] mld_sendpack+0x0/0x590 [ipv6]
[   10.156026]  MIPS#2:  (rcu_read_lock_bh){......}, at: [<a055680c>] ip6_finish_output2+0x4c/0x7f0 [ipv6]
[   10.156026]  MIPS#3:  (rcu_read_lock_bh){......}, at: [<8092e510>] __dev_queue_xmit+0x0/0x5f0
[   10.156026]
[   10.156026] stack backtrace:
[   10.156026] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-rc5+ MIPS#13
[   10.156026]  811dbb10 00000000 9e919d10 809e6785 9e8b8000 9e919d3c 809e561e 80b95511
[   10.156026]  80b9545a 80b9543d 80b95450 80b95441 80b957e4 9e8b84e0 00000002 8047f7b0
[   10.156026]  9e919d5c 8048043b 00000002 00000000 9e8b8000 00000001 00000004 9e8b8000
[   10.156026] Call Trace:
[   10.156026]  [<809e6785>] dump_stack+0x48/0x69
[   10.156026]  [<809e561e>] print_usage_bug+0x18f/0x19c
[   10.156026]  [<8047f7b0>] ? print_shortest_lock_dependencies+0x170/0x170
[   10.156026]  [<8048043b>] mark_lock+0x53b/0x5f0
[   10.156026]  [<804810cf>] __lock_acquire+0x6df/0x17a0
[   10.156026]  [<804828ba>] lock_acquire+0x6a/0xf0
[   10.156026]  [<80948b6a>] ? sch_direct_xmit+0x7a/0x250
[   10.156026]  [<809ed477>] _raw_spin_lock+0x27/0x40
[   10.156026]  [<80948b6a>] ? sch_direct_xmit+0x7a/0x250
[   10.156026]  [<80948b6a>] sch_direct_xmit+0x7a/0x250
[   10.156026]  [<8092e6bf>] __dev_queue_xmit+0x1af/0x5f0
[   10.156026]  [<80947fc0>] ? ether_setup+0x80/0x80
[   10.156026]  [<8092eb0f>] dev_queue_xmit+0xf/0x20
[   10.156026]  [<8093764c>] neigh_resolve_output+0x15c/0x2a0
[   10.156026]  [<a0556927>] ip6_finish_output2+0x167/0x7f0 [ipv6]
[   10.156026]  [<a0559b05>] ip6_finish_output+0x85/0x1c0 [ipv6]
[   10.156026]  [<a0559cb7>] ip6_output+0x77/0x240 [ipv6]
[   10.156026]  [<a0578163>] mld_sendpack+0x523/0x590 [ipv6]
[   10.156026]  [<80480501>] ? mark_held_locks+0x11/0x90
[   10.156026]  [<a057947d>] mld_ifc_timer_expire+0x15d/0x280 [ipv6]
[   10.156026]  [<8044b168>] call_timer_fn+0x68/0x190
[   10.156026]  [<a0579320>] ? igmp6_group_added+0x150/0x150 [ipv6]
[   10.156026]  [<8044b3fa>] run_timer_softirq+0x16a/0x240
[   10.156026]  [<a0579320>] ? igmp6_group_added+0x150/0x150 [ipv6]
[   10.156026]  [<80444984>] __do_softirq+0xd4/0x2f0
[   10.156026]  [<804448b0>] ? tasklet_action+0x100/0x100
[   10.156026]  [<80404036>] do_softirq_own_stack+0x26/0x30
[   10.156026]  <IRQ>  [<80444d05>] irq_exit+0x65/0x70
[   10.156026]  [<8042d758>] smp_apic_timer_interrupt+0x38/0x50
[   10.156026]  [<809ee91f>] apic_timer_interrupt+0x2f/0x34
[   10.156026]  [<8048007b>] ? mark_lock+0x17b/0x5f0
[   10.156026]  [<8040a912>] ? default_idle+0x22/0xf0
[   10.156026]  [<8040b13e>] arch_cpu_idle+0xe/0x10
[   10.156026]  [<8047bfc6>] cpu_startup_entry+0x206/0x410
[   10.156026]  [<8042bfbd>] start_secondary+0x19d/0x1e0

Acked-by: Tony Lindgren <[email protected]>
Reported-by: Thomas Gleixner <[email protected]>
Cc: Jeff Westfahl <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: <[email protected]>
Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
This reverts commit 8b37e1b.

It's broken as it changes led_blink_set() in a way that it can now sleep
(while synchronously waiting for workqueue to be cancelled). That's a
problem, because it's possible that this function gets called from atomic
context (tpt_trig_timer() takes a readlock and thus disables preemption).

This has been brought up 3 weeks ago already [1] but no proper fix has
materialized, and I keep seeing the problem since 3.17-rc1.

[1] https://lkml.org/lkml/2014/8/16/128

 BUG: sleeping function called from invalid context at kernel/workqueue.c:2650
 in_atomic(): 1, irqs_disabled(): 0, pid: 2335, name: wpa_supplicant
 5 locks held by wpa_supplicant/2335:
  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff814c7c92>] rtnl_lock+0x12/0x20
  #1:  (&wdev->mtx){+.+.+.}, at: [<ffffffffc06e649c>] cfg80211_mgd_wext_siwessid+0x5c/0x180 [cfg80211]
  MIPS#2:  (&local->mtx){+.+.+.}, at: [<ffffffffc0817dea>] ieee80211_prep_connection+0x17a/0x9a0 [mac80211]
  MIPS#3:  (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffc08081ed>] ieee80211_vif_use_channel+0x5d/0x2a0 [mac80211]
  MIPS#4:  (&trig->leddev_list_lock){.+.+..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211]
 CPU: 0 PID: 2335 Comm: wpa_supplicant Not tainted 3.17.0-rc3 #1
 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
  ffff8800360b5a50 ffff8800751f76d8 ffffffff8159e97f ffff8800360b5a30
  ffff8800751f76e8 ffffffff810739a5 ffff8800751f77b0 ffffffff8106862f
  ffffffff810685d0 0aa2209200000000 ffff880000000004 ffff8800361c59d0
 Call Trace:
  [<ffffffff8159e97f>] dump_stack+0x4d/0x66
  [<ffffffff810739a5>] __might_sleep+0xe5/0x120
  [<ffffffff8106862f>] flush_work+0x5f/0x270
  [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
  [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90
  [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100
  [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
  [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
  [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
  [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
  [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211]
  [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211]
  [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211]
  [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211]
  [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211]
  [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211]
  [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211]
  [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211]
  [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211]
  [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211]
  [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211]
  [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211]
  [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211]
  [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211]
  [<ffffffffc06e36c0>] ? cfg80211_wext_giwessid+0x50/0x50 [cfg80211]
  [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211]
  [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0
  [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0
  [<ffffffff81584fa0>] ? ioctl_standard_iw_point+0x3e0/0x3e0
  [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100
  [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0
  [<ffffffff814cfb29>] dev_ioctl+0x329/0x620
  [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0
  [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520
  [<ffffffff815a67fb>] ? sysret_check+0x1b/0x56
  [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0
  [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f
 wlan0: send auth to 00:0b:6b:3c:8c:e4 (try 1/3)
 wlan0: authenticated
 wlan0: associate with 00:0b:6b:3c:8c:e4 (try 1/3)
 wlan0: RX AssocResp from 00:0b:6b:3c:8c:e4 (capab=0x431 status=0 aid=2)
 wlan0: associated
 IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
 cfg80211: Calling CRDA for country: NA
 wlan0: Limiting TX power to 27 (27 - 0) dBm as advertised by 00:0b:6b:3c:8c:e4

 =================================
 [ INFO: inconsistent lock state ]
 3.17.0-rc3 #1 Not tainted
 ---------------------------------
 inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
 swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
  ((&(&led_cdev->blink_work)->work)){+.?...}, at: [<ffffffff810685d0>] flush_work+0x0/0x270
 {SOFTIRQ-ON-W} state was registered at:
   [<ffffffff81094dbe>] __lock_acquire+0x30e/0x1a30
   [<ffffffff81096c81>] lock_acquire+0x91/0x110
   [<ffffffff81068608>] flush_work+0x38/0x270
   [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
   [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
   [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
   [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
   [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211]
   [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211]
   [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211]
   [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211]
   [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211]
   [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211]
   [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211]
   [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211]
   [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211]
   [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211]
   [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211]
   [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211]
   [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211]
   [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211]
   [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211]
   [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0
   [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0
   [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100
   [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0
   [<ffffffff814cfb29>] dev_ioctl+0x329/0x620
   [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0
   [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520
   [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0
   [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f
 irq event stamp: 493416
 hardirqs last  enabled at (493416): [<ffffffff81068a5f>] __cancel_work_timer+0x6f/0x100
 hardirqs last disabled at (493415): [<ffffffff81067e9f>] try_to_grab_pending+0x1f/0x160
 softirqs last  enabled at (493408): [<ffffffff81053ced>] _local_bh_enable+0x1d/0x50
 softirqs last disabled at (493409): [<ffffffff81054c75>] irq_exit+0xa5/0xb0

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock((&(&led_cdev->blink_work)->work));
   <Interrupt>
     lock((&(&led_cdev->blink_work)->work));

  *** DEADLOCK ***

 2 locks held by swapper/0/0:
  #0:  (((&tpt_trig->timer))){+.-...}, at: [<ffffffff810b4c50>] call_timer_fn+0x0/0x180
  #1:  (&trig->leddev_list_lock){.+.?..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211]

 stack backtrace:
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3 #1
 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
  ffffffff8246eb30 ffff88007c203b00 ffffffff8159e97f ffffffff81a194c0
  ffff88007c203b50 ffffffff81599c29 0000000000000001 ffffffff00000001
  ffff880000000000 0000000000000006 ffffffff81a194c0 ffffffff81093ad0
 Call Trace:
  <IRQ>  [<ffffffff8159e97f>] dump_stack+0x4d/0x66
  [<ffffffff81599c29>] print_usage_bug+0x1f4/0x205
  [<ffffffff81093ad0>] ? check_usage_backwards+0x140/0x140
  [<ffffffff810944d3>] mark_lock+0x223/0x2b0
  [<ffffffff81094d60>] __lock_acquire+0x2b0/0x1a30
  [<ffffffff81096c81>] lock_acquire+0x91/0x110
  [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
  [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
  [<ffffffff81068608>] flush_work+0x38/0x270
  [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
  [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90
  [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100
  [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
  [<ffffffff8109469d>] ? trace_hardirqs_on_caller+0xad/0x1c0
  [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
  [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
  [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
  [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
  [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
  [<ffffffff810b4cc5>] call_timer_fn+0x75/0x180
  [<ffffffff810b4c50>] ? process_timeout+0x10/0x10
  [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
  [<ffffffff810b50ac>] run_timer_softirq+0x1fc/0x2f0
  [<ffffffff81054805>] __do_softirq+0x115/0x2e0
  [<ffffffff81054c75>] irq_exit+0xa5/0xb0
  [<ffffffff810049b3>] do_IRQ+0x53/0xf0
  [<ffffffff815a74af>] common_interrupt+0x6f/0x6f
  <EOI>  [<ffffffff8147b56e>] ? cpuidle_enter_state+0x6e/0x180
  [<ffffffff8147b732>] cpuidle_enter+0x12/0x20
  [<ffffffff8108bba0>] cpu_startup_entry+0x330/0x360
  [<ffffffff8158fb51>] rest_init+0xc1/0xd0
  [<ffffffff8158fa90>] ? csum_partial_copy_generic+0x170/0x170
  [<ffffffff81af3ff2>] start_kernel+0x44f/0x45a
  [<ffffffff81af399c>] ? set_init_arg+0x53/0x53
  [<ffffffff81af35ad>] x86_64_start_reservations+0x2a/0x2c
  [<ffffffff81af36a0>] x86_64_start_kernel+0xf1/0xf4

Cc: Vincent Donnefort <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Bryan Wu <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
While debugging a cpufreq-related hardware failure on a system I saw the
following lockdep warning:

 =========================
 [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G            E
 -------------------------
 insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there!
  (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80
 3 locks held by insmod/2247:
  #0:  (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120
  #1:  (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80
  MIPS#2:  (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80

 stack backtrace:
 CPU: 0 PID: 2247 Comm: insmod Tainted: G            E  3.17.0-rc4+ #1
 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013
  0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358
  ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400
  ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246
 Call Trace:
  [<ffffffff8171b358>] dump_stack+0x4d/0x66
  [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180
  [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80
  [<ffffffff8121412b>] kfree+0xab/0x2b0
  [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80
  [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10
  [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120
  [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340
  [<ffffffff8121415a>] ? kfree+0xda/0x2b0
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq]
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
  [<ffffffff811f7472>] ? __vunmap+0xd2/0x120
  [<ffffffff81127155>] load_module+0x1315/0x1b70
  [<ffffffff811222a0>] ? store_uevent+0x70/0x70
  [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180
  [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0
  [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b
 cpufreq: __cpufreq_add_dev: ->get() failed
insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device

The warning occurs in the __cpufreq_add_dev() code which does

        down_write(&policy->rwsem);
	...
        if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
                policy->cur = cpufreq_driver->get(policy->cpu);
                if (!policy->cur) {
                        pr_err("%s: ->get() failed\n", __func__);
                        goto err_get_freq;
                }

If cpufreq_driver->get(policy->cpu) returns an error we execute the
code at err_get_freq, which does not up the policy->rwsem.  This causes
the lockdep warning.

Trivial patch to up the policy->rwsem in the error path.

After the patch has been applied, and an error occurs in the
cpufreq_driver->get(policy->cpu) call we will now see

cpufreq: __cpufreq_add_dev: ->get() failed
cpufreq: __cpufreq_add_dev: ->get() failed
modprobe: ERROR: could not insert 'pcc_cpufreq': No such device

Fixes: 4e97b63 (cpufreq: Initialize governor for a new policy under policy->rwsem)
Signed-off-by: Prarit Bhargava <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: 3.14+ <[email protected]> # 3.14+
Signed-off-by: Rafael J. Wysocki <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Oct 20, 2014
Reported in fdo#82527 comment MIPS#2.

Signed-off-by: Ben Skeggs <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Apr 9, 2015
…na/nfs-rdma

NFS: RDMA Client Sparse Fix MIPS#2

This patch fixes another sparse fix found by Dan Carpenter's tool.

Signed-off-by: Anna Schumaker <[email protected]>

* tag 'nfs-rdma-for-4.0-3' of git://git.linux-nfs.org/projects/anna/nfs-rdma:
  xprtrdma: Store RDMA credits in unsigned variables
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Jun 22, 2015
A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

CPU #1 [softirq]                         CPU MIPS#2 [softirq]
start_xmit()                             tx_acknowledge()
................                         ................

atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts and tx queue protection.

Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
ZubairLK pushed a commit to ZubairLK/CI20_linux that referenced this issue Jun 22, 2015
The regfile provided to SA_SIGINFO signal handler as ucontext was off by
one due to pt_regs gutter cleanups in 2013.

Before handling signal, user pt_regs are copied onto user_regs_struct and copied
back later. Both structs are binary compatible. This was all fine until
commit 2fa9190 (ARC: pt_regs update MIPS#2) which removed the empty stack slot
at top of pt_regs (corresponding to first pad) and made the corresponding
fixup in struct user_regs_struct (the pad in there was moved out of
@scratch - not removed altogether as it is part of ptrace ABI)

 struct user_regs_struct {
+       long pad;
        struct {
-               long pad;
                long bta, lp_start, lp_end,....
        } scratch;
 ...
 }

This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
which is what this commit does.

This problem was hidden for 2 years, because both save/restore, despite
using wrong location, were using the same location. Only an interim
inspection (reproducer below) exposed the issue.

     void handle_segv(int signo, siginfo_t *info, void *context)
     {
 	ucontext_t *uc = context;
	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);

	printf("regs %x %x\n",               <=== prints 7 8 (vs. 8 9)
               regs->scratch.r8, regs->scratch.r9);
     }

     int main()
     {
	struct sigaction sa;

	sa.sa_sigaction = handle_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, NULL);

	asm volatile(
	"mov	r7, 7	\n"
	"mov	r8, 8	\n"
	"mov	r9, 9	\n"
	"mov	r10, 10	\n"
	:::"r7","r8","r9","r10");

	*((unsigned int*)0x10) = 0;
     }

Fixes: 2fa9190 "ARC: pt_regs update MIPS#2: Remove unused gutter at start of pt_regs"
CC: <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Jan 17, 2017
…s enabled

Nils Holland and Klaus Ethgen have reported unexpected OOM killer
invocations with 32b kernel starting with 4.8 kernels

	kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
	kworker/u4:5 cpuset=/ mems_allowed=0
	CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2
	[...]
	Mem-Info:
	active_anon:58685 inactive_anon:90 isolated_anon:0
	 active_file:274324 inactive_file:281962 isolated_file:0
	 unevictable:0 dirty:649 writeback:0 unstable:0
	 slab_reclaimable:40662 slab_unreclaimable:17754
	 mapped:7382 shmem:202 pagetables:351 bounce:0
	 free:206736 free_pcp:332 free_cma:0
	Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
	DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
	lowmem_reserve[]: 0 813 3474 3474
	Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB
	lowmem_reserve[]: 0 0 21292 21292
	HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB

the oom killer is clearly pre-mature because there there is still a lot
of page cache in the zone Normal which should satisfy this lowmem
request.  Further debugging has shown that the reclaim cannot make any
forward progress because the page cache is hidden in the active list
which doesn't get rotated because inactive_list_is_low is not memcg
aware.

The code simply subtracts per-zone highmem counters from the respective
memcg's lru sizes which doesn't make any sense.  We can simply end up
always seeing the resulting active and inactive counts 0 and return
false.  This issue is not limited to 32b kernels but in practice the
effect on systems without CONFIG_HIGHMEM would be much harder to notice
because we do not invoke the OOM killer for allocations requests
targeting < ZONE_NORMAL.

Fix the issue by tracking per zone lru page counts in mem_cgroup_per_node
and subtract per-memcg highmem counts when memcg is enabled.  Introduce
helper lruvec_zone_lru_size which redirects to either zone counters or
mem_cgroup_get_zone_lru_size when appropriate.

We are losing empty LRU but non-zero lru size detection introduced by
ca70723 ("mm: update_lru_size warn and reset bad lru_size") because
of the inherent zone vs. node discrepancy.

Fixes: f8d1a31 ("mm: consider whether to decivate based on eligible zones inactive ratio")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Nils Holland <[email protected]>
Tested-by: Nils Holland <[email protected]>
Reported-by: Klaus Ethgen <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Reviewed-by: Vladimir Davydov <[email protected]>
Cc: <[email protected]>	[4.8+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
chrisdearman pushed a commit that referenced this issue Mar 22, 2017
There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
this leads to a kernel NULL pointer dereference:

[  310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  310.645053] init: Service 'adbd' is being killed...
[  310.658938] pgd = c9528000
[  310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[  310.669702] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[  310.675211] Modules linked in:
[  310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c #2
[  310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[  310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[  310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[  310.708648] pc : [<c075efc0>]    lr : [<c0bfb0bc>]    psr: 600f0113
<snip..>
[  311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[  311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[  311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[  311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[  311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[  311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[  311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[  311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[  311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)

for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0
when usb IO thread fails, but switch adb from on to off also triggers write
"none" > UDC. These 2 operations both call unregister_gadget, which will lead
to the panic above.

add a mutex before calling unregister_gadget for api used in f_fs.

Signed-off-by: Winter Wang <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
chrisdearman pushed a commit that referenced this issue Mar 22, 2017
Reverted disabling mipsr32r2 cause it leads to SIGILL in Android SKIA.
logcat error:
01-01 00:01:22.716  1083  1083 F libc    : Fatal signal 4 (SIGILL), code 128, fault addr 0x0 in tid 1083 (ndroid.systemui)
01-01 00:01:22.832    99    99 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-01 00:01:22.832    99    99 F DEBUG   : Build fingerprint: 'Android/aosp_ci20/ci20:6.0.1/MOB30D/alistair05241026:userdebug/test-keys'
01-01 00:01:22.832    99    99 F DEBUG   : Revision: '0'
01-01 00:01:22.832    99    99 F DEBUG   : ABI: 'mips'
01-01 00:01:22.833    99    99 F DEBUG   : pid: 1083, tid: 1083, name: ndroid.systemui  >>> com.android.systemui <<<
01-01 00:01:22.833    99    99 F DEBUG   : signal 4 (SIGILL), code 128 (SI_KERNEL), fault addr 0x0
01-01 00:01:22.864    99    99 F DEBUG   :  zr 00000000  at 00000001  v0 00000000  v1 00000001
01-01 00:01:22.864    99    99 E DEBUG   : AM write failed: Broken pipe
01-01 00:01:22.864    99    99 F DEBUG   :  a0 70cb8160  a1 70cb8160  a2 721f890c  a3 00000001
01-01 00:01:22.864    99    99 F DEBUG   :  t0 00000000  t1 7fa2fb30  t2 00000001  t3 00000000
01-01 00:01:22.865    99    99 F DEBUG   :  t4 00000001  t5 00000000  t6 00000001  t7 7fa2fce0
01-01 00:01:22.865    99    99 F DEBUG   :  s0 721f890c  s1 00000001  s2 00000002  s3 721f8924
01-01 00:01:22.865    99    99 F DEBUG   :  s4 00000000  s5 00000000  s6 766f3000  s7 00000000
01-01 00:01:22.865    99    99 F DEBUG   :  t8 766f3000  t9 766f6810  k0 73a76500  k1 00000000
01-01 00:01:22.865    99    99 F DEBUG   :  gp 76a20040  sp 7fa2fac0  s8 7fa2fce0  ra 766f89b8
01-01 00:01:22.865    99    99 F DEBUG   :  hi 00000000  lo 55555556 bva 721f8924 epc 766f8674
01-01 00:01:22.893    99    99 F DEBUG   :
01-01 00:01:22.893    99    99 F DEBUG   : backtrace:
01-01 00:01:22.893    99    99 F DEBUG   :     #00 pc 00215674  /system/lib/libskia.so (SkOpSpan::sortableTop(SkOpContour*)+1052)
01-01 00:01:22.893    99    99 F DEBUG   :     #1 pc 00215d68  /system/lib/libskia.so (SkOpSegment::findSortableTop(SkOpContour*)+112)
01-01 00:01:22.893    99    99 F DEBUG   :     #2 pc 00215e10  /system/lib/libskia.so (SkOpContour::findSortableTop(SkOpContour*)+72)
01-01 00:01:22.893    99    99 F DEBUG   :     #3 pc 00215e90  /system/lib/libskia.so (FindSortableTop(SkOpContourHead*)+88)
01-01 00:01:22.893    99    99 F DEBUG   :     #4 pc 001e1194  /system/lib/libskia.so (OpDebug(SkPath const&, SkPath const&, SkPathOp, SkPath*, bool)+868)
01-01 00:01:22.893    99    99 F DEBUG   :     #5 pc 001e1940  /system/lib/libskia.so (Op(SkPath const&, SkPath const&, SkPathOp, SkPath*)+44)
01-01 00:01:22.893    99    99 F DEBUG   :     #6 pc 33b8884c  /data/dalvik-cache/mips/system@[email protected] (offset 0x215b000)

Ingenic introduced new cache driver in video acceleration support patch and disabled
selection if MIPS_CPU_SCACHE which was used earlier. Cache driver selection has been
provided in order to easily select between new and old cache driver in case that problems
are encountered in the future.

Enabled VPU support in ci20_android_defconfig.

Removed dummy functions and switch to the proper ones in drivers/staging/imgtec/ci20/

Change-Id: I399ea09bcd5454339260b9dfd942a87a19a3cfa2
Signed-off-by: Dragan Cecavac <[email protected]>
pcercuei referenced this issue in OpenDingux/linux May 6, 2017
This is a story about 4 distinct (and very old) btrfs bugs.

Commit c8b9781 ("Btrfs: Add zlib compression support") added
three data corruption bugs for inline extents (bugs #1-3).

Commit 93c82d5 ("Btrfs: zero page past end of inline file items")
fixed bug #1:  uncompressed inline extents followed by a hole and more
extents could get non-zero data in the hole as they were read.  The fix
was to add a memset in btrfs_get_extent to zero out the hole.

Commit 166ae5a ("btrfs: fix inline compressed read err corruption")
fixed bug #2:  compressed inline extents which contained non-zero bytes
might be replaced with zero bytes in some cases.  This patch removed an
unhelpful memset from uncompress_inline, but the case where memset is
required was missed.

There is also a memset in the decompression code, but this only covers
decompressed data that is shorter than the ram_bytes from the extent
ref record.  This memset doesn't cover the region between the end of the
decompressed data and the end of the page.  It has also moved around a
few times over the years, so there's no single patch to refer to.

This patch fixes bug #3:  compressed inline extents followed by a hole
and more extents could get non-zero data in the hole as they were read
(i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/).
The fix is the same:  zero out the hole in the compressed case too,
by putting a memset back in uncompress_inline, but this time with
correct parameters.

The last and oldest bug, bug #0, is the cause of the offending inline
extent/hole/extent pattern.  Bug #0 is a subtle and mostly-harmless quirk
of behavior somewhere in the btrfs write code.  In a few special cases,
an inline extent and hole are allowed to persist where they normally
would be combined with later extents in the file.

A fast reproducer for bug #0 is presented below.  A few offending extents
are also created in the wild during large rsync transfers with the -S
flag.  A Linux kernel build (git checkout; make allyesconfig; make -j8)
will produce a handful of offending files as well.  Once an offending
file is created, it can present different content to userspace each
time it is read.

Bug #0 is at least 4 and possibly 8 years old.  I verified every vX.Y
kernel back to v3.5 has this behavior.  There are fossil records of this
bug's effects in commits all the way back to v2.6.32.  I have no reason
to believe bug #0 wasn't present at the beginning of btrfs compression
support in v2.6.29, but I can't easily test kernels that old to be sure.

It is not clear whether bug #0 is worth fixing.  A fix would likely
require injecting extra reads into currently write-only paths, and most
of the exceptional cases caused by bug #0 are already handled now.

Whether we like them or not, bug #0's inline extents followed by holes
are part of the btrfs de-facto disk format now, and we need to be able
to read them without data corruption or an infoleak.  So enough about
bug #0, let's get back to bug #3 (this patch).

An example of on-disk structure leading to data corruption found in
the wild:

        item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160
                inode generation 50 transid 50 size 47424 nbytes 49141
                block group 0 mode 100644 links 1 uid 0 gid 0
                rdev 0 flags 0x0(none)
        item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20
                inode ref index 3 namelen 10 name: DB_File.so
        item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362
                inline extent data size 1341 ram 4085 compress(zlib)
        item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53
                extent data disk byte 5367308288 nr 20480
                extent data offset 0 nr 45056 ram 45056
                extent compression(zlib)

Different data appears in userspace during each read of the 11 bytes
between 4085 and 4096.  The extent in item 63 is not long enough to
fill the first page of the file, so a memset is required to fill the
space between item 63 (ending at 4085) and item 64 (beginning at 4096)
with zero.

Here is a reproducer from Liu Bo, which demonstrates another method
of creating the same inline extent and hole pattern:

Using 'page_poison=on' kernel command line (or enable
CONFIG_PAGE_POISONING) run the following:

	# touch foo
	# chattr +c foo
	# xfs_io -f -c "pwrite -W 0 1000" foo
	# xfs_io -f -c "falloc 4 8188" foo
	# od -x foo
	# echo 3 >/proc/sys/vm/drop_caches
	# od -x foo

This produce the following on my box:

Correct output:  file contains 1000 data bytes followed
by zeros:

	0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
	*
	0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000
	0001760 0000 0000 0000 0000 0000 0000 0000 0000
	*
	0020000

Actual output:  the data after the first 1000 bytes
will be different each run:

	0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
	*
	0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d
	0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400
	0002000 435f 0056 5f74 6164 7400 645f 0062 5f74
	(...)

Signed-off-by: Zygo Blaxell <[email protected]>
Reviewed-by: Liu Bo <[email protected]>
Reviewed-by: Chris Mason <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
pcercuei referenced this issue in OpenDingux/linux May 6, 2017
The rcu_barrier() takes the cpu_hotplug mutex which itself is not
reclaim-safe, and so rcu_barrier() is illegal from inside the shrinker.

[  309.661373] =========================================================
[  309.661376] [ INFO: possible irq lock inversion dependency detected ]
[  309.661380] 4.11.0-rc1-CI-CI_DRM_2333+ #1 Tainted: G        W
[  309.661383] ---------------------------------------------------------
[  309.661386] gem_exec_gttfil/6435 just changed the state of lock:
[  309.661389]  (rcu_preempt_state.barrier_mutex){+.+.-.}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160
[  309.661399] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[  309.661402]  (cpu_hotplug.lock){+.+.+.}
[  309.661404]

               and interrupts could create inverse lock ordering between them.

[  309.661410]
               other info that might help us debug this:
[  309.661414]  Possible interrupt unsafe locking scenario:

[  309.661417]        CPU0                    CPU1
[  309.661419]        ----                    ----
[  309.661421]   lock(cpu_hotplug.lock);
[  309.661425]                                local_irq_disable();
[  309.661432]                                lock(rcu_preempt_state.barrier_mutex);
[  309.661441]                                lock(cpu_hotplug.lock);
[  309.661446]   <Interrupt>
[  309.661448]     lock(rcu_preempt_state.barrier_mutex);
[  309.661453]
                *** DEADLOCK ***

[  309.661460] 4 locks held by gem_exec_gttfil/6435:
[  309.661464]  #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff8120d83d>] vfs_write+0x17d/0x1f0
[  309.661475]  #1:  (debugfs_srcu){......}, at: [<ffffffff81320491>] debugfs_use_file_start+0x41/0xa0
[  309.661486]  #2:  (&attr->mutex){+.+.+.}, at: [<ffffffff8123a3e7>] simple_attr_write+0x37/0xe0
[  309.661495]  #3:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0091b4a>] i915_drop_caches_set+0x3a/0x150 [i915]
[  309.661540]
               the shortest dependencies between 2nd lock and 1st lock:
[  309.661547]  -> (cpu_hotplug.lock){+.+.+.} ops: 829 {
[  309.661553]     HARDIRQ-ON-W at:
[  309.661560]                       __lock_acquire+0x5e5/0x1b50
[  309.661565]                       lock_acquire+0xc9/0x220
[  309.661572]                       __mutex_lock+0x6e/0x990
[  309.661576]                       mutex_lock_nested+0x16/0x20
[  309.661583]                       get_online_cpus+0x61/0x80
[  309.661590]                       kmem_cache_create+0x25/0x1d0
[  309.661596]                       debug_objects_mem_init+0x30/0x249
[  309.661602]                       start_kernel+0x341/0x3fe
[  309.661607]                       x86_64_start_reservations+0x2a/0x2c
[  309.661612]                       x86_64_start_kernel+0x173/0x186
[  309.661619]                       verify_cpu+0x0/0xfc
[  309.661622]     SOFTIRQ-ON-W at:
[  309.661627]                       __lock_acquire+0x611/0x1b50
[  309.661632]                       lock_acquire+0xc9/0x220
[  309.661636]                       __mutex_lock+0x6e/0x990
[  309.661641]                       mutex_lock_nested+0x16/0x20
[  309.661646]                       get_online_cpus+0x61/0x80
[  309.661650]                       kmem_cache_create+0x25/0x1d0
[  309.661655]                       debug_objects_mem_init+0x30/0x249
[  309.661660]                       start_kernel+0x341/0x3fe
[  309.661664]                       x86_64_start_reservations+0x2a/0x2c
[  309.661669]                       x86_64_start_kernel+0x173/0x186
[  309.661674]                       verify_cpu+0x0/0xfc
[  309.661677]     RECLAIM_FS-ON-W at:
[  309.661682]                          mark_held_locks+0x6f/0xa0
[  309.661687]                          lockdep_trace_alloc+0xb3/0x100
[  309.661693]                          kmem_cache_alloc_trace+0x31/0x2e0
[  309.661699]                          __smpboot_create_thread.part.1+0x27/0xe0
[  309.661704]                          smpboot_create_threads+0x61/0x90
[  309.661709]                          cpuhp_invoke_callback+0x9c/0x8a0
[  309.661713]                          cpuhp_up_callbacks+0x31/0xb0
[  309.661718]                          _cpu_up+0x7a/0xc0
[  309.661723]                          do_cpu_up+0x5f/0x80
[  309.661727]                          cpu_up+0xe/0x10
[  309.661734]                          smp_init+0x71/0xb3
[  309.661738]                          kernel_init_freeable+0x94/0x19e
[  309.661743]                          kernel_init+0x9/0xf0
[  309.661748]                          ret_from_fork+0x2e/0x40
[  309.661752]     INITIAL USE at:
[  309.661757]                      __lock_acquire+0x234/0x1b50
[  309.661761]                      lock_acquire+0xc9/0x220
[  309.661766]                      __mutex_lock+0x6e/0x990
[  309.661771]                      mutex_lock_nested+0x16/0x20
[  309.661775]                      get_online_cpus+0x61/0x80
[  309.661780]                      __cpuhp_setup_state+0x44/0x170
[  309.661785]                      page_alloc_init+0x23/0x3a
[  309.661790]                      start_kernel+0x124/0x3fe
[  309.661794]                      x86_64_start_reservations+0x2a/0x2c
[  309.661799]                      x86_64_start_kernel+0x173/0x186
[  309.661804]                      verify_cpu+0x0/0xfc
[  309.661807]   }
[  309.661813]   ... key      at: [<ffffffff81e37690>] cpu_hotplug+0xb0/0x100
[  309.661817]   ... acquired at:
[  309.661821]    lock_acquire+0xc9/0x220
[  309.661825]    __mutex_lock+0x6e/0x990
[  309.661829]    mutex_lock_nested+0x16/0x20
[  309.661833]    get_online_cpus+0x61/0x80
[  309.661837]    _rcu_barrier+0x9f/0x160
[  309.661841]    rcu_barrier+0x10/0x20
[  309.661847]    netdev_run_todo+0x5f/0x310
[  309.661852]    rtnl_unlock+0x9/0x10
[  309.661856]    default_device_exit_batch+0x133/0x150
[  309.661862]    ops_exit_list.isra.0+0x4d/0x60
[  309.661866]    cleanup_net+0x1d8/0x2c0
[  309.661872]    process_one_work+0x1f4/0x6d0
[  309.661876]    worker_thread+0x49/0x4a0
[  309.661881]    kthread+0x107/0x140
[  309.661884]    ret_from_fork+0x2e/0x40

[  309.661890] -> (rcu_preempt_state.barrier_mutex){+.+.-.} ops: 179 {
[  309.661896]    HARDIRQ-ON-W at:
[  309.661901]                     __lock_acquire+0x5e5/0x1b50
[  309.661905]                     lock_acquire+0xc9/0x220
[  309.661910]                     __mutex_lock+0x6e/0x990
[  309.661914]                     mutex_lock_nested+0x16/0x20
[  309.661919]                     _rcu_barrier+0x31/0x160
[  309.661923]                     rcu_barrier+0x10/0x20
[  309.661928]                     netdev_run_todo+0x5f/0x310
[  309.661932]                     rtnl_unlock+0x9/0x10
[  309.661936]                     default_device_exit_batch+0x133/0x150
[  309.661941]                     ops_exit_list.isra.0+0x4d/0x60
[  309.661946]                     cleanup_net+0x1d8/0x2c0
[  309.661951]                     process_one_work+0x1f4/0x6d0
[  309.661955]                     worker_thread+0x49/0x4a0
[  309.661960]                     kthread+0x107/0x140
[  309.661964]                     ret_from_fork+0x2e/0x40
[  309.661968]    SOFTIRQ-ON-W at:
[  309.661972]                     __lock_acquire+0x611/0x1b50
[  309.661977]                     lock_acquire+0xc9/0x220
[  309.661981]                     __mutex_lock+0x6e/0x990
[  309.661986]                     mutex_lock_nested+0x16/0x20
[  309.661990]                     _rcu_barrier+0x31/0x160
[  309.661995]                     rcu_barrier+0x10/0x20
[  309.661999]                     netdev_run_todo+0x5f/0x310
[  309.662003]                     rtnl_unlock+0x9/0x10
[  309.662008]                     default_device_exit_batch+0x133/0x150
[  309.662013]                     ops_exit_list.isra.0+0x4d/0x60
[  309.662017]                     cleanup_net+0x1d8/0x2c0
[  309.662022]                     process_one_work+0x1f4/0x6d0
[  309.662027]                     worker_thread+0x49/0x4a0
[  309.662031]                     kthread+0x107/0x140
[  309.662035]                     ret_from_fork+0x2e/0x40
[  309.662039]    IN-RECLAIM_FS-W at:
[  309.662043]                        __lock_acquire+0x638/0x1b50
[  309.662048]                        lock_acquire+0xc9/0x220
[  309.662053]                        __mutex_lock+0x6e/0x990
[  309.662058]                        mutex_lock_nested+0x16/0x20
[  309.662062]                        _rcu_barrier+0x31/0x160
[  309.662067]                        rcu_barrier+0x10/0x20
[  309.662089]                        i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662109]                        i915_drop_caches_set+0x141/0x150 [i915]
[  309.662114]                        simple_attr_write+0xc7/0xe0
[  309.662119]                        full_proxy_write+0x4f/0x70
[  309.662124]                        __vfs_write+0x23/0x120
[  309.662128]                        vfs_write+0xc6/0x1f0
[  309.662133]                        SyS_write+0x44/0xb0
[  309.662138]                        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  309.662142]    INITIAL USE at:
[  309.662147]                    __lock_acquire+0x234/0x1b50
[  309.662151]                    lock_acquire+0xc9/0x220
[  309.662156]                    __mutex_lock+0x6e/0x990
[  309.662160]                    mutex_lock_nested+0x16/0x20
[  309.662165]                    _rcu_barrier+0x31/0x160
[  309.662169]                    rcu_barrier+0x10/0x20
[  309.662174]                    netdev_run_todo+0x5f/0x310
[  309.662178]                    rtnl_unlock+0x9/0x10
[  309.662183]                    default_device_exit_batch+0x133/0x150
[  309.662188]                    ops_exit_list.isra.0+0x4d/0x60
[  309.662192]                    cleanup_net+0x1d8/0x2c0
[  309.662197]                    process_one_work+0x1f4/0x6d0
[  309.662202]                    worker_thread+0x49/0x4a0
[  309.662206]                    kthread+0x107/0x140
[  309.662210]                    ret_from_fork+0x2e/0x40
[  309.662214]  }
[  309.662220]  ... key      at: [<ffffffff81e4e1c8>] rcu_preempt_state+0x508/0x780
[  309.662225]  ... acquired at:
[  309.662229]    check_usage_forwards+0x12b/0x130
[  309.662233]    mark_lock+0x360/0x6f0
[  309.662237]    __lock_acquire+0x638/0x1b50
[  309.662241]    lock_acquire+0xc9/0x220
[  309.662245]    __mutex_lock+0x6e/0x990
[  309.662249]    mutex_lock_nested+0x16/0x20
[  309.662253]    _rcu_barrier+0x31/0x160
[  309.662257]    rcu_barrier+0x10/0x20
[  309.662279]    i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662298]    i915_drop_caches_set+0x141/0x150 [i915]
[  309.662303]    simple_attr_write+0xc7/0xe0
[  309.662307]    full_proxy_write+0x4f/0x70
[  309.662311]    __vfs_write+0x23/0x120
[  309.662315]    vfs_write+0xc6/0x1f0
[  309.662319]    SyS_write+0x44/0xb0
[  309.662323]    entry_SYSCALL_64_fastpath+0x1c/0xb1

[  309.662329]
               stack backtrace:
[  309.662335] CPU: 1 PID: 6435 Comm: gem_exec_gttfil Tainted: G        W       4.11.0-rc1-CI-CI_DRM_2333+ #1
[  309.662342] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011
[  309.662348] Call Trace:
[  309.662354]  dump_stack+0x67/0x92
[  309.662359]  print_irq_inversion_bug.part.19+0x1a4/0x1b0
[  309.662365]  check_usage_forwards+0x12b/0x130
[  309.662369]  mark_lock+0x360/0x6f0
[  309.662374]  ? print_shortest_lock_dependencies+0x1a0/0x1a0
[  309.662379]  __lock_acquire+0x638/0x1b50
[  309.662383]  ? __mutex_unlock_slowpath+0x3e/0x2e0
[  309.662388]  ? trace_hardirqs_on+0xd/0x10
[  309.662392]  ? _rcu_barrier+0x31/0x160
[  309.662396]  lock_acquire+0xc9/0x220
[  309.662400]  ? _rcu_barrier+0x31/0x160
[  309.662404]  ? _rcu_barrier+0x31/0x160
[  309.662409]  __mutex_lock+0x6e/0x990
[  309.662412]  ? _rcu_barrier+0x31/0x160
[  309.662416]  ? _rcu_barrier+0x31/0x160
[  309.662421]  ? synchronize_rcu_expedited+0x35/0xb0
[  309.662426]  ? _raw_spin_unlock_irqrestore+0x52/0x60
[  309.662434]  mutex_lock_nested+0x16/0x20
[  309.662438]  _rcu_barrier+0x31/0x160
[  309.662442]  rcu_barrier+0x10/0x20
[  309.662464]  i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662484]  i915_drop_caches_set+0x141/0x150 [i915]
[  309.662489]  simple_attr_write+0xc7/0xe0
[  309.662494]  full_proxy_write+0x4f/0x70
[  309.662498]  __vfs_write+0x23/0x120
[  309.662503]  ? rcu_read_lock_sched_held+0x75/0x80
[  309.662507]  ? rcu_sync_lockdep_assert+0x2a/0x50
[  309.662512]  ? __sb_start_write+0x102/0x210
[  309.662516]  ? vfs_write+0x17d/0x1f0
[  309.662520]  vfs_write+0xc6/0x1f0
[  309.662524]  ? trace_hardirqs_on_caller+0xe7/0x200
[  309.662529]  SyS_write+0x44/0xb0
[  309.662533]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  309.662537] RIP: 0033:0x7f507eac24a0
[  309.662541] RSP: 002b:00007fffda8720e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  309.662548] RAX: ffffffffffffffda RBX: ffffffff81482bd3 RCX: 00007f507eac24a0
[  309.662552] RDX: 0000000000000005 RSI: 00007fffda8720f0 RDI: 0000000000000005
[  309.662557] RBP: ffffc9000048bf88 R08: 0000000000000000 R09: 000000000000002c
[  309.662561] R10: 0000000000000014 R11: 0000000000000246 R12: 00007fffda872230
[  309.662566] R13: 00007fffda872228 R14: 0000000000000201 R15: 00007fffda8720f0
[  309.662572]  ? __this_cpu_preempt_check+0x13/0x20

Fixes: 0eafec6 ("drm/i915: Enable lockless lookup of request tracking via RCU")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100192
Signed-off-by: Chris Wilson <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: <[email protected]> # v4.9+
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Daniel Vetter <[email protected]>
(cherry picked from commit bd784b7)
Signed-off-by: Jani Nikula <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
commit 57b0e31 upstream.

We need to check the return value of match_token() for Opt_err before
doing anything with it.

[ Not only did the old "-1" value for Opt_err cause problems for the
  __test_and_set_bit(), as fixed in commit 94c13f6 ("security:
  don't use a negative Opt_err token index"), but accessing
  "args[0].from" is invalid for the Opt_err case, as pointed out by Eric
  later.  - Linus ]

Reported-by: [email protected]
Fixes: 00d60fd ("KEYS: Provide keyctls to drive the new key type ops for asymmetric keys [ver #2]")
Signed-off-by: Eric Biggers <[email protected]>
Cc: [email protected] # 4.20
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
…er shutdown

commit 60a161b upstream.

Suppose adapter (open) recovery is between opened QDIO queues and before
(the end of) initial posting of status read buffers (SRBs). This time
window can be seconds long due to FSF_PROT_HOST_CONNECTION_INITIALIZING
causing by design looping with exponential increase sleeps in the function
performing exchange config data during recovery
[zfcp_erp_adapter_strat_fsf_xconf()]. Recovery triggered by local link up.

Suppose an event occurs for which the FCP channel would send an unsolicited
notification to zfcp by means of a previously posted SRB.  We saw it with
local cable pull (link down) in multi-initiator zoning with multiple
NPIV-enabled subchannels of the same shared FCP channel.

As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial
status read buffers from within the adapter's ERP thread, the channel does
send an unsolicited notification.

Since v2.6.27 commit d26ab06 ("[SCSI] zfcp: receiving an unsolicted
status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
adapter->stat_work to re-fill the just consumed SRB from a work item.

Now the ERP thread and the work item post SRBs in parallel.  Both contexts
call the helper function zfcp_status_read_refill().  The tracking of
missing (to be posted / re-filled) SRBs is not thread-safe due to separate
atomic_read() and atomic_dec(), in order to depend on posting
success. Hence, both contexts can see
atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts
one too many SRB. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
(trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
zfcp_qdio_handler_error().

An obvious and seemingly clean fix would be to schedule stat_work from the
ERP thread and wait for it to finish. This would serialize all SRB
re-fills. However, we already have another work item wait on the ERP
thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the
open port recoveries during zfcp auto port scan, but in fact it waits for
any pending recovery including an adapter recovery. This approach leads to
a deadlock.  [see also v3.19 commit 18f87a6 ("zfcp: auto port scan
resiliency"); v2.6.37 commit d3e1088
("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
v2.6.28 commit fca55b6
("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
fixing v2.6.27 commit c57a39a
("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
v2.6.27 commit cc8c282
("[SCSI] zfcp: Automatically attach remote ports")]

Instead make the accounting of missing SRBs atomic for parallel execution
in both the ERP thread and adapter->stat_work.

Signed-off-by: Steffen Maier <[email protected]>
Fixes: d26ab06 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
Cc: <[email protected]> #2.6.27+
Reviewed-by: Jens Remus <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
commit 8b66fee upstream.

syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x173/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295
 strlen+0x3b/0xa0 lib/string.c:486
 nla_put_string include/net/netlink.h:1154 [inline]
 tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760
 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
 tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
 tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
 genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4
R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2759 [inline]
 __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
 __kmalloc_reserve net/core/skbuff.c:137 [inline]
 __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
 alloc_skb include/linux/skbuff.h:998 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
 netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in tipc_nl_compat_link_reset_stats:
    nla_put_string(skb, TIPC_NLA_LINK_NAME, name)

This is because name string is not validated before it's used.

Reported-by: [email protected]
Signed-off-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
commit edf5ff0 upstream.

syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
  strlen+0x3b/0xa0 lib/string.c:486
  nla_put_string include/net/netlink.h:1154 [inline]
  __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline]
  tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
  tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
  tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
  netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
  netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in
    nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)

This is because lc->name string is not validated before it's used.

Reported-by: [email protected]
Signed-off-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit d8797b1 ]

__install_bp_hardening_cb() is called via stop_machine() as part
of the cpu_enable callback. To force each CPU to take its turn
when allocating slots, they take a spinlock.

With the RT patches applied, the spinlock becomes a mutex,
and we get warnings about sleeping while in stop_machine():
| [    0.319176] CPU features: detected: RAS Extension Support
| [    0.319950] BUG: scheduling while atomic: migration/3/36/0x00000002
| [    0.319955] Modules linked in:
| [    0.319958] Preemption disabled at:
| [    0.319969] [<ffff000008181ae4>] cpu_stopper_thread+0x7c/0x108
| [    0.319973] CPU: 3 PID: 36 Comm: migration/3 Not tainted 4.19.1-rt3-00250-g330fc2c2a880 #2
| [    0.319975] Hardware name: linux,dummy-virt (DT)
| [    0.319976] Call trace:
| [    0.319981]  dump_backtrace+0x0/0x148
| [    0.319983]  show_stack+0x14/0x20
| [    0.319987]  dump_stack+0x80/0xa4
| [    0.319989]  __schedule_bug+0x94/0xb0
| [    0.319991]  __schedule+0x510/0x560
| [    0.319992]  schedule+0x38/0xe8
| [    0.319994]  rt_spin_lock_slowlock_locked+0xf0/0x278
| [    0.319996]  rt_spin_lock_slowlock+0x5c/0x90
| [    0.319998]  rt_spin_lock+0x54/0x58
| [    0.320000]  enable_smccc_arch_workaround_1+0xdc/0x260
| [    0.320001]  __enable_cpu_capability+0x10/0x20
| [    0.320003]  multi_cpu_stop+0x84/0x108
| [    0.320004]  cpu_stopper_thread+0x84/0x108
| [    0.320008]  smpboot_thread_fn+0x1e8/0x2b0
| [    0.320009]  kthread+0x124/0x128
| [    0.320010]  ret_from_fork+0x10/0x18

Switch this to a raw spinlock, as we know this is only called with
IRQs masked.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit 5a86d68 ]

When network namespace is destroyed, cleanup_net() is called.
cleanup_net() holds pernet_ops_rwsem then calls each ->exit callback.
So that clusterip_tg_destroy() is called by cleanup_net().
And clusterip_tg_destroy() calls unregister_netdevice_notifier().

But both cleanup_net() and clusterip_tg_destroy() hold same
lock(pernet_ops_rwsem). hence deadlock occurrs.

After this patch, only 1 notifier is registered when module is inserted.
And all of configs are added to per-net list.

test commands:
   %ip netns add vm1
   %ip netns exec vm1 bash
   %ip link set lo up
   %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
	-j CLUSTERIP --new --hashmode sourceip \
	--clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
   %exit
   %ip netns del vm1

splat looks like:
[  341.809674] ============================================
[  341.809674] WARNING: possible recursive locking detected
[  341.809674] 4.19.0-rc5+ MIPS#16 Tainted: G        W
[  341.809674] --------------------------------------------
[  341.809674] kworker/u4:2/87 is trying to acquire lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460
[  341.809674]
[  341.809674] but task is already holding lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] other info that might help us debug this:
[  341.809674]  Possible unsafe locking scenario:
[  341.809674]
[  341.809674]        CPU0
[  341.809674]        ----
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]
[  341.809674]  *** DEADLOCK ***
[  341.809674]
[  341.809674]  May be due to missing lock nesting notation
[  341.809674]
[  341.809674] 3 locks held by kworker/u4:2/87:
[  341.809674]  #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0
[  341.809674]  #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0
[  341.809674]  #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] stack backtrace:
[  341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G        W         4.19.0-rc5+ MIPS#16
[  341.809674] Workqueue: netns cleanup_net
[  341.809674] Call Trace:
[ ... ]
[  342.070196]  down_write+0x93/0x160
[  342.070196]  ? unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? down_read+0x1e0/0x1e0
[  342.070196]  ? sched_clock_cpu+0x126/0x170
[  342.070196]  ? find_held_lock+0x39/0x1c0
[  342.070196]  unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? register_netdevice_notifier+0x790/0x790
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.070196]  ? trace_hardirqs_on+0x93/0x210
[  342.070196]  ? __bpf_trace_preemptirq_template+0x10/0x10
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.123094]  clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP]
[  342.123094]  ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP]
[  342.123094]  ? cleanup_match+0x17d/0x200 [ip_tables]
[  342.123094]  ? xt_unregister_table+0x215/0x300 [x_tables]
[  342.123094]  ? kfree+0xe2/0x2a0
[  342.123094]  cleanup_entry+0x1d5/0x2f0 [ip_tables]
[  342.123094]  ? cleanup_match+0x200/0x200 [ip_tables]
[  342.123094]  __ipt_unregister_table+0x9b/0x1a0 [ip_tables]
[  342.123094]  iptable_filter_net_exit+0x43/0x80 [iptable_filter]
[  342.123094]  ops_exit_list.isra.10+0x94/0x140
[  342.123094]  cleanup_net+0x45b/0x900
[ ... ]

Fixes: 202f59a ("netfilter: ipt_CLUSTERIP: do not hold dev")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit 4f4b374 ]

This is the much more correct fix for my earlier attempt at:

https://lkml.org/lkml/2018/12/10/118

Short recap:

- There's not actually a locking issue, it's just lockdep being a bit
  too eager to complain about a possible deadlock.

- Contrary to what I claimed the real problem is recursion on
  kn->count. Greg pointed me at sysfs_break_active_protection(), used
  by the scsi subsystem to allow a sysfs file to unbind itself. That
  would be a real deadlock, which isn't what's happening here. Also,
  breaking the active protection means we'd need to manually handle
  all the lifetime fun.

- With Rafael we discussed the task_work approach, which kinda works,
  but has two downsides: It's a functional change for a lockdep
  annotation issue, and it won't work for the bind file (which needs
  to get the errno from the driver load function back to userspace).

- Greg also asked why this never showed up: To hit this you need to
  unregister a 2nd driver from the unload code of your first driver. I
  guess only gpus do that. The bug has always been there, but only
  with a recent patch series did we add more locks so that lockdep
  built a chain from unbinding the snd-hda driver to the
  acpi_video_unregister call.

Full lockdep splat:

[12301.898799] ============================================
[12301.898805] WARNING: possible recursive locking detected
[12301.898811] 4.20.0-rc7+ MIPS#84 Not tainted
[12301.898815] --------------------------------------------
[12301.898821] bash/5297 is trying to acquire lock:
[12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80
[12301.898841] but task is already holding lock:
[12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898856] other info that might help us debug this:
[12301.898862]  Possible unsafe locking scenario:
[12301.898867]        CPU0
[12301.898870]        ----
[12301.898874]   lock(kn->count#39);
[12301.898879]   lock(kn->count#39);
[12301.898883] *** DEADLOCK ***
[12301.898891]  May be due to missing lock nesting notation
[12301.898899] 5 locks held by bash/5297:
[12301.898903]  #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0
[12301.898915]  #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190
[12301.898925]  #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898936]  #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240
[12301.898950]  #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40
[12301.898960] stack backtrace:
[12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ MIPS#84
[12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[12301.898982] Call Trace:
[12301.898989]  dump_stack+0x67/0x9b
[12301.898997]  __lock_acquire+0x6ad/0x1410
[12301.899003]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899010]  ? find_held_lock+0x2d/0x90
[12301.899017]  ? mutex_spin_on_owner+0xe4/0x150
[12301.899023]  ? find_held_lock+0x2d/0x90
[12301.899030]  ? lock_acquire+0x90/0x180
[12301.899036]  lock_acquire+0x90/0x180
[12301.899042]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899049]  __kernfs_remove+0x296/0x310
[12301.899055]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899060]  ? kernfs_name_hash+0xd/0x80
[12301.899066]  ? kernfs_find_ns+0x6c/0x100
[12301.899073]  kernfs_remove_by_name_ns+0x3b/0x80
[12301.899080]  bus_remove_driver+0x92/0xa0
[12301.899085]  acpi_video_unregister+0x24/0x40
[12301.899127]  i915_driver_unload+0x42/0x130 [i915]
[12301.899160]  i915_pci_remove+0x19/0x30 [i915]
[12301.899169]  pci_device_remove+0x36/0xb0
[12301.899176]  device_release_driver_internal+0x185/0x240
[12301.899183]  unbind_store+0xaf/0x180
[12301.899189]  kernfs_fop_write+0x104/0x190
[12301.899195]  __vfs_write+0x31/0x180
[12301.899203]  ? rcu_read_lock_sched_held+0x6f/0x80
[12301.899209]  ? rcu_sync_lockdep_assert+0x29/0x50
[12301.899216]  ? __sb_start_write+0x13c/0x1a0
[12301.899221]  ? vfs_write+0x17f/0x1b0
[12301.899227]  vfs_write+0xb9/0x1b0
[12301.899233]  ksys_write+0x50/0xc0
[12301.899239]  do_syscall_64+0x4b/0x180
[12301.899247]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[12301.899253] RIP: 0033:0x7f452ac7f7a4
[12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
[12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4
[12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001
[12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730
[12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d
[12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d

Looking around I've noticed that usb and i2c already handle similar
recursion problems, where a sysfs file can unbind the same type of
sysfs somewhere else in the hierarchy. Relevant commits are:

commit 356c05d
Author: Alan Stern <[email protected]>
Date:   Mon May 14 13:30:03 2012 -0400

    sysfs: get rid of some lockdep false positives

commit e9b526f
Author: Alexander Sverdlin <[email protected]>
Date:   Fri May 17 14:56:35 2013 +0200

    i2c: suppress lockdep warning on delete_device

Implement the same trick for driver bind/unbind.

v2: Put the macro into bus.c (Greg).

Reviewed-by: Rafael J. Wysocki <[email protected]>
Cc: Ramalingam C <[email protected]>
Cc: Arend van Spriel <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Heikki Krogerus <[email protected]>
Cc: Vivek Gautam <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit 532e1e5 ]

mount.ocfs2 ignore the inconsistent error that journal is clean but
local alloc is unrecovered.  After mount, local alloc not empty, then
reserver cluster didn't alloc a new local alloc window, reserveration
map is empty(ocfs2_reservation_map.m_bitmap_len = 0), that triggered the
following panic.

This issue was reported at

  https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html

and was advised to fixed during mount.  But this is a very unusual
inconsistent state, usually journal dirty flag should be cleared at the
last stage of umount until every other things go right.  We may need do
further debug to check that.  Any way to avoid possible futher
corruption, mount should be abort and fsck should be run.

  (mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered!
  found = 6518, set = 6518, taken = 8192, off = 15912372
  ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode.
  o2dlm: Joining domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes
  ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode.
  o2hb: Region 89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device
  o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777
  o2dlm: Node 7 joins domain 64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
  o2dlm: Node 7 joins domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
  ------------[ cut here ]------------
  kernel BUG at fs/ocfs2/reservations.c:507!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
  CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2
  Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018
  task: ffff8803fb04e200 ti: ffff8800ea4d8000 task.ti: ffff8800ea4d8000
  RIP: 0010:[<ffffffffa05e96a8>]  [<ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
  Call Trace:
    ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2]
    ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2]
    __ocfs2_claim_clusters+0x178/0x360 [ocfs2]
    ocfs2_claim_clusters+0x1f/0x30 [ocfs2]
    ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2]
    ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2]
    ocfs2_write_begin+0x13e/0x230 [ocfs2]
    generic_perform_write+0xbf/0x1c0
    __generic_file_write_iter+0x19c/0x1d0
    ocfs2_file_write_iter+0x589/0x1360 [ocfs2]
    __vfs_write+0xb8/0x110
    vfs_write+0xa9/0x1b0
    SyS_write+0x46/0xb0
    system_call_fastpath+0x18/0xd7
  Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85
  RIP   __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
   RSP <ffff8800ea4db668>
  ---[ end trace 566f07529f2edf3c ]---
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: disabled

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Junxiao Bi <[email protected]>
Reviewed-by: Yiwen Jiang <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Changwei Ge <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit 427c5ce ]

As of v4.20, the swim3 driver crashes when loaded on a PowerBook G3
(Wallstreet).

MacIO PCI driver attached to Gatwick chipset
MacIO PCI driver attached to Heathrow chipset
swim3 0.00015000:floppy: [fd0] SWIM3 floppy controller in media bay
0.00013020:ch-a: ttyS0 at MMIO 0xf3013020 (irq = 16, base_baud = 230400) is a Z85c30 ESCC - Serial port
0.00013000:ch-b: ttyS1 at MMIO 0xf3013000 (irq = 17, base_baud = 230400) is a Z85c30 ESCC - Infrared port
macio: fixed media-bay irq on gatwick
macio: fixed left floppy irqs
swim3 1.00015000:floppy: [fd1] Couldn't request interrupt
Unable to handle kernel paging request for data at address 0x00000024
Faulting instruction address: 0xc02652f8
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0 #2
NIP:  c02652f8 LR: c026915c CTR: c0276d1c
REGS: df43ba10 TRAP: 0300   Not tainted  (4.20.0)
MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 28228288  XER: 00000100
DAR: 00000024 DSISR: 40000000
GPR00: c026915c df43bac0 df439060 c0731524 df494700 00000000 c06e1c08 00000001
GPR08: 00000001 00000000 df5ff220 00001032 28228282 00000000 c0004ca4 00000000
GPR16: 00000000 00000000 00000000 c073144c dfffe064 c0731524 00000120 c0586108
GPR24: c073132c c073143c c073143c 00000000 c0731524 df67cd70 df494700 00000001
NIP [c02652f8] blk_mq_free_rqs+0x28/0xf8
LR [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
Call Trace:
[df43bac0] [c0045f50] flush_workqueue_prep_pwqs+0x178/0x1c4 (unreliable)
[df43bae0] [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
[df43bb00] [c02697f0] blk_mq_exit_sched+0x9c/0xb8
[df43bb20] [c0252794] elevator_exit+0x84/0xa4
[df43bb40] [c0256538] blk_exit_queue+0x30/0x50
[df43bb50] [c0256640] blk_cleanup_queue+0xe8/0x184
[df43bb70] [c034732c] swim3_attach+0x330/0x5f0
[df43bbb0] [c034fb24] macio_device_probe+0x58/0xec
[df43bbd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43bc00] [c032bd28] driver_probe_device+0x64/0x204
[df43bc20] [c0329ac4] bus_for_each_drv+0x60/0xac
[df43bc50] [c032b824] __device_attach+0xe8/0x160
[df43bc80] [c032ab38] bus_probe_device+0xa0/0xbc
[df43bca0] [c0327338] device_add+0x3d8/0x630
[df43bcf0] [c0350848] macio_add_one_device+0x444/0x48c
[df43bd50] [c03509f8] macio_pci_add_devices+0x168/0x1bc
[df43bd90] [c03500ec] macio_pci_probe+0xc0/0x10c
[df43bda0] [c02ad884] pci_device_probe+0xd4/0x184
[df43bdd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43be00] [c032bd28] driver_probe_device+0x64/0x204
[df43be20] [c032bfcc] __driver_attach+0x104/0x108
[df43be40] [c0329a00] bus_for_each_dev+0x64/0xb4
[df43be70] [c032add8] bus_add_driver+0x154/0x238
[df43be90] [c032ca24] driver_register+0x84/0x148
[df43bea0] [c0004aa0] do_one_initcall+0x40/0x188
[df43bf00] [c0690100] kernel_init_freeable+0x138/0x1d4
[df43bf30] [c0004cbc] kernel_init+0x18/0x10c
[df43bf40] [c00121e4] ret_from_kernel_thread+0x14/0x1c
Instruction dump:
5484d97e 4bfff4f4 9421ffe0 7c0802a6 bf410008 7c9e2378 90010024 8124005c
2f890000 419e0078 81230004 7c7c1b78 <81290024> 2f890000 419e0064 81440000
---[ end trace 12025ab921a9784c ]---

Reverting commit 8ccb8cb ("swim3: convert to blk-mq") resolves the
problem.

That commit added a struct blk_mq_tag_set to struct floppy_state and
initialized it with a blk_mq_init_sq_queue() call. Unfortunately, there
is a memset() in swim3_add_device() that subsequently clears the
floppy_state struct. That means fs->tag_set->ops is a NULL pointer, and
it gets dereferenced by blk_mq_free_rqs() which gets called in the
request_irq() error path. Move the memset() to fix this bug.

BTW, the request_irq() failure for the left mediabay floppy (fd1) is not
a regression. I don't know why it happens. The right media bay floppy
(fd0) works fine however.

Reported-and-tested-by: Stan Johnson <[email protected]>
Fixes: 8ccb8cb ("swim3: convert to blk-mq")
Cc: [email protected]
Signed-off-by: Finn Thain <[email protected]>

Signed-off-by: Jens Axboe <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
[ Upstream commit 6347244 ]

Since __sanitizer_cov_trace_const_cmp4 is marked as notrace, the
function called from __sanitizer_cov_trace_const_cmp4 shouldn't be
traceable either.  ftrace_graph_caller() gets called every time func
write_comp_data() gets called if it isn't marked 'notrace'.  This is the
backtrace from gdb:

 #0  ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 #1  0xffffff8010201920 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
 #2  0xffffff8010439714 in write_comp_data (type=5, arg1=0, arg2=0, ip=18446743524224276596) at ../kernel/kcov.c:116
 #3  0xffffff8010439894 in __sanitizer_cov_trace_const_cmp4 (arg1=<optimized out>, arg2=<optimized out>) at ../kernel/kcov.c:188
 #4  0xffffff8010201874 in prepare_ftrace_return (self_addr=18446743524226602768, parent=0xffffff801014b918, frame_pointer=18446743524223531344) at ./include/generated/atomic-instrumented.h:27
 #5  0xffffff801020194c in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182

Rework so that write_comp_data() that are called from
__sanitizer_cov_trace_*_cmp*() are marked as 'notrace'.

Commit 903e8ff ("kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace")
missed to mark write_comp_data() as 'notrace'. When that patch was
created gcc-7 was used. In lib/Kconfig.debug
config KCOV_ENABLE_COMPARISONS
	depends on $(cc-option,-fsanitize-coverage=trace-cmp)

That code path isn't hit with gcc-7. However, it were that with gcc-8.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Anders Roxell <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Co-developed-by: Arnd Bergmann <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
commit bb61b84 upstream.

Presently when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with cpu thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:

cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
       Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1    D    0   890      2 0x00000808
Workqueue: events work_for_cpu_fn

Call Trace:
 0x4d72136320 (unreliable)
 __switch_to+0x2cc/0x460
 __schedule+0x2bc/0xac0
 schedule+0x40/0xb0
 cxlflash_remove+0xec/0x640 [cxlflash]
 cxlflash_probe+0x370/0x8f0 [cxlflash]
 local_pci_probe+0x6c/0x140
 work_for_cpu_fn+0x38/0x60
 process_one_work+0x260/0x530
 worker_thread+0x280/0x5d0
 kthread+0x1a8/0x1b0
 ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.

The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed the 'cxlflash_cfg->state' never changes from
STATE_PROBING hence the deadlock occurs.

We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().

Cc: [email protected]
Fixes: c21e0bb("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 24, 2019
commit b284909 upstream.

With the following commit:

  73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")

... the hotplug code attempted to detect when SMT was disabled by BIOS,
in which case it reported SMT as permanently disabled.  However, that
code broke a virt hotplug scenario, where the guest is booted with only
primary CPU threads, and a sibling is brought online later.

The problem is that there doesn't seem to be a way to reliably
distinguish between the HW "SMT disabled by BIOS" case and the virt
"sibling not yet brought online" case.  So the above-mentioned commit
was a bit misguided, as it permanently disabled SMT for both cases,
preventing future virt sibling hotplugs.

Going back and reviewing the original problems which were attempted to
be solved by that commit, when SMT was disabled in BIOS:

  1) /sys/devices/system/cpu/smt/control showed "on" instead of
     "notsupported"; and

  2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.

I'd propose that we instead consider #1 above to not actually be a
problem.  Because, at least in the virt case, it's possible that SMT
wasn't disabled by BIOS and a sibling thread could be brought online
later.  So it makes sense to just always default the smt control to "on"
to allow for that possibility (assuming cpuid indicates that the CPU
supports SMT).

The real problem is #2, which has a simple fix: change vmx_vm_init() to
query the actual current SMT state -- i.e., whether any siblings are
currently online -- instead of looking at the SMT "control" sysfs value.

So fix it by:

  a) reverting the original "fix" and its followup fix:

     73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")
     bc2d8d2 ("cpu/hotplug: Fix SMT supported evaluation")

     and

  b) changing vmx_vm_init() to query the actual current SMT state --
     instead of the sysfs control value -- to determine whether the L1TF
     warning is needed.  This also requires the 'sched_smt_present'
     variable to exported, instead of 'cpu_smt_control'.

Fixes: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Igor Mammedov <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Feb 28, 2019
On ESP output, sk_wmem_alloc is incremented for the added padding if a
socket is associated to the skb. When replying with TCP SYNACKs over
IPsec, the associated sk is a casted request socket, only. Increasing
sk_wmem_alloc on a request socket results in a write at an arbitrary
struct offset. In the best case, this produces the following WARNING:

WARNING: CPU: 1 PID: 0 at lib/refcount.c:102 esp_output_head+0x2e4/0x308 [esp4]
refcount_t: addition on 0; use-after-free.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc3 #2
Hardware name: Marvell Armada 380/385 (Device Tree)
[...]
[<bf0ff354>] (esp_output_head [esp4]) from [<bf1006a4>] (esp_output+0xb8/0x180 [esp4])
[<bf1006a4>] (esp_output [esp4]) from [<c05dee64>] (xfrm_output_resume+0x558/0x664)
[<c05dee64>] (xfrm_output_resume) from [<c05d07b0>] (xfrm4_output+0x44/0xc4)
[<c05d07b0>] (xfrm4_output) from [<c05956bc>] (tcp_v4_send_synack+0xa8/0xe8)
[<c05956bc>] (tcp_v4_send_synack) from [<c0586ad8>] (tcp_conn_request+0x7f4/0x948)
[<c0586ad8>] (tcp_conn_request) from [<c058c404>] (tcp_rcv_state_process+0x2a0/0xe64)
[<c058c404>] (tcp_rcv_state_process) from [<c05958ac>] (tcp_v4_do_rcv+0xf0/0x1f4)
[<c05958ac>] (tcp_v4_do_rcv) from [<c0598a4c>] (tcp_v4_rcv+0xdb8/0xe20)
[<c0598a4c>] (tcp_v4_rcv) from [<c056eb74>] (ip_protocol_deliver_rcu+0x2c/0x2dc)
[<c056eb74>] (ip_protocol_deliver_rcu) from [<c056ee6c>] (ip_local_deliver_finish+0x48/0x54)
[<c056ee6c>] (ip_local_deliver_finish) from [<c056eecc>] (ip_local_deliver+0x54/0xec)
[<c056eecc>] (ip_local_deliver) from [<c056efac>] (ip_rcv+0x48/0xb8)
[<c056efac>] (ip_rcv) from [<c0519c2c>] (__netif_receive_skb_one_core+0x50/0x6c)
[...]

The issue triggers only when not using TCP syncookies, as for syncookies
no socket is associated.

Fixes: cac2661 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30 ("esp6: Avoid skb_cow_data whenever possible")
Signed-off-by: Martin Willi <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Mar 2, 2019
This is the much more correct fix for my earlier attempt at:

https://lkml.org/lkml/2018/12/10/118

Short recap:

- There's not actually a locking issue, it's just lockdep being a bit
  too eager to complain about a possible deadlock.

- Contrary to what I claimed the real problem is recursion on
  kn->count. Greg pointed me at sysfs_break_active_protection(), used
  by the scsi subsystem to allow a sysfs file to unbind itself. That
  would be a real deadlock, which isn't what's happening here. Also,
  breaking the active protection means we'd need to manually handle
  all the lifetime fun.

- With Rafael we discussed the task_work approach, which kinda works,
  but has two downsides: It's a functional change for a lockdep
  annotation issue, and it won't work for the bind file (which needs
  to get the errno from the driver load function back to userspace).

- Greg also asked why this never showed up: To hit this you need to
  unregister a 2nd driver from the unload code of your first driver. I
  guess only gpus do that. The bug has always been there, but only
  with a recent patch series did we add more locks so that lockdep
  built a chain from unbinding the snd-hda driver to the
  acpi_video_unregister call.

Full lockdep splat:

[12301.898799] ============================================
[12301.898805] WARNING: possible recursive locking detected
[12301.898811] 4.20.0-rc7+ MIPS#84 Not tainted
[12301.898815] --------------------------------------------
[12301.898821] bash/5297 is trying to acquire lock:
[12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80
[12301.898841] but task is already holding lock:
[12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898856] other info that might help us debug this:
[12301.898862]  Possible unsafe locking scenario:
[12301.898867]        CPU0
[12301.898870]        ----
[12301.898874]   lock(kn->count#39);
[12301.898879]   lock(kn->count#39);
[12301.898883] *** DEADLOCK ***
[12301.898891]  May be due to missing lock nesting notation
[12301.898899] 5 locks held by bash/5297:
[12301.898903]  #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0
[12301.898915]  #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190
[12301.898925]  #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898936]  #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240
[12301.898950]  #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40
[12301.898960] stack backtrace:
[12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ MIPS#84
[12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[12301.898982] Call Trace:
[12301.898989]  dump_stack+0x67/0x9b
[12301.898997]  __lock_acquire+0x6ad/0x1410
[12301.899003]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899010]  ? find_held_lock+0x2d/0x90
[12301.899017]  ? mutex_spin_on_owner+0xe4/0x150
[12301.899023]  ? find_held_lock+0x2d/0x90
[12301.899030]  ? lock_acquire+0x90/0x180
[12301.899036]  lock_acquire+0x90/0x180
[12301.899042]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899049]  __kernfs_remove+0x296/0x310
[12301.899055]  ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899060]  ? kernfs_name_hash+0xd/0x80
[12301.899066]  ? kernfs_find_ns+0x6c/0x100
[12301.899073]  kernfs_remove_by_name_ns+0x3b/0x80
[12301.899080]  bus_remove_driver+0x92/0xa0
[12301.899085]  acpi_video_unregister+0x24/0x40
[12301.899127]  i915_driver_unload+0x42/0x130 [i915]
[12301.899160]  i915_pci_remove+0x19/0x30 [i915]
[12301.899169]  pci_device_remove+0x36/0xb0
[12301.899176]  device_release_driver_internal+0x185/0x240
[12301.899183]  unbind_store+0xaf/0x180
[12301.899189]  kernfs_fop_write+0x104/0x190
[12301.899195]  __vfs_write+0x31/0x180
[12301.899203]  ? rcu_read_lock_sched_held+0x6f/0x80
[12301.899209]  ? rcu_sync_lockdep_assert+0x29/0x50
[12301.899216]  ? __sb_start_write+0x13c/0x1a0
[12301.899221]  ? vfs_write+0x17f/0x1b0
[12301.899227]  vfs_write+0xb9/0x1b0
[12301.899233]  ksys_write+0x50/0xc0
[12301.899239]  do_syscall_64+0x4b/0x180
[12301.899247]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[12301.899253] RIP: 0033:0x7f452ac7f7a4
[12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
[12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4
[12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001
[12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730
[12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d
[12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d

Locking around I've noticed that usb and i2c already handle similar
recursion problems, where a sysfs file can unbind the same type of
sysfs somewhere else in the hierarchy. Relevant commits are:

commit 356c05d
Author: Alan Stern <[email protected]>
Date:   Mon May 14 13:30:03 2012 -0400

    sysfs: get rid of some lockdep false positives

commit e9b526f
Author: Alexander Sverdlin <[email protected]>
Date:   Fri May 17 14:56:35 2013 +0200

    i2c: suppress lockdep warning on delete_device

Implement the same trick for driver bind/unbind.

v2: Put the macro into bus.c (Greg).

Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Ramalingam C <[email protected]>
Cc: Arend van Spriel <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Heikki Krogerus <[email protected]>
Cc: Vivek Gautam <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>

Note that Greg already merged this, so this should drop out as soon as
we've moved forward past 4.21-rc1.
-Daniel
pcercuei referenced this issue in OpenDingux/linux Mar 2, 2019
The idea of taking the reset lock around writing the fence register was
to serialise the mmio write we also perform during the reset where those
registers get clobbered. However, the lock is overkill as write tearing
between reset and fence_update() is harmless; the final value of the
fence register is the same. A race between revoke_fences() and
fence_update() is also harmless at this point as on the fault path where
this is necessary, we acquire the reset lock to coordinate ourselves in
the upper layer.

The danger of acquiring the reset lock again in fence_update() is that
we may recurse from the shrinker along the i915_gem_fault() path.

<4> [125.739646] ============================================
<4> [125.739652] WARNING: possible recursive locking detected
<4> [125.739659] 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 Tainted: G     U
<4> [125.739666] --------------------------------------------
<4> [125.739672] gem_mmap_gtt/1017 is trying to acquire lock:
<4> [125.739679] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x0/0x310 [i915]
<4> [125.739848]
but task is already holding lock:
<4> [125.739854] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.739918]
other info that might help us debug this:
<4> [125.739925]  Possible unsafe locking scenario:

<4> [125.739930]        CPU0
<4> [125.739934]        ----
<4> [125.739937]   lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739944]   lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739950]
 *** DEADLOCK ***

<4> [125.739958]  May be due to missing lock nesting notation

<4> [125.739966] 5 locks held by gem_mmap_gtt/1017:
<4> [125.739972]  #0: 00000000471f682c (&mm->mmap_sem){++++}, at: __do_page_fault+0x133/0x500
<4> [125.739987]  #1: 0000000026542685 (&dev->struct_mutex){+.+.}, at: i915_gem_fault+0x1f6/0x860 [i915]
<4> [125.740061]  #2: 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.740126]  #3: 00000000c828eb4f (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.25+0x0/0x30
<4> [125.740140]  #4: 000000002d360d65 (shrinker_rwsem){++++}, at: shrink_slab+0x1cb/0x2c0
<4> [125.740151]
stack backtrace:
<4> [125.740159] CPU: 1 PID: 1017 Comm: gem_mmap_gtt Tainted: G     U            5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1
<4> [125.740170] Hardware name: Dell Inc.                 OptiPlex 745                 /0GW726, BIOS 2.3.1  05/21/2007
<4> [125.740180] Call Trace:
<4> [125.740189]  dump_stack+0x67/0x9b
<4> [125.740199]  __lock_acquire+0xc75/0x1b00
<4> [125.740209]  ? arch_tlb_finish_mmu+0x2a/0xa0
<4> [125.740216]  ? tlb_finish_mmu+0x1a/0x30
<4> [125.740222]  ? zap_page_range_single+0xe2/0x130
<4> [125.740230]  ? lock_acquire+0xa6/0x1c0
<4> [125.740237]  lock_acquire+0xa6/0x1c0
<4> [125.740296]  ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740357]  i915_reset_trylock+0x44/0x310 [i915]
<4> [125.740417]  ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740426]  ? lockdep_hardirqs_on+0xe0/0x1b0
<4> [125.740434]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [125.740499]  fence_update+0x218/0x470 [i915]
<4> [125.740571]  i915_vma_unbind+0xa6/0x550 [i915]
<4> [125.740640]  i915_gem_object_unbind+0xfa/0x190 [i915]
<4> [125.740711]  i915_gem_shrink+0x2dc/0x590 [i915]
<4> [125.740722]  ? ___preempt_schedule+0x16/0x18
<4> [125.740792]  ? i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740861]  i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740870]  do_shrink_slab+0x143/0x3f0
<4> [125.740878]  shrink_slab+0x228/0x2c0
<4> [125.740886]  shrink_node+0x167/0x450
<4> [125.740894]  do_try_to_free_pages+0xc4/0x340
<4> [125.740902]  try_to_free_pages+0xdc/0x2e0
<4> [125.740911]  __alloc_pages_nodemask+0x662/0x1110
<4> [125.740921]  ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740928]  ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740986]  ? i915_reset_trylock+0x192/0x310 [i915]
<4> [125.741045]  ? i915_memcpy_init_early+0x30/0x30 [i915]
<4> [125.741054]  pte_alloc_one+0x12/0x70
<4> [125.741060]  __pte_alloc+0x11/0xf0
<4> [125.741067]  apply_to_page_range+0x37e/0x440
<4> [125.741127]  remap_io_mapping+0x6c/0x100 [i915]
<4> [125.741196]  i915_gem_fault+0x5a9/0x860 [i915]
<4> [125.741204]  ? ptlock_alloc+0x15/0x30
<4> [125.741212]  __do_fault+0x2c/0xb0
<4> [125.741218]  __handle_mm_fault+0x8ee/0xfa0
<4> [125.741227]  handle_mm_fault+0x196/0x3a0
<4> [125.741235]  __do_page_fault+0x246/0x500
<4> [125.741243]  ? page_fault+0x8/0x30
<4> [125.741250]  page_fault+0x1e/0x30
<4> [125.741256] RIP: 0033:0x55d0cc456e12
<4> [125.741264] Code: b0 df ff ff 89 c2 8b 85 70 df ff ff 01 c2 8b 85 70 df ff ff 48 98 48 8d 0c 85 00 00 00 00 48 8b 85 e0 df ff ff 48 01 c8 f7 d2 <89> 10 83 85 70 df ff ff 01 81 bd 70 df ff ff ff 03 00 00 7e be 48
<4> [125.741280] RSP: 002b:00007ffc1bab7ab0 EFLAGS: 00010206
<4> [125.741287] RAX: 00007fc787cb6000 RBX: 0000000000000000 RCX: 0000000000000000
<4> [125.741295] RDX: 00000000ffffffff RSI: 0000000000005401 RDI: 0000000000000002
<4> [125.741303] RBP: 00007ffc1bab9b70 R08: 00007ffc1bab7920 R09: 000000000000001b
<4> [125.741310] R10: 7165722074736554 R11: 0000000000000246 R12: 000055d0cc454a80
<4> [125.741318] R13: 00007ffc1bab9f60 R14: 0000000000000000 R15: 0000000000000000

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109665
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
pcercuei referenced this issue in OpenDingux/linux Apr 14, 2019
Reference counting in amdgpu_dm_connector for amdgpu_dm_connector::dc_sink
and amdgpu_dm_connector::dc_em_sink as well as in dc_link::local_sink seems
to be out of shape. Thus make reference counting consistent for these
members and just plain increment the reference count when the variable
gets assigned and decrement when the pointer is set to zero or replaced.
Also simplify reference counting in selected function sopes to be sure the
reference is released in any case. In some cases add NULL pointer check
before dereferencing.
At a hand full of places a comment is placed to stat that the reference
increment happened already somewhere else.

This actually fixes the following kernel bug on my system when enabling
display core in amdgpu. There are some more similar bug reports around,
so it probably helps at more places.

   kernel BUG at mm/slub.c:294!
   invalid opcode: 0000 [#1] SMP PTI
   CPU: 9 PID: 1180 Comm: Xorg Not tainted 5.0.0-rc1+ #2
   Hardware name: Supermicro X10DAi/X10DAI, BIOS 3.0a 02/05/2018
   RIP: 0010:__slab_free+0x1e2/0x3d0
   Code: 8b 54 24 30 48 89 4c 24 28 e8 da fb ff ff 4c 8b 54 24 28 85 c0 0f 85 67 fe ff ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 49 3b 5c 24 28 75 ab 48 8b 44 24 30 49 89 4c 24 28 49 89 44
   RSP: 0018:ffffb0978589fa90 EFLAGS: 00010246
   RAX: ffff92f12806c400 RBX: 0000000080200019 RCX: ffff92f12806c400
   RDX: ffff92f12806c400 RSI: ffffdd6421a01a00 RDI: ffff92ed2f406e80
   RBP: ffffb0978589fb40 R08: 0000000000000001 R09: ffffffffc0ee4748
   R10: ffff92f12806c400 R11: 0000000000000001 R12: ffffdd6421a01a00
   R13: ffff92f12806c400 R14: ffff92ed2f406e80 R15: ffffdd6421a01a20
   FS:  00007f4170be0ac0(0000) GS:ffff92ed2fb40000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 0000562818aaa000 CR3: 000000045745a002 CR4: 00000000003606e0
   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   Call Trace:
    ? drm_dbg+0x87/0x90 [drm]
    dc_stream_release+0x28/0x50 [amdgpu]
    amdgpu_dm_connector_mode_valid+0xb4/0x1f0 [amdgpu]
    drm_helper_probe_single_connector_modes+0x492/0x6b0 [drm_kms_helper]
    drm_mode_getconnector+0x457/0x490 [drm]
    ? drm_connector_property_set_ioctl+0x60/0x60 [drm]
    drm_ioctl_kernel+0xa9/0xf0 [drm]
    drm_ioctl+0x201/0x3a0 [drm]
    ? drm_connector_property_set_ioctl+0x60/0x60 [drm]
    amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
    do_vfs_ioctl+0xa4/0x630
    ? __sys_recvmsg+0x83/0xa0
    ksys_ioctl+0x60/0x90
    __x64_sys_ioctl+0x16/0x20
    do_syscall_64+0x5b/0x160
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
   RIP: 0033:0x7f417110809b
   Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48
   RSP: 002b:00007ffdd8d1c268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
   RAX: ffffffffffffffda RBX: 0000562818a8ebc0 RCX: 00007f417110809b
   RDX: 00007ffdd8d1c2a0 RSI: 00000000c05064a7 RDI: 0000000000000012
   RBP: 00007ffdd8d1c2a0 R08: 0000562819012280 R09: 0000000000000007
   R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c05064a7
   R13: 0000000000000012 R14: 0000000000000012 R15: 00007ffdd8d1c2a0
   Modules linked in: nfsv4 dns_resolver nfs lockd grace fscache fuse vfat fat amdgpu intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul chash gpu_sched crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel amd_iommu_v2 iTCO_wdt iTCO_vendor_support ttm snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel drm_kms_helper snd_hda_codec intel_cstate snd_hda_core drm snd_hwdep snd_seq snd_seq_device intel_uncore snd_pcm intel_rapl_perf snd_timer snd soundcore ioatdma pcspkr intel_wmi_thunderbolt mxm_wmi i2c_i801 lpc_ich pcc_cpufreq auth_rpcgss sunrpc igb crc32c_intel i2c_algo_bit dca wmi hid_cherry analog gameport joydev

This patch is based on agd5f/drm-next-5.1-wip. This patch does not require
all of that, but agd5f/drm-next-5.1-wip contains at least one more dc_sink
counting fix that I could spot.

Signed-off-by: Mathias Fröhlich <[email protected]>
Reviewed-by: Leo Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Apr 14, 2019
In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
32,so UBSAN complain about it.

UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
__ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
try_6rd net/ipv6/sit.c:806 [inline]
ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
__netdev_start_xmit include/linux/netdevice.h:4300 [inline]
netdev_start_xmit include/linux/netdevice.h:4309 [inline]
xmit_one net/core/dev.c:3243 [inline]
dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
__dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
neigh_output include/net/neighbour.h:501 [inline]
ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xc8/0x110 net/socket.c:631
___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
__sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: linmiaohe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
pcercuei referenced this issue in OpenDingux/linux Apr 14, 2019
Ido Schimmel says:

====================
mlxsw: Various fixes

Patch #1 fixes the recently introduced QSFP thermal zones to correctly
work with split ports, where several ports are mapped to the same
module.

Patch #2 initializes the base MAC in the minimal driver. The driver is
using the base MAC as its parent ID and without initializing it, it is
reported as all zeroes to user space.
====================

Signed-off-by: David S. Miller <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
[ Upstream commit a843dc4 ]

In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
32,so UBSAN complain about it.

UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 MIPS#2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
__ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
try_6rd net/ipv6/sit.c:806 [inline]
ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
__netdev_start_xmit include/linux/netdevice.h:4300 [inline]
netdev_start_xmit include/linux/netdevice.h:4309 [inline]
xmit_one net/core/dev.c:3243 [inline]
dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
__dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
neigh_output include/net/neighbour.h:501 [inline]
ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xc8/0x110 net/socket.c:631
___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
__sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: linmiaohe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
[ Upstream commit cee51c3 ]

There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
this leads to a kernel NULL pointer dereference:

[  310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  310.645053] init: Service 'adbd' is being killed...
[  310.658938] pgd = c9528000
[  310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[  310.669702] Internal error: Oops: 817 [MIPS#1] PREEMPT SMP ARM
[  310.675211] Modules linked in:
[  310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c MIPS#2
[  310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[  310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[  310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[  310.708648] pc : [<c075efc0>]    lr : [<c0bfb0bc>]    psr: 600f0113
<snip..>
[  311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[  311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[  311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[  311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[  311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[  311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[  311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[  311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[  311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)

for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0
when usb IO thread fails, but switch adb from on to off also triggers write
"none" > UDC. These 2 operations both call unregister_gadget, which will lead
to the panic above.

add a mutex before calling unregister_gadget for api used in f_fs.

Signed-off-by: Winter Wang <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
[ Upstream commit 92d1d07 ]

Kmemleak throws endless warnings during boot due to in
__alloc_alien_cache(),

    alc = kmalloc_node(memsize, gfp, node);
    init_arraycache(&alc->ac, entries, batch);
    kmemleak_no_scan(ac);

Kmemleak does not track the array cache (alc->ac) but the alien cache
(alc) instead, so let it track the latter by lifting kmemleak_no_scan()
out of init_arraycache().

There is another place that calls init_arraycache(), but
alloc_kmem_cache_cpus() uses the percpu allocation where will never be
considered as a leak.

  kmemleak: Found object by alias at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   lookup_object+0x84/0xac
   find_and_get_object+0x84/0xe4
   kmemleak_no_scan+0x74/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18
  kmemleak: Object 0xffff8007b9aa7e00 (size 256):
  kmemleak:   comm "swapper/0", pid 1, jiffies 4294697137
  kmemleak:   min_count = 1
  kmemleak:   count = 0
  kmemleak:   flags = 0x1
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       kmemleak_alloc+0x84/0xb8
       kmem_cache_alloc_node_trace+0x31c/0x3a0
       __kmalloc_node+0x58/0x78
       setup_kmem_cache_node+0x26c/0x35c
       __do_tune_cpucache+0x250/0x2d4
       do_tune_cpucache+0x4c/0xe4
       enable_cpucache+0xc8/0x110
       setup_cpu_cache+0x40/0x1b8
       __kmem_cache_create+0x240/0x358
       create_cache+0xc0/0x198
       kmem_cache_create_usercopy+0x158/0x20c
       kmem_cache_create+0x50/0x64
       fsnotify_init+0x58/0x6c
       do_one_initcall+0x194/0x388
       kernel_init_freeable+0x668/0x688
       kernel_init+0x18/0x124
  kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   kmemleak_no_scan+0x90/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 1fe00d5 ("slab: factor out initialization of array cache")
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
[ Upstream commit d982b33 ]

  =================================================================
  ==20875==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 1160 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      MIPS#1 0x55bd50005599 in zalloc util/util.h:23
      MIPS#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
      MIPS#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
      MIPS#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
      MIPS#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
      MIPS#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
      MIPS#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
      MIPS#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
      MIPS#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      MIPS#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      MIPS#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      MIPS#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
      MIPS#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  Indirect leak of 19 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
      MIPS#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
commit 9002b21 upstream.

Commit 32a5ad9 ("sysctl: handle overflow for file-max") hooked up
min/max values for the file-max sysctl parameter via the .extra1 and
.extra2 fields in the corresponding struct ctl_table entry.

Unfortunately, the minimum value points at the global 'zero' variable,
which is an int.  This results in a KASAN splat when accessed as a long
by proc_doulongvec_minmax on 64-bit architectures:

  | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0
  | Read of size 8 at addr ffff2000133d1c20 by task systemd/1
  |
  | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 MIPS#2
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |  dump_backtrace+0x0/0x228
  |  show_stack+0x14/0x20
  |  dump_stack+0xe8/0x124
  |  print_address_description+0x60/0x258
  |  kasan_report+0x140/0x1a0
  |  __asan_report_load8_noabort+0x18/0x20
  |  __do_proc_doulongvec_minmax+0x5d8/0x6a0
  |  proc_doulongvec_minmax+0x4c/0x78
  |  proc_sys_call_handler.isra.19+0x144/0x1d8
  |  proc_sys_write+0x34/0x58
  |  __vfs_write+0x54/0xe8
  |  vfs_write+0x124/0x3c0
  |  ksys_write+0xbc/0x168
  |  __arm64_sys_write+0x68/0x98
  |  el0_svc_common+0x100/0x258
  |  el0_svc_handler+0x48/0xc0
  |  el0_svc+0x8/0xc
  |
  | The buggy address belongs to the variable:
  |  zero+0x0/0x40
  |
  | Memory state around the buggy address:
  |  ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa
  |  ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
  | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
  |                                ^
  |  ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
  |  ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix the splat by introducing a unsigned long 'zero_ul' and using that
instead.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 32a5ad9 ("sysctl: handle overflow for file-max")
Signed-off-by: Will Deacon <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Matteo Croce <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue Jun 16, 2019
[ Upstream commit 47b1682 ]

If xace hardware reports a bad version number, the error handling code
in ace_setup() calls put_disk(), followed by queue cleanup. However, since
the disk data structure has the queue pointer set, put_disk() also
cleans and releases the queue. This results in blk_cleanup_queue()
accessing an already released data structure, which in turn may result
in a crash such as the following.

[   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
[   10.681826] Faulting instruction address: 0xc0431480
[   10.682072] Oops: Kernel access of bad area, sig: 11 [MIPS#1]
[   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
[   10.682387] Modules linked in:
[   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ MIPS#2
[   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
[   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
[   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
[   10.683236] DEAR: 00000040 ESR: 00000000
[   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000
[   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000
[   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000
[   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800
[   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
[   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
[   10.684602] Call Trace:
[   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
[   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   10.686457] [cf82fe50] [c052c46] driver_register+0x88/0x15c
[   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   10.687349] Instruction dump:
[   10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008
[   10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008
[   10.688056] ---[ end trace 13c9ff51d41b9d40 ]---

Fix the problem by setting the disk queue pointer to NULL before calling
put_disk(). A more comprehensive fix might be to rearrange the code
to check the hardware version before initializing data structures,
but I don't know if this would have undesirable side effects, and
it would increase the complexity of backporting the fix to older kernels.

Fixes: 74489a9 ("Add support for Xilinx SystemACE CompactFlash interface")
Acked-by: Michal Simek <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
[ Upstream commit a843dc4 ]

In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
32,so UBSAN complain about it.

UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 MIPS#2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
__ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
try_6rd net/ipv6/sit.c:806 [inline]
ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
__netdev_start_xmit include/linux/netdevice.h:4300 [inline]
netdev_start_xmit include/linux/netdevice.h:4309 [inline]
xmit_one net/core/dev.c:3243 [inline]
dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
__dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
neigh_output include/net/neighbour.h:501 [inline]
ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xc8/0x110 net/socket.c:631
___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
__sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: linmiaohe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
[ Upstream commit cee51c3 ]

There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
this leads to a kernel NULL pointer dereference:

[  310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  310.645053] init: Service 'adbd' is being killed...
[  310.658938] pgd = c9528000
[  310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[  310.669702] Internal error: Oops: 817 [MIPS#1] PREEMPT SMP ARM
[  310.675211] Modules linked in:
[  310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c MIPS#2
[  310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[  310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[  310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[  310.708648] pc : [<c075efc0>]    lr : [<c0bfb0bc>]    psr: 600f0113
<snip..>
[  311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[  311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[  311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[  311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[  311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[  311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[  311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[  311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[  311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)

for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0
when usb IO thread fails, but switch adb from on to off also triggers write
"none" > UDC. These 2 operations both call unregister_gadget, which will lead
to the panic above.

add a mutex before calling unregister_gadget for api used in f_fs.

Signed-off-by: Winter Wang <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
[ Upstream commit 92d1d07 ]

Kmemleak throws endless warnings during boot due to in
__alloc_alien_cache(),

    alc = kmalloc_node(memsize, gfp, node);
    init_arraycache(&alc->ac, entries, batch);
    kmemleak_no_scan(ac);

Kmemleak does not track the array cache (alc->ac) but the alien cache
(alc) instead, so let it track the latter by lifting kmemleak_no_scan()
out of init_arraycache().

There is another place that calls init_arraycache(), but
alloc_kmem_cache_cpus() uses the percpu allocation where will never be
considered as a leak.

  kmemleak: Found object by alias at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   lookup_object+0x84/0xac
   find_and_get_object+0x84/0xe4
   kmemleak_no_scan+0x74/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18
  kmemleak: Object 0xffff8007b9aa7e00 (size 256):
  kmemleak:   comm "swapper/0", pid 1, jiffies 4294697137
  kmemleak:   min_count = 1
  kmemleak:   count = 0
  kmemleak:   flags = 0x1
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       kmemleak_alloc+0x84/0xb8
       kmem_cache_alloc_node_trace+0x31c/0x3a0
       __kmalloc_node+0x58/0x78
       setup_kmem_cache_node+0x26c/0x35c
       __do_tune_cpucache+0x250/0x2d4
       do_tune_cpucache+0x4c/0xe4
       enable_cpucache+0xc8/0x110
       setup_cpu_cache+0x40/0x1b8
       __kmem_cache_create+0x240/0x358
       create_cache+0xc0/0x198
       kmem_cache_create_usercopy+0x158/0x20c
       kmem_cache_create+0x50/0x64
       fsnotify_init+0x58/0x6c
       do_one_initcall+0x194/0x388
       kernel_init_freeable+0x668/0x688
       kernel_init+0x18/0x124
  kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ MIPS#2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   kmemleak_no_scan+0x90/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 1fe00d5 ("slab: factor out initialization of array cache")
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
[ Upstream commit d982b33 ]

  =================================================================
  ==20875==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 1160 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      MIPS#1 0x55bd50005599 in zalloc util/util.h:23
      MIPS#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
      MIPS#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
      MIPS#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
      MIPS#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
      MIPS#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
      MIPS#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
      MIPS#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
      MIPS#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      MIPS#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      MIPS#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      MIPS#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
      MIPS#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  Indirect leak of 19 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
      MIPS#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
commit 9002b21 upstream.

Commit 32a5ad9 ("sysctl: handle overflow for file-max") hooked up
min/max values for the file-max sysctl parameter via the .extra1 and
.extra2 fields in the corresponding struct ctl_table entry.

Unfortunately, the minimum value points at the global 'zero' variable,
which is an int.  This results in a KASAN splat when accessed as a long
by proc_doulongvec_minmax on 64-bit architectures:

  | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0
  | Read of size 8 at addr ffff2000133d1c20 by task systemd/1
  |
  | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 MIPS#2
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |  dump_backtrace+0x0/0x228
  |  show_stack+0x14/0x20
  |  dump_stack+0xe8/0x124
  |  print_address_description+0x60/0x258
  |  kasan_report+0x140/0x1a0
  |  __asan_report_load8_noabort+0x18/0x20
  |  __do_proc_doulongvec_minmax+0x5d8/0x6a0
  |  proc_doulongvec_minmax+0x4c/0x78
  |  proc_sys_call_handler.isra.19+0x144/0x1d8
  |  proc_sys_write+0x34/0x58
  |  __vfs_write+0x54/0xe8
  |  vfs_write+0x124/0x3c0
  |  ksys_write+0xbc/0x168
  |  __arm64_sys_write+0x68/0x98
  |  el0_svc_common+0x100/0x258
  |  el0_svc_handler+0x48/0xc0
  |  el0_svc+0x8/0xc
  |
  | The buggy address belongs to the variable:
  |  zero+0x0/0x40
  |
  | Memory state around the buggy address:
  |  ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa
  |  ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
  | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
  |                                ^
  |  ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
  |  ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix the splat by introducing a unsigned long 'zero_ul' and using that
instead.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 32a5ad9 ("sysctl: handle overflow for file-max")
Signed-off-by: Will Deacon <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Matteo Croce <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue Jan 17, 2020
[ Upstream commit 47b1682 ]

If xace hardware reports a bad version number, the error handling code
in ace_setup() calls put_disk(), followed by queue cleanup. However, since
the disk data structure has the queue pointer set, put_disk() also
cleans and releases the queue. This results in blk_cleanup_queue()
accessing an already released data structure, which in turn may result
in a crash such as the following.

[   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
[   10.681826] Faulting instruction address: 0xc0431480
[   10.682072] Oops: Kernel access of bad area, sig: 11 [MIPS#1]
[   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
[   10.682387] Modules linked in:
[   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ MIPS#2
[   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
[   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
[   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
[   10.683236] DEAR: 00000040 ESR: 00000000
[   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000
[   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000
[   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000
[   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800
[   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
[   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
[   10.684602] Call Trace:
[   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
[   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   10.686457] [cf82fe50] [c052c46] driver_register+0x88/0x15c
[   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   10.687349] Instruction dump:
[   10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008
[   10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008
[   10.688056] ---[ end trace 13c9ff51d41b9d40 ]---

Fix the problem by setting the disk queue pointer to NULL before calling
put_disk(). A more comprehensive fix might be to rearrange the code
to check the hardware version before initializing data structures,
but I don't know if this would have undesirable side effects, and
it would increase the complexity of backporting the fix to older kernels.

Fixes: 74489a9 ("Add support for Xilinx SystemACE CompactFlash interface")
Acked-by: Michal Simek <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants