Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync up with Linus #46

Merged
merged 52 commits into from
Mar 3, 2015
Merged

Sync up with Linus #46

merged 52 commits into from
Mar 3, 2015

Conversation

dabrace
Copy link
Owner

@dabrace dabrace commented Mar 3, 2015

No description provided.

pmladek and others added 30 commits February 21, 2015 10:33
can_probe() checks if the given address points to the beginning
of an instruction. It analyzes all the instructions from the
beginning of the function until the given address. The code
might be modified by another Kprobe. In this case, the current
code is read into a buffer, int3 breakpoint is replaced by the
saved opcode in the buffer, and can_probe() analyzes the buffer
instead.

There is a bug that __recover_probed_insn() tries to restore
the original code even for Kprobes using the ftrace framework.
But in this case, the opcode is not stored. See the difference
between arch_prepare_kprobe() and arch_prepare_kprobe_ftrace().
The opcode is stored by arch_copy_kprobe() only from
arch_prepare_kprobe().

This patch makes Kprobe to use the ideal 5-byte NOP when the
code can be modified by ftrace. It is the original instruction,
see ftrace_make_nop() and ftrace_nop_replace().

Note that we always need to use the NOP for ftrace locations.
Kprobes do not block ftrace and the instruction might get
modified at anytime. It might even be in an inconsistent state
because it is modified step by step using the int3 breakpoint.

The patch also fixes indentation of the touched comment.

Note that I found this problem when playing with Kprobes. I did
it on x86_64 with gcc-4.8.3 that supported -mfentry. I modified
samples/kprobes/kprobe_example.c and added offset 5 to put
the probe right after the fentry area:

 static struct kprobe kp = {
 	.symbol_name	= "do_fork",
+	.offset = 5,
 };

Then I was able to load kprobe_example before jprobe_example
but not the other way around:

  $> modprobe jprobe_example
  $> modprobe kprobe_example
  modprobe: ERROR: could not insert 'kprobe_example': Invalid or incomplete multibyte or wide character

It did not make much sense and debugging pointed to the bug
described above.

Signed-off-by: Petr Mladek <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Ananth NMavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
…sn()

__recover_probed_insn() should always be called from an address
where an instructions starts. The check for ftrace_location()
might help to discover a potential inconsistency.

This patch adds WARN_ON() when the inconsistency is detected.
Also it adds handling of the situation when the original code
can not get recovered.

Suggested-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Cc: Ananth NMavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Change 'ssociative' to 'associative'

Signed-off-by: Yannick Guerrini <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Bryan O'Donoghue <[email protected]>
Cc: Chris Bainbridge <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Steven Honeyman <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
This was causing the destination instead of the source to be filled.  As
a result, the source was typically all mapped to one zero page, and
hence very cacheable.

Signed-off-by: Bruce Merry <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20150115092022.GA11292@kryton
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
…hip per node

The change:

7b8792b
gpiolib: of: Correct error handling in of_get_named_gpiod_flags

assumed that only one gpio-chip is registred per of-node.
Some drivers register more than one chip per of-node, so
adjust the matching function of_gpiochip_find_and_xlate to
not stop looking for chips if a node-match is found and
the translation fails.

Cc: Stable <[email protected]>
Fixes: 7b8792b ("gpiolib: of: Correct error handling in of_get_named_gpiod_flags")
Signed-off-by: Hans Holmberg <[email protected]>
Acked-by: Alexandre Courbot <[email protected]>
Tested-by: Robert Jarzmik <[email protected]>
Tested-by: Tyler Hall <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
The gpio_chip operations receive a pointer the gpio_chip struct which is
contained in the driver's private struct, yet the container_of call in those
functions point to the mfd struct defined in include/linux/mfd/tps65912.h.

Cc: Stable <[email protected]>
Signed-off-by: Nicolas Saenz Julienne <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
…arch_setup()

Change 'Uknown' to 'Unknown'

Signed-off-by: Yannick Guerrini <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
The KSTK_EIP() and KSTK_ESP() macros should return the user program
counter (PC) and stack pointer (A0StP) of the given task. These are used
to determine which VMA corresponds to the user stack in
/proc/<pid>/maps, and for the user PC & A0StP in /proc/<pid>/stat.

However for Meta the PC & A0StP from the task's kernel context are used,
resulting in broken output. For example in following /proc/<pid>/maps
output, the 3afff000-3b021000 VMA should be described as the stack:

  # cat /proc/self/maps
  ...
  100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
  3afff000-3b021000 rwxp 00000000 00:00 0

And in the following /proc/<pid>/stat output, the PC is in kernel code
(1074234964 = 0x40078654) and the A0StP is in the kernel heap
(1335981392 = 0x4fa17550):

  # cat /proc/self/stat
  51 (cat) R ... 1335981392 1074234964 ...

Fix the definitions of KSTK_EIP() and KSTK_ESP() to use
task_pt_regs(tsk)->ctx rather than (tsk)->thread.kernel_context. This
gets the registers from the user context stored after the thread info at
the base of the kernel stack, which is from the last entry into the
kernel from userland, regardless of where in the kernel the task may
have been interrupted, which results in the following more correct
/proc/<pid>/maps output:

  # cat /proc/self/maps
  ...
  0800b000-08070000 r-xp 00000000 00:02 207        /lib/libuClibc-0.9.34-git.so
  ...
  100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
  3afff000-3b021000 rwxp 00000000 00:00 0          [stack]

And /proc/<pid>/stat now correctly reports the PC in libuClibc
(134320308 = 0x80190b4) and the A0StP in the [stack] region (989864576 =
0x3b002280):

  # cat /proc/self/stat
  51 (cat) R ... 989864576 134320308 ...

Reported-by: Alexey Brodkin <[email protected]>
Reported-by: Vineet Gupta <[email protected]>
Signed-off-by: James Hogan <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v3.9+
Fix following build warning if CONFIG_PM_SLEEP is not set:

drivers/thermal/ti-soc-thermal/ti-bandgap.c:1478:12: warning: 'ti_bandgap_suspend' defined but not used [-Wunused-function]
 static int ti_bandgap_suspend(struct device *dev)
            ^
drivers/thermal/ti-soc-thermal/ti-bandgap.c:1492:12: warning: 'ti_bandgap_resume' defined but not used [-Wunused-function]
 static int ti_bandgap_resume(struct device *dev)
            ^
Acked-by: Nishanth Menon <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
…"cpufreq_cooling_unregister"

The cpufreq_cooling_unregister() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
When CONFIG_THERMAL is not enabled, it is better to introduce
equivalent dummy functions in the exported header than to
introduce #ifdeffery in drivers using the function.

This will prevent issues such as that reported in:
http://www.spinics.net/lists/linux-next/msg31573.html

While at it switch over to IS_ENABLED for thermal macros
to allow for thermal framework to be built as framework
and relevant APIs be usable by relevant drivers as a result.

Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Nishanth Menon <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
As soon as the interrupt has been enabled by devm_request_irq(), the
interrupt routine may be called, depending on the current status of the
hardware.

However, at that point rcar_thermal_common hasn't been initialized
complely yet. E.g. rcar_thermal_common.base is still NULL, causing a
NULL pointer dereference:

    Unable to handle kernel NULL pointer dereference at virtual address 0000000c
    pgd = c0004000
    [0000000c] *pgd=00000000
    Internal error: Oops: 5 [#1] SMP ARM
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc7-ape6evm-04564-gb6e46cb7cbe82389 #30
    Hardware name: Generic R8A73A4 (Flattened Device Tree)
    task: ee8953c0 ti: ee896000 task.ti: ee896000
    PC is at rcar_thermal_irq+0x1c/0xf0
    LR is at _raw_spin_lock_irqsave+0x48/0x54

Postpone the call to devm_request_irq() until all initialization has
been done to fix this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Kuninori Morimoto <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
Swap interrupt disable and thermal zone unregistration in the error and
remove paths, to make them more symmetrical with the initialization
path.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Kuninori Morimoto <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
…ible table

This patch cleanup the code to use oneline for entry of exynos compatible
table.

Cc: Zhang Rui <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
Acked-by: Lukasz Majewski <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* is some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 or >=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek <[email protected]>
Reported-by: Dark Penguin <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Link: http://marc.info/?l=linux-raid&m=135982797322422
Fixes: 9dedf60
Cc: [email protected] (3.6+)
When we have more than 1 drive failure, it's possible we start
rebuild one drive while leaving another faulty drive in array.
To determine whether array will be optimal after building, current
code only check whether a drive is missing, which could potentially
lead to data corruption. This patch is to add checking Faulty flag.

Signed-off-by: NeilBrown <[email protected]>
Since __ATTR_PREALLOC was introduced in v3.19-rc1~78^2~18
it can now be used by md.

This ensure that writing to these sysfs attributes will never
block due to a memory allocation.
Such blocking could become a deadlock if mdmon is trying to
reconfigure an array after a failure prior to re-enabling writes.

Signed-off-by: NeilBrown <[email protected]>
…config

The Kconfig options for the asm9260 timer is wrong as it can be selected by
another platform with allyes config and thus leading to a compilation failure
as some non arch related code is pulled by the compilation.

Fix this by having the platform Kconfig to select the timer as it is done for
the others drivers.

Signed-off-by: Daniel Lezcano <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Acked-by: Oleksij Rempel <[email protected]>

Conflicts:
	drivers/clocksource/Kconfig
We have two race conditions in the probe code which could lead to a null
pointer dereference in the interrupt handler.

The interrupt handler accesses the clockevent device, which may not yet be
registered.

First race condition happens when the interrupt handler gets registered before
the interrupts get disabled. The second race condition happens when the
interrupts get enabled, but the clockevent device is not yet registered.

Fix that by disabling the interrupts before we register the interrupt and enable
the interrupts after the clockevent device got registered.

Reported-by: Gongbae Park <[email protected]>
Signed-off-by: Matthias Brugger <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Lezcano <[email protected]>
As pxa_timer_common_init() is only called in init context, mark it as
such, and quiesce the compiler warnings :
WARNING: vmlinux.o(.text.unlikely+0x45d4): Section mismatch in reference
from the function pxa_timer_common_init() to the function
.init.text:sched_clock_register()

WARNING: vmlinux.o(.text.unlikely+0x4610): Section mismatch in reference
from the function pxa_timer_common_init() to the function
.init.text:clocksource_mmio_init()

Signed-off-by: Robert Jarzmik <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
…iel.lezcano/linux into timers/urgent

Pull clockevents driver fixes from Daniel Lezcano:

  - Fix the Kconfig to prevent the asm9260 timer to be compiled with
    allyesconfig with sparc/sparc64 (Daniel Lezcano)

  - Reorder the mtk driver init sequence in order to prevent a potential race
    when the clock is registered before the irq handler is set (Matthias Brugger)

  - Fix a section mismatch for the pxa driver (Robert Jarzmik)

Signed-off-by: Ingo Molnar <[email protected]>
… check

The man page for pthread_attr_set_affinity_np states that _GNU_SOURCE
must be defined before pthread.h is included in order to get the proper
function declaration.  Define this in the Makefile.

Without this defined, the feature check fails on a Fedora system with
gcc5 and then the perf build later fails with conflicting prototypes for
the function.

Signed-off-by: Josh Boyer <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vineet Gupta <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Feature detection for pthread_attr_setaffinity_np was failing, producing
this error:

  In file included from bench/futex-hash.c:17:0:
  bench/futex.h:73:19: error: conflicting types for ‘pthread_attr_setaffinity_np’
   static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr,
                   ^
  In file included from bench/futex.h:72:0,
                   from bench/futex-hash.c:17:
  /usr/include/pthread.h:407:12: note: previous declaration of ‘pthread_attr_setaffinity_np’ was here
   extern int pthread_attr_setaffinity_np (pthread_attr_t *__attr,
            ^
  make[3]: *** [bench/futex-hash.o] Error 1
  make[2]: *** [bench] Error 2
  make[2]: *** Waiting for unfinished jobs....

  This was because compiling test-pthread-attr-setaffinity-np.c
  failed due to the function arguments:

  test-pthread-attr-setaffinity-np.c: In function ‘main’:
  test-pthread-attr-setaffinity-np.c:11:2: warning: null argument where non-null required (argument 3) [-Wnonnull]
    ret = pthread_attr_setaffinity_np(&thread_attr, 0, NULL);
    ^
  So fix the arguments.

Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Stephane Eranian <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Commit f6edb53 converted the probe to
a CPU wide event first (pid == -1). For kernels that do not support
the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
errno is not handled pid is not reset to 0 and the subsequent use of
pid = -1 as an argument brings in an additional failure path if
perf_event_paranoid > 0:

$ perf record -- sleep 1
perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]

Also, ensure the fd of the confirmation check is closed and comment why
pid = -1 is used.

Needs to go to 3.18 stable tree as well.

Signed-off-by: Adrian Hunter <[email protected]>
Based-on-patch-by: David Ahern <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected]  # v3.18+
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
perf-top is terminating due to SIGBUS on sparc64. git bisect points to:

    commit 8239698
    Author: Arnaldo Carvalho de Melo <[email protected]>
    Date:   Mon Sep 8 13:26:35 2014 -0300

        perf evlist: Refcount mmaps

        We need to know how many fds are using a perf mmap via
        PERF_EVENT_IOC_SET_OUTPUT, so that we can know when to ditch an mmap,
        refcount it.

This commit added 'int refcnt' to struct perf_mmap and the addition makes the
event_copy element no longer 8-byte aligned.

Fix by adding __attribute__((aligned(8))) to the event_copy struct
member.

Signed-off-by: David Ahern <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Switched from 'int pad;' to using __attribute__, David tested/acked that ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4886f2c added an arm-64 check, but the EM_AARCH64 macro is not
defined in older releases (e.g., RHEL6). Define if it is not defined.

Signed-off-by: David Ahern <[email protected]>
Cc: Victor Kamensky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
The recent build changes cause perf to not compile for sparc64 since the
arch/sparc64/Build file does not exist:

/home/dahern/kernels/linux.git/tools/build/Makefile.build:40: arch/sparc64/Build: No such file or directory

Fix by converting the sparc64 RAW_ARCH to sparc ARCH -- similar to what
is done for x86_64.

Signed-off-by: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
If we launch in daemon mode (--daemon), we don't have the ncurses UI,
but we might want to set the target temperature still. For example,
someone might stick the following in their boot script:

  tmon --control intel_powerclamp --target-temp 90 --log --daemon

This would turn on CPU idle injection when we're around 90 degrees
celsius, and would log temperature and throttling info to
/var/tmp/tmon.log.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
We can use the ncurses API to get the number of rows.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
spandruvada and others added 16 commits February 28, 2015 13:58
It is possible that _ART/_TRT tables are missing or have errors.
Ignore those failures, as INT3400 thermal zone is still required
for _OSC or mode switch.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
Commit:

   1e02ce4 ("x86: Store a per-cpu shadow copy of CR4")

added a shadow CR4 such that reads and writes that do not
modify the CR4 execute much faster than always reading the
register itself.

The change modified cpu_init() in common.c, so that the
shadow CR4 gets initialized before anything uses it.

Unfortunately, there's two cpu_init()s in common.c. There's
one for 64-bit and one for 32-bit. The commit only added
the shadow init to the 64-bit path, but the 32-bit path
needs the init too.

Link: http://lkml.kernel.org/r/[email protected] Fixes: 1e02ce4 "x86: Store a per-cpu shadow copy of CR4"
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
The "usual" path is:

 - rt_mutex_slowlock()
 - set_current_state()
 - task_blocks_on_rt_mutex() (ret 0)
 - __rt_mutex_slowlock()
   - sleep or not but do return with __set_current_state(TASK_RUNNING)
 - back to caller.

In the early error case where task_blocks_on_rt_mutex() return
-EDEADLK we never change the task's state back to RUNNING. I
assume this is intended. Without this change after ww_mutex
using rt_mutex the selftest passes but later I get plenty of:

  | bad: scheduling from the idle thread!

backtraces.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Mike Galbraith <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: afffc6c ("locking/rtmutex: Optimize setting task running after being blocked")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
…ux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  - pthread_attr_setaffinity_np() feature detection build fixes (Adrian Hunter, Josh Boyer)

  - Fix probing for PERF_FLAG_FD_CLOEXEC flag (Adrian Hunter)

  - Fix order of arguments to memcpy_alloc_mem in 'perf bench' (Bruce Merry)

  - Sparc64 and Aarch64 build and segfault fixes (David Ahern)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
…cm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
 "An rtmutex deadlock path fixlet"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Set state back to running on error
…linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Two kprobes fixes and a handful of tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Make sparc64 arch point to sparc
  perf symbols: Define EM_AARCH64 for older OSes
  perf top: Fix SIGBUS on sparc64
  perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
  perf tools: Fix pthread_attr_setaffinity_np build error
  perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check
  perf bench: Fix order of arguments to memcpy_alloc_mem
  kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()
  kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace
…m/linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
 "Three clockevents/clocksource driver fixes"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: pxa: Fix section mismatch
  clocksource: mtk: Fix race conditions in probe code
  clockevents: asm9260: Fix compilation error with sparc/sparc64 allyesconfig
…inux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "A CR4-shadow 32-bit init fix, plus two typo fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too
  x86/platform/intel-mid: Fix trivial printk message typo in intel_mid_arch_setup()
  x86/cpu/intel: Fix trivial typo in intel_tlb_table[]
…kernel/git/jhogan/metag

Pull arch/metag fix from James Hogan:
 "This is just a single patch to fix the KSTK_EIP() and KSTK_ESP()
  macros for metag which have always been erronously returning the PC
  and stack pointer of the task's kernel context rather than from its
  user context saved at entry from userland into the kernel, which
  affects the contents of /proc/<pid>/maps and /proc/<pid>/stat"

* tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: Fix KSTK_EIP() and KSTK_ESP() macros
Pull md fixes from Neil Brown:
 "Three md fixes:

   - fix a read-balance problem that was reported 2 years ago, but that
     I never noticed the report :-(

   - fix for rare RAID6 problem causing incorrect bitmap updates when
     two devices fail.

   - add __ATTR_PREALLOC annotation now that it is possible"

* tag 'md/4.0-fixes' of git://neil.brown.name/md:
  md: mark some attributes as pre-alloc
  raid5: check faulty flag for array status during recovery.
  md/raid1: fix read balance when a drive is write-mostly.
…x/kernel/git/evalenti/linux-soc-thermal

Pull thermal management fixes from Eduardo Valentin:
 "Specifics:

   - Several fixes in tmon tool.

   - Fixes in intel int340x for _ART and _TRT tables.

   - Add id for Avoton SoC into powerclamp driver.

   - Fixes in RCAR thermal driver to remove race conditions and fix fail
     path

   - Fixes in TI thermal driver: removal of unnecessary code and build
     fix if !CONFIG_PM_SLEEP

   - Cleanups in exynos thermal driver

   - Add stubs for include/linux/thermal.h.  Now drivers using thermal
     calls but that also work without CONFIG_THERMAL will be able to
     compile for systems that don't care about thermal.

  Note: I am sending this pull on Rui's behalf while he fixes issues in
  his Linux box"

* 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: int340x_thermal: Ignore missing _ART, _TRT tables
  thermal/intel_powerclamp: add id for Avoton SoC
  tools/thermal: tmon: silence 'set but not used' warnings
  tools/thermal: tmon: use pkg-config to determine library dependencies
  tools/thermal: tmon: support cross-compiling
  tools/thermal: tmon: add .gitignore
  tools/thermal: tmon: fixup tui windowing calculations
  tools/thermal: tmon: tui: don't hard-code dialog window size assumptions
  tools/thermal: tmon: add min/max macros
  tools/thermal: tmon: add --target-temp parameter
  thermal: exynos: Clean-up code to use oneline entry for exynos compatible table
  thermal: rcar: Make error and remove paths symmetrical with init
  thermal: rcar: Fix race condition between init and interrupt
  thermal: Introduce dummy functions when thermal is not defined
  ti-soc-thermal: Delete an unnecessary check before the function call "cpufreq_cooling_unregister"
  thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP
…git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Two GPIO fixes:

   - Fix a translation problem in of_get_named_gpiod_flags()

   - Fix a long standing container_of() mistake in the TPS65912 driver"

* tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: tps65912: fix wrong container_of arguments
  gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node
This is a tricky story of the new atomic state handling and the legacy
code fighting over each another. The bug at hand is an underrun of the
framebuffer reference with subsequent hilarity caused by the load
detect code. Which is peculiar since the the exact same code works
fine as the implementation of the legacy setcrtc ioctl.

Let's look at the ingredients:

- Currently our code is a crazy mix of legacy modeset interfaces to
  set the parameters and half-baked atomic state tracking underneath.
  While this transition is going we're using the transitional plane
  helpers to update the atomic side (drm_plane_helper_disable/update
  and friends), i.e. plane->state->fb. Since the state structure owns
  the fb those functions take care of that themselves.

  The legacy state (specifically crtc->primary->fb) is still managed
  by the old code (and mostly by the drm core), with the fb reference
  counting done by callers (core drm for the ioctl or the i915 load
  detect code). The relevant commit is

  commit ea2c67b
  Author: Matt Roper <[email protected]>
  Date:   Tue Dec 23 10:41:52 2014 -0800

      drm/i915: Move to atomic plane helpers (v9)

- drm_plane_helper_disable has special code to handle multiple calls
  in a row - it checks plane->crtc == NULL and bails out. This is to
  match the proper atomic implementation which needs the crtc to get
  at the implied locking context atomic updates always need. See

  commit acf24a3
  Author: Daniel Vetter <[email protected]>
  Date:   Tue Jul 29 15:33:05 2014 +0200

      drm/plane-helper: transitional atomic plane helpers

- The universal plane code split out the implicit primary plane from
  the CRTC into it's own full-blown drm_plane object. As part of that
  the setcrtc ioctl (which updated both the crtc mode and primary
  plane) learned to set crtc->primary->crtc on modeset to make sure
  the plane->crtc assignments statate up to date in

  commit e13161a
  Author: Matt Roper <[email protected]>
  Date:   Tue Apr 1 15:22:38 2014 -0700

      drm: Add drm_crtc_init_with_planes() (v2)

  Unfortunately we've forgotten to update the load detect code. Which
  wasn't a problem since the load detect modeset is temporary and
  always undone before we drop the locks.

- Finally there is a organically grown history (i.e. don't ask) around
  who sets the legacy plane->fb for the various driver entry points.
  Originally updating that was the drivers duty, but for almost all
  places we've moved that (plus updating the refcounts) into the core.
  Again the exception is the load detect code.

Taking all together the following happens:
- The load detect code doesn't set crtc->primary->crtc. This is only
  really an issue on crtcs never before used or when userspace
  explicitly disabled the primary plane.

- The plane helper glue code short-circuits because of that and leaves
  a non-NULL fb behind in plane->state->fb and plane->fb. The state
  fb isn't a real problem (it's properly refcounted on its own), it's
  just the canary.

- Load detect code drops the reference for that fb, but doesn't set
  plane->fb = NULL. This is ok since it's still living in that old
  world where drivers had to clear the pointer but the core/callers
  handled the refcounting.

- On the next modeset the drm core notices plane->fb and takes care of
  refcounting it properly by doing another unref. This drops the
  refcount to zero, leaving state->plane now pointing at freed memory.

- intel_plane_duplicate_state still assume it owns a reference to that
  very state->fb and bad things start to happen.

Fix this all by applying the same duct-tape as for the legacy setcrtc
ioctl code and set crtc->primary->crtc properly.

Cc: Matt Roper <[email protected]>
Cc: Paul Bolle <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Matt Roper <[email protected]>
Reported-and-tested-by: Linus Torvalds <[email protected]>
Reported-by: Paul Bolle <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
dabrace added a commit that referenced this pull request Mar 3, 2015
@dabrace dabrace merged commit 2c7bace into dabrace:master Mar 3, 2015
dabrace pushed a commit that referenced this pull request Mar 17, 2016
The mutex lock at rc_register_device() was added by commit 08aeb7c
("[media] rc: add locking to fix register/show race").

It is meant to avoid race issues when trying to open a sysfs file while
the RC register didn't complete.

Adding a lock there causes troubles, as detected by the Kernel lock
debug instrumentation at the Kernel:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    4.5.0-rc3+ #46 Not tainted
    -------------------------------------------------------
    systemd-udevd/2681 is trying to acquire lock:
     (s_active#171){++++.+}, at: [<ffffffff8171a115>] kernfs_remove_by_name_ns+0x45/0xa0

    but task is already holding lock:
     (&dev->lock){+.+.+.}, at: [<ffffffffa0724def>] rc_register_device+0xb2f/0x1450 [rc_core]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&dev->lock){+.+.+.}:
           [<ffffffff8124817d>] lock_acquire+0x13d/0x320
           [<ffffffff822de966>] mutex_lock_nested+0xb6/0x860
           [<ffffffffa0721f2b>] show_protocols+0x3b/0x3f0 [rc_core]
           [<ffffffff81cdaba5>] dev_attr_show+0x45/0xc0
           [<ffffffff8171f1b3>] sysfs_kf_seq_show+0x203/0x3c0
           [<ffffffff8171a6a1>] kernfs_seq_show+0x121/0x1b0
           [<ffffffff81617c71>] seq_read+0x2f1/0x1160
           [<ffffffff8171c911>] kernfs_fop_read+0x321/0x460
           [<ffffffff815abc20>] __vfs_read+0xe0/0x3d0
           [<ffffffff815ae90e>] vfs_read+0xde/0x2d0
           [<ffffffff815b1d01>] SyS_read+0x111/0x230
           [<ffffffff822e8636>] entry_SYSCALL_64_fastpath+0x16/0x76

    -> #0 (s_active#171){++++.+}:
           [<ffffffff81244f24>] __lock_acquire+0x4304/0x5990
           [<ffffffff8124817d>] lock_acquire+0x13d/0x320
           [<ffffffff81717d3a>] __kernfs_remove+0x58a/0x810
           [<ffffffff8171a115>] kernfs_remove_by_name_ns+0x45/0xa0
           [<ffffffff81721592>] remove_files.isra.0+0x72/0x190
           [<ffffffff8172174b>] sysfs_remove_group+0x9b/0x150
           [<ffffffff81721854>] sysfs_remove_groups+0x54/0xa0
           [<ffffffff81cd97d0>] device_remove_attrs+0xb0/0x140
           [<ffffffff81cdb27c>] device_del+0x38c/0x6b0
           [<ffffffffa0724b8b>] rc_register_device+0x8cb/0x1450 [rc_core]
           [<ffffffffa1326a7b>] dvb_usb_remote_init+0x66b/0x14d0 [dvb_usb]
           [<ffffffffa1321c81>] dvb_usb_device_init+0xf21/0x1860 [dvb_usb]
           [<ffffffffa13517dc>] dib0700_probe+0x14c/0x410 [dvb_usb_dib0700]
           [<ffffffff81dbb1dd>] usb_probe_interface+0x45d/0x940
           [<ffffffff81ce7e7a>] driver_probe_device+0x21a/0xc30
           [<ffffffff81ce89b1>] __driver_attach+0x121/0x160
           [<ffffffff81ce21bf>] bus_for_each_dev+0x11f/0x1a0
           [<ffffffff81ce6cdd>] driver_attach+0x3d/0x50
           [<ffffffff81ce5df9>] bus_add_driver+0x4c9/0x770
           [<ffffffff81cea39c>] driver_register+0x18c/0x3b0
           [<ffffffff81db6e98>] usb_register_driver+0x1f8/0x440
           [<ffffffffa074001e>] dib0700_driver_init+0x1e/0x1000 [dvb_usb_dib0700]
           [<ffffffff810021b1>] do_one_initcall+0x141/0x300
           [<ffffffff8144d8eb>] do_init_module+0x1d0/0x5ad
           [<ffffffff812f27b6>] load_module+0x6666/0x9ba0
           [<ffffffff812f5fe8>] SyS_finit_module+0x108/0x130
           [<ffffffff822e8636>] entry_SYSCALL_64_fastpath+0x16/0x76

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&dev->lock);
                                   lock(s_active#171);
                                   lock(&dev->lock);
      lock(s_active#171);

     *** DEADLOCK ***

    3 locks held by systemd-udevd/2681:
     #0:  (&dev->mutex){......}, at: [<ffffffff81ce8933>] __driver_attach+0xa3/0x160
     #1:  (&dev->mutex){......}, at: [<ffffffff81ce8941>] __driver_attach+0xb1/0x160
     #2:  (&dev->lock){+.+.+.}, at: [<ffffffffa0724def>] rc_register_device+0xb2f/0x1450 [rc_core]

In this specific case, some error happened during device init,
causing IR to be disabled.

Let's fix it by adding a var that will tell when the device is
initialized. Any calls before that will return a -EINVAL.

That should prevent the race issues.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
dabrace pushed a commit that referenced this pull request Nov 28, 2016
This patch fixes the lockdep warning below

DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at linux-next/kernel/locking/lockdep.c:2876 lockdep_trace_alloc+0xe0/0xf0
 Modules linked in:

 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.8.0-11756-g86c5152 #46
...
 Call trace:
 Exception stack(0xffff8007da837890 to 0xffff8007da8379c0)
 7880:                                   ffff8007da834000 0001000000000000
 78a0: ffff8007da837a70 ffff0000081111a0 00000000600000c5 000000000000003d
 78c0: 9374bc6a7f3c7832 0000000000381878 ffff000009db7ab8 000000000000002f
 78e0: ffff00000811aabc ffff000008be2548 ffff8007da837990 ffff00000811adf8
 7900: ffff8007da834000 00000000024080c0 00000000000000c0 ffff000009021000
 7920: 0000000000000000 0000000000000000 ffff000008c8f7c8 ffff8007da579810
 7940: 000000000000002f ffff8007da858000 0000000000000000 0000000000000001
 7960: 0000000000000001 0000000000000000 ffff00000811a468 0000000000000002
 7980: 656c62617369645f 0000000000038187 00000000000000ee ffff8007da837850
 79a0: ffff000009db50c0 ffff000009db569d 0000000000000006 ffff000089db568f
 [<ffff0000081111a0>] lockdep_trace_alloc+0xe0/0xf0
 [<ffff0000081f4950>] __kmalloc_track_caller+0x50/0x250
 [<ffff00000857c088>] devres_alloc_node+0x28/0x60
 [<ffff0000081220e0>] devm_request_threaded_irq+0x50/0xe0
 [<ffff0000087e6220>] pcc_mbox_request_channel+0x110/0x170
 [<ffff0000084b2660>] acpi_cppc_processor_probe+0x264/0x414
 [<ffff0000084ae9f4>] __acpi_processor_start+0x28/0xa0
 [<ffff0000084aeab0>] acpi_processor_start+0x44/0x54
 [<ffff00000857897c>] driver_probe_device+0x1fc/0x2b0
 [<ffff000008578ae4>] __driver_attach+0xb4/0xc0
 [<ffff00000857683c>] bus_for_each_dev+0x5c/0xa0
 [<ffff000008578110>] driver_attach+0x20/0x30
 [<ffff000008577c20>] bus_add_driver+0x110/0x230
 [<ffff000008579320>] driver_register+0x60/0x100
 [<ffff000008d478b8>] acpi_processor_driver_init+0x2c/0xb0
 [<ffff000008083168>] do_one_initcall+0x38/0x130
 [<ffff000008d20d6c>] kernel_init_freeable+0x210/0x2b4
 [<ffff000008945d90>] kernel_init+0x10/0x110
 [<ffff000008082e80>] ret_from_fork+0x10/0x50

It's because the spinlock inside pcc_mbox_request_channel() is
kept too long. This patch releases spinlock before request_irq()
and free_irq() to fix this issue  as spinlock is only needed to
protect the channel data.

Signed-off-by: Hoan Tran <[email protected]>
Reviewed-by: Prashanth Prakash <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
dabrace pushed a commit that referenced this pull request Nov 17, 2017
When running the crash tool on a s390 live system we get a kernel panic
for reading memory within the kernel image:

 # uname -a
   Linux r3545011 4.14.0-rc8-00066-g1c9dbd4615fd #45 SMP PREEMPT Fri Nov 10 16:16:22 CET 2017 s390x s390x s390x GNU/Linux
 # crash /boot/vmlinux-devel /dev/mem
 # crash> rd 0x100000

 usercopy: kernel memory exposure attempt detected from 0000000000100000 (<kernel text>) (8 bytes)
 ------------[ cut here ]------------
 kernel BUG at mm/usercopy.c:72!
 illegal operation: 0001 ilc:1 [#1] PREEMPT SMP.
 Modules linked in:
 CPU: 0 PID: 1461 Comm: crash Not tainted 4.14.0-rc8-00066-g1c9dbd4615fd-dirty #46
 Hardware name: IBM 2827 H66 706 (z/VM 6.3.0)
 task: 000000001ad10100 task.stack: 000000001df78000
 Krnl PSW : 0704d00180000000 000000000038165c (__check_object_size+0x164/0x1d0)
            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
 Krnl GPRS: 0000000012440e1d 0000000080000000 0000000000000061 00000000001cabc0
            00000000001cc6d6 0000000000000000 0000000000cc4ed2 0000000000001000
            000003ffc22fdd20 0000000000000008 0000000000100008 0000000000000001
            0000000000000008 0000000000100000 0000000000381658 000000001df7bc90
 Krnl Code: 000000000038164c: c020004a1c4a        larl    %r2,cc4ee0
            0000000000381652: c0e5fff2581b        brasl   %r14,1cc688
           #0000000000381658: a7f40001            brc     15,38165a
           >000000000038165c: eb42000c000c        srlg    %r4,%r2,12
            0000000000381662: eb32001c000c        srlg    %r3,%r2,28
            0000000000381668: c0110003ffff        lgfi    %r1,262143
            000000000038166e: ec31ff752065        clgrj   %r3,%r1,2,381558
            0000000000381674: a7f4ff67            brc     15,381542
 Call Trace:
 ([<0000000000381658>] __check_object_size+0x160/0x1d0)
  [<000000000082263a>] read_mem+0xaa/0x130.
  [<0000000000386182>] __vfs_read+0x42/0x168.
  [<000000000038632e>] vfs_read+0x86/0x140.
  [<0000000000386a26>] SyS_read+0x66/0xc0.
  [<0000000000ace6a4>] system_call+0xc4/0x2b0.
 INFO: lockdep is turned off.
 Last Breaking-Event-Address:
  [<0000000000381658>] __check_object_size+0x160/0x1d0

 Kernel panic - not syncing: Fatal exception: panic_on_oops

With CONFIG_HARDENED_USERCOPY copy_to_user() checks in __check_object_size()
if the source address is within the kernel image. When the crash tool reads
from 0x100000, this check leads to the kernel BUG().

So disable the kernel config option until this bug is fixed.

Corresponding bug report on LKML: https://lkml.org/lkml/2017/11/10/341

Signed-off-by: Michael Holzheu <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
dabrace pushed a commit that referenced this pull request May 21, 2018
syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 #1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 #32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jiri Benc <[email protected]>
Reported-by: syzbot <[email protected]>
Acked-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
dabrace pushed a commit that referenced this pull request Aug 22, 2018
If a device gets removed right after having registered a power_supply node,
we might enter in a deadlock between the remove call (that has a lock on
the parent device) and the deferred register work.

Allow the deferred register work to exit without taking the lock when
we are in the remove state.

Stack trace on a Ubuntu 16.04:

[16072.109121] INFO: task kworker/u16:2:1180 blocked for more than 120 seconds.
[16072.109127]       Not tainted 4.13.0-41-generic #46~16.04.1-Ubuntu
[16072.109129] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[16072.109132] kworker/u16:2   D    0  1180      2 0x80000000
[16072.109142] Workqueue: events_power_efficient power_supply_deferred_register_work
[16072.109144] Call Trace:
[16072.109152]  __schedule+0x3d6/0x8b0
[16072.109155]  schedule+0x36/0x80
[16072.109158]  schedule_preempt_disabled+0xe/0x10
[16072.109161]  __mutex_lock.isra.2+0x2ab/0x4e0
[16072.109166]  __mutex_lock_slowpath+0x13/0x20
[16072.109168]  ? __mutex_lock_slowpath+0x13/0x20
[16072.109171]  mutex_lock+0x2f/0x40
[16072.109174]  power_supply_deferred_register_work+0x2b/0x50
[16072.109179]  process_one_work+0x15b/0x410
[16072.109182]  worker_thread+0x4b/0x460
[16072.109186]  kthread+0x10c/0x140
[16072.109189]  ? process_one_work+0x410/0x410
[16072.109191]  ? kthread_create_on_node+0x70/0x70
[16072.109194]  ret_from_fork+0x35/0x40
[16072.109199] INFO: task test:2257 blocked for more than 120 seconds.
[16072.109202]       Not tainted 4.13.0-41-generic #46~16.04.1-Ubuntu
[16072.109204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[16072.109206] test            D    0  2257   2256 0x00000004
[16072.109208] Call Trace:
[16072.109211]  __schedule+0x3d6/0x8b0
[16072.109215]  schedule+0x36/0x80
[16072.109218]  schedule_timeout+0x1f3/0x360
[16072.109221]  ? check_preempt_curr+0x5a/0xa0
[16072.109224]  ? ttwu_do_wakeup+0x1e/0x150
[16072.109227]  wait_for_completion+0xb4/0x140
[16072.109230]  ? wait_for_completion+0xb4/0x140
[16072.109233]  ? wake_up_q+0x70/0x70
[16072.109236]  flush_work+0x129/0x1e0
[16072.109240]  ? worker_detach_from_pool+0xb0/0xb0
[16072.109243]  __cancel_work_timer+0x10f/0x190
[16072.109247]  ? device_del+0x264/0x310
[16072.109250]  ? __wake_up+0x44/0x50
[16072.109253]  cancel_delayed_work_sync+0x13/0x20
[16072.109257]  power_supply_unregister+0x37/0xb0
[16072.109260]  devm_power_supply_release+0x11/0x20
[16072.109263]  release_nodes+0x110/0x200
[16072.109266]  devres_release_group+0x7c/0xb0
[16072.109274]  wacom_remove+0xc2/0x110 [wacom]
[16072.109279]  hid_device_remove+0x6e/0xd0 [hid]
[16072.109284]  device_release_driver_internal+0x158/0x210
[16072.109288]  device_release_driver+0x12/0x20
[16072.109291]  bus_remove_device+0xec/0x160
[16072.109293]  device_del+0x1de/0x310
[16072.109298]  hid_destroy_device+0x27/0x60 [hid]
[16072.109303]  usbhid_disconnect+0x51/0x70 [usbhid]
[16072.109308]  usb_unbind_interface+0x77/0x270
[16072.109311]  device_release_driver_internal+0x158/0x210
[16072.109315]  device_release_driver+0x12/0x20
[16072.109318]  usb_driver_release_interface+0x77/0x80
[16072.109321]  proc_ioctl+0x20f/0x250
[16072.109325]  usbdev_do_ioctl+0x57f/0x1140
[16072.109327]  ? __wake_up+0x44/0x50
[16072.109331]  usbdev_ioctl+0xe/0x20
[16072.109336]  do_vfs_ioctl+0xa4/0x600
[16072.109339]  ? vfs_write+0x15a/0x1b0
[16072.109343]  SyS_ioctl+0x79/0x90
[16072.109347]  entry_SYSCALL_64_fastpath+0x24/0xab
[16072.109349] RIP: 0033:0x7f20da807f47
[16072.109351] RSP: 002b:00007ffc422ae398 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[16072.109353] RAX: ffffffffffffffda RBX: 00000000010b8560 RCX: 00007f20da807f47
[16072.109355] RDX: 00007ffc422ae3a0 RSI: 00000000c0105512 RDI: 0000000000000009
[16072.109356] RBP: 0000000000000000 R08: 00007ffc422ae3e0 R09: 0000000000000010
[16072.109357] R10: 00000000000000a6 R11: 0000000000000246 R12: 0000000000000000
[16072.109359] R13: 00000000010b8560 R14: 00007ffc422ae2e0 R15: 0000000000000000

Reported-and-tested-by: Richard Hughes <[email protected]>
Tested-by: Aaron Skomra <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Fixes: 7f1a57f ("power_supply: Fix possible NULL pointer dereference on early uevent")
Signed-off-by: Sebastian Reichel <[email protected]>
dabrace pushed a commit that referenced this pull request Oct 4, 2018
Syzkaller reported this on a slightly older kernel but it's still
applicable to the current kernel -

======================================================
WARNING: possible circular locking dependency detected
4.18.0-next-20180823+ #46 Not tainted
------------------------------------------------------
syz-executor4/26841 is trying to acquire lock:
00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652

but task is already holding lock:
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (rtnl_mutex){+.+.}:
       __mutex_lock_common kernel/locking/mutex.c:925 [inline]
       __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
       mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
       rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
       bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline]
       bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320
       process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153
       worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
       kthread+0x35a/0x420 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}:
       process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129
       worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
       kthread+0x35a/0x420 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #0 ((wq_completion)bond_dev->name){+.+.}:
       lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
       flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
       drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
       destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
       __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
       bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
       register_netdevice+0x337/0x1100 net/core/dev.c:8410
       bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
       rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
       rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
       netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
       __sys_sendmsg+0x11d/0x290 net/socket.c:2153
       __do_sys_sendmsg net/socket.c:2162 [inline]
       __se_sys_sendmsg net/socket.c:2160 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock((work_completion)(&(&nnw->work)->work));
                               lock(rtnl_mutex);
  lock((wq_completion)bond_dev->name);

 *** DEADLOCK ***

1 lock held by syz-executor4/26841:

stack backtrace:
CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222
 check_prev_add kernel/locking/lockdep.c:1862 [inline]
 check_prevs_add kernel/locking/lockdep.c:1975 [inline]
 validate_chain kernel/locking/lockdep.c:2416 [inline]
 __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412
 lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
 register_netdevice+0x337/0x1100 net/core/dev.c:8410
 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
 __sys_sendmsg+0x11d/0x290 net/socket.c:2153
 __do_sys_sendmsg net/socket.c:2162 [inline]
 __se_sys_sendmsg net/socket.c:2160 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457089
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001

Signed-off-by: Mahesh Bandewar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
dabrace pushed a commit that referenced this pull request Nov 12, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
dabrace pushed a commit that referenced this pull request Feb 18, 2019
Otherwise we introduce a race condition where userspace can request input
before we're ready leading to null pointer dereference such as

input: bma150 as /devices/platform/i2c-gpio-2/i2c-5/5-0038/input/input3
Unable to handle kernel NULL pointer dereference at virtual address 00000018
pgd = (ptrval)
[00000018] *pgd=55dac831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in: bma150 input_polldev [last unloaded: bma150]
CPU: 0 PID: 2870 Comm: accelerometer Not tainted 5.0.0-rc3-dirty #46
Hardware name: Samsung S5PC110/S5PV210-based board
PC is at input_event+0x8/0x60
LR is at bma150_report_xyz+0x9c/0xe0 [bma150]
pc : [<80450f70>]    lr : [<7f0a614c>]    psr: 800d0013
sp : a4c1fd78  ip : 00000081  fp : 00020000
r10: 00000000  r9 : a5e2944c  r8 : a7455000
r7 : 00000016  r6 : 00000101  r5 : a7617940  r4 : 80909048
r3 : fffffff2  r2 : 00000000  r1 : 00000003  r0 : 00000000
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 54e34019  DAC: 00000051
Process accelerometer (pid: 2870, stack limit = 0x(ptrval))
Stackck: (0xa4c1fd78 to 0xa4c20000)
fd60:                                                       fffffff3 fc813f6c
fd80: 40410581 d7530ce3 a5e2817c a7617f00 a5e29404 a5e2817c 00000000 7f008324
fda0: a5e28000 8044f59c a5fdd9d0 a5e2945c a46a4a00 a5e29668 a7455000 80454f10
fdc0: 80909048 a5e29668 a5fdd9d0 a46a4a00 806316d0 00000000 a46a4a00 801df5f0
fde0: 00000000 d7530ce3 a4c1fec0 a46a4a00 00000000 a5fdd9d0 a46a4a08 801df53c
fe00: 00000000 801d74bc a4c1fec0 00000000 a4c1ff70 00000000 a7038da8 00000000
fe20: a46a4a00 801e91fc a411bbe0 801f2e88 00000004 00000000 80909048 00000041
fe40: 00000000 00020000 00000000 dead4ead a6a88da0 00000000 ffffe000 806fcae8
fe60: a4c1fec8 00000000 80909048 00000002 a5fdd9d0 a7660110 a411bab0 00000001
fe80: dead4ead ffffffff ffffffff a4c1fe8c a4c1fe8c d7530ce3 20000013 80909048
fea0: 80909048 a4c1ff70 00000001 fffff000 a4c1e000 00000005 00026038 801eabd8
fec0: a7660110 a411bab0 b9394901 00000006 a696201b 76fb3000 00000000 a7039720
fee0: a5fdd9d0 00000101 00000002 00000096 00000000 00000000 00000000 a4c1ff00
ff00: a6b310f4 805cb174 a6b310f4 00000010 00000fe0 00000010 a4c1e000 d7530ce3
ff20: 00000003 a5f41400 a5f41424 00000000 a6962000 00000000 00000003 00000002
ff40: ffffff9c 000a0000 80909048 d7530ce3 a6962000 00000003 80909048 ffffff9c
ff60: a6962000 801d890c 00000000 00000000 00020000 a7590000 00000004 00000100
ff80: 00000001 d7530ce3 000288b8 00026320 000288b8 00000005 80101204 a4c1e000
ffa0: 00000005 80101000 000288b8 00026320 000288b8 000a0000 00000000 00000000
ffc0: 000288b8 00026320 000288b8 00000005 7eef3bac 000264e8 00028ad8 00026038
ffe0: 00000005 7eef3300 76f76e91 76f78546 800d0030 000288b8 00000000 00000000
[<80450f70>] (input_event) from [<a5e2817c>] (0xa5e2817c)
Code: e1a08148 eaffffa8 e351001f 812fff1e (e590c018)
---[ end trace 1c691ee85f2ff243 ]---

Signed-off-by: Jonathan Bakker <[email protected]>
Signed-off-by: Paweł Chmiel <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
dabrace pushed a commit that referenced this pull request Jun 21, 2019
After commit d4fc506 ("mm: switch s_mem and slab_cache in struct page")
page->mapping will be re-used by slab allocations and page->mapping->host
will be used in balance_dirty_pages_ratelimited() as an inode member but
it's not an inode in fact and leads an oops.

[  159.906493] Unable to handle kernel paging request at virtual address ffff200012d90be8
[  159.908029] Mem abort info:
[  159.908552]   ESR = 0x96000007
[  159.909138]   Exception class = DABT (current EL), IL = 32 bits
[  159.910155]   SET = 0, FnV = 0
[  159.910690]   EA = 0, S1PTW = 0
[  159.911241] Data abort info:
[  159.911846]   ISV = 0, ISS = 0x00000007
[  159.912567]   CM = 0, WnR = 0
[  159.913105] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000042acd000
[  159.914269] [ffff200012d90be8] pgd=000000043ffff003, pud=000000043fffe003, pmd=000000043fffa003, pte=0000000000000000
[  159.916280] Internal error: Oops: 96000007 [#1] SMP
[  159.917195] Dumping ftrace buffer:
[  159.917845]    (ftrace buffer empty)
[  159.918521] Modules linked in: uio_dev(OE)
[  159.919276] CPU: 1 PID: 295 Comm: uio_test Tainted: G           OE     5.2.0-rc4+ #46
[  159.920859] Hardware name: linux,dummy-virt (DT)
[  159.921815] pstate: 60000005 (nZCv daif -PAN -UAO)
[  159.922809] pc : balance_dirty_pages_ratelimited+0x68/0xc38
[  159.923965] lr : fault_dirty_shared_page.isra.8+0xe4/0x100
[  159.925134] sp : ffff800368a77ae0
[  159.925824] x29: ffff800368a77ae0 x28: 1ffff0006d14ce1a
[  159.926906] x27: ffff800368a670d0 x26: ffff800368a67120
[  159.927985] x25: 1ffff0006d10f5fe x24: ffff200012d90be8
[  159.929089] x23: ffff200013732000 x22: ffff80036ec03200
[  159.930172] x21: ffff200012d90bc0 x20: 1fffe400025b217d
[  159.931253] x19: ffff80036ec03200 x18: 0000000000000000
[  159.932348] x17: 0000000000000000 x16: 0ffffe0000010208
[  159.933439] x15: 0000000000000000 x14: 0000000000000000
[  159.934518] x13: 0000000000000000 x12: 0000000000000000
[  159.935596] x11: 1fffefc001b452c0 x10: ffff0fc001b452c0
[  159.936697] x9 : dfff200000000000 x8 : dfff200000000001
[  159.937781] x7 : ffff7e000da29607 x6 : ffff0fc001b452c1
[  159.938859] x5 : ffff0fc001b452c1 x4 : ffff0fc001b452c1
[  159.939944] x3 : ffff200010523ad4 x2 : 1fffe400026e659b
[  159.941065] x1 : dfff200000000000 x0 : ffff200013732cd8
[  159.942205] Call trace:
[  159.942732]  balance_dirty_pages_ratelimited+0x68/0xc38
[  159.943797]  fault_dirty_shared_page.isra.8+0xe4/0x100
[  159.944867]  do_fault+0x608/0x1250
[  159.945571]  __handle_mm_fault+0x93c/0xfb8
[  159.946412]  handle_mm_fault+0x1c0/0x360
[  159.947224]  do_page_fault+0x358/0x8d0
[  159.947993]  do_translation_fault+0xf8/0x124
[  159.948884]  do_mem_abort+0x70/0x190
[  159.949624]  el0_da+0x24/0x28

According another commit 5e901d0 ("scsi: qedi: Fix bad pte call trace
when iscsiuio is stopped."), using kmalloc also cause other problem.

But the documentation about UIO_MEM_LOGICAL allows using kmalloc(), remove
and don't allow using kmalloc() in documentation.

Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
dabrace pushed a commit that referenced this pull request Sep 18, 2020
The crash log as the below:

[Thu Aug 20 23:18:14 2020] general protection fault: 0000 [#1] SMP NOPTI
[Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G           OE     5.4.0-42-generic #46~18.04.1-Ubuntu
[Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ52-G40-00, BIOS R12 05/13/2020
[Thu Aug 20 23:18:14 2020] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[Thu Aug 20 23:18:14 2020] RIP: 0010:evict_process_queues_cpsch+0xc9/0x130 [amdgpu]
[Thu Aug 20 23:18:14 2020] Code: 49 8d 4d 10 48 39 c8 75 21 eb 44 83 fa 03 74 36 80 78 72 00 74 0c 83 ab 68 01 00 00 01 41 c6 45 41 00 48 8b 00 48 39 c8 74 25 <80> 78 70 00 c6 40 6d 01 74 ee 8b 50 28 c6 40 70 00 83 ab 60 01 00
[Thu Aug 20 23:18:14 2020] RSP: 0018:ffffb29b52f6fc90 EFLAGS: 00010213
[Thu Aug 20 23:18:14 2020] RAX: 1c884edb0a118914 RBX: ffff8a0d45ff3c00 RCX: ffff8a2d83e41038
[Thu Aug 20 23:18:14 2020] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8a0e2e4178c0
[Thu Aug 20 23:18:14 2020] RBP: ffffb29b52f6fcb0 R08: 0000000000001b64 R09: 0000000000000004
[Thu Aug 20 23:18:14 2020] R10: ffffb29b52f6fb78 R11: 0000000000000001 R12: ffff8a0d45ff3d28
[Thu Aug 20 23:18:14 2020] R13: ffff8a2d83e41028 R14: 0000000000000000 R15: 0000000000000000
[Thu Aug 20 23:18:14 2020] FS:  0000000000000000(0000) GS:ffff8a0e2e400000(0000) knlGS:0000000000000000
[Thu Aug 20 23:18:14 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Aug 20 23:18:14 2020] CR2: 000055c783c0e6a8 CR3: 00000034a1284000 CR4: 0000000000340ee0
[Thu Aug 20 23:18:14 2020] Call Trace:
[Thu Aug 20 23:18:14 2020]  kfd_process_evict_queues+0x43/0xd0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kfd_suspend_all_processes+0x60/0xf0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kgd2kfd_suspend.part.7+0x43/0x50 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kgd2kfd_pre_reset+0x46/0x60 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_amdkfd_pre_reset+0x1a/0x20 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_device_gpu_recover+0x377/0xf90 [amdgpu]
[Thu Aug 20 23:18:14 2020]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[Thu Aug 20 23:18:14 2020]  process_one_work+0x20f/0x400
[Thu Aug 20 23:18:14 2020]  worker_thread+0x34/0x410

When GPU hang, user process will fail to create a compute queue whose
struct object will be freed later, but driver wrongly add this queue to
queue list of the proccess. And then kfd_process_evict_queues will
access a freed memory, which cause a system crash.

v2:
The failure to execute_queues should probably not be reported to
the caller of create_queue, because the queue was already created.
Therefore change to ignore the return value from execute_queues.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
dabrace pushed a commit that referenced this pull request Nov 23, 2020
When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific))
in net/core/dev.c due to still having the NC-SI packet handler
registered.

 # echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind
  ------------[ cut here ]------------
  kernel BUG at net/core/dev.c:10254!
  Internal error: Oops - BUG: 0 [#1] SMP ARM
  CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46
  Hardware name: Generic DT based system
  PC is at netdev_run_todo+0x314/0x394
  LR is at cpumask_next+0x20/0x24
  pc : [<806f5830>]    lr : [<80863cb0>]    psr: 80000153
  sp : 855bbd58  ip : 00000001  fp : 855bbdac
  r10: 80c03d00  r9 : 80c06228  r8 : 81158c54
  r7 : 00000000  r6 : 80c05dec  r5 : 80c05d18  r4 : 813b9280
  r3 : 813b9054  r2 : 8122c470  r1 : 00000002  r0 : 00000002
  Flags: Nzcv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
  Control: 00c5387d  Table: 85514008  DAC: 00000051
  Process sh (pid: 115, stack limit = 0x7cb5703d)
 ...
  Backtrace:
  [<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c)
   r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410
   r4:813b9000
  [<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30)
  [<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8)
   r5:8115b410 r4:813b9000
  [<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c)

Fixes: bd466c3 ("net/faraday: Support NCSI mode")
Signed-off-by: Joel Stanley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
dabrace pushed a commit that referenced this pull request May 6, 2021
Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Acked-By: Anton Ivanov <[email protected]>
Tested-By: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.