Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update README #101

Closed
wants to merge 1 commit into from
Closed

Update README #101

wants to merge 1 commit into from

Conversation

raintean
Copy link

No description provided.

@raintean raintean closed this Jun 27, 2014
@younishd
Copy link

test

@ComradePhilos
Copy link

dude, it worked! :D

@raintean
Copy link
Author

@ComradePhilos :)

hubcapsc pushed a commit to hubcapsc/linux that referenced this pull request Jul 2, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   torvalds#6   torvalds#7
[    0.603005] .... node   #1, CPUs:     torvalds#8   torvalds#9  torvalds#10  torvalds#11  torvalds#12  torvalds#13  torvalds#14  torvalds#15
[    1.200005] .... node   #2, CPUs:    torvalds#16  torvalds#17  torvalds#18  torvalds#19  torvalds#20  torvalds#21  torvalds#22  torvalds#23
[    1.796005] .... node   #3, CPUs:    torvalds#24  torvalds#25  torvalds#26  torvalds#27  torvalds#28  torvalds#29  torvalds#30  torvalds#31
[    2.393005] .... node   #4, CPUs:    torvalds#32  torvalds#33  torvalds#34  torvalds#35  torvalds#36  torvalds#37  torvalds#38  torvalds#39
[    2.996005] .... node   #5, CPUs:    torvalds#40  torvalds#41  torvalds#42  torvalds#43  torvalds#44  torvalds#45  torvalds#46  torvalds#47
[    3.600005] .... node   torvalds#6, CPUs:    torvalds#48  torvalds#49  torvalds#50  torvalds#51  #52  #53  torvalds#54  torvalds#55
[    4.202005] .... node   torvalds#7, CPUs:    torvalds#56  torvalds#57  #58  torvalds#59  torvalds#60  torvalds#61  torvalds#62  torvalds#63
[    4.811005] .... node   torvalds#8, CPUs:    torvalds#64  torvalds#65  torvalds#66  torvalds#67  torvalds#68  torvalds#69  #70  torvalds#71
[    5.421006] .... node   torvalds#9, CPUs:    torvalds#72  torvalds#73  torvalds#74  torvalds#75  torvalds#76  torvalds#77  torvalds#78  torvalds#79
[    6.032005] .... node  torvalds#10, CPUs:    torvalds#80  torvalds#81  torvalds#82  torvalds#83  torvalds#84  torvalds#85  torvalds#86  torvalds#87
[    6.648006] .... node  torvalds#11, CPUs:    torvalds#88  torvalds#89  torvalds#90  torvalds#91  torvalds#92  torvalds#93  torvalds#94  torvalds#95
[    7.262005] .... node  torvalds#12, CPUs:    torvalds#96  torvalds#97  torvalds#98  torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103
[    7.865005] .... node  torvalds#13, CPUs:   torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111
[    8.466005] .... node  torvalds#14, CPUs:   torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119
[    9.073006] .... node  torvalds#15, CPUs:   torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
rogerq pushed a commit to rogerq/linux that referenced this pull request Oct 2, 2014
This reverts commit 3617904.

IRQ_CROSSBAR_96 maps to irq torvalds#101 and not torvalds#96, which is mmc4's irq.

Change-Id: I9aecf8e08abeda5917817aea38634edd0f84b721
Signed-off-by: Ido Yariv <[email protected]>
Signed-off-by: bvijay <[email protected]>
pstglia pushed a commit to pstglia/linux that referenced this pull request Oct 6, 2014
nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from
net_gre->keymap_list and frees the object. But it doesn't clean
a reference on this object from ct_pptp_info->keymap[dir].
Then nf_ct_gre_keymap_destroy() may release the same object again.

So nf_ct_gre_keymap_flush() can be called only when we are sure that
when nf_ct_gre_keymap_destroy will not be called.

nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way
to destroy it is to call nf_ct_gre_keymap_destroy().

This patch marks nf_ct_gre_keymap_flush() as static, so this patch can
break compilation of third party modules, which use
nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate
this function.

[  226.540793] general protection fault: 0000 [#1] SMP
[  226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre
nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre
ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr
serio_raw virtio_console virtio_balloon floppy parport_pc parport
pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci
virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel]
[  226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ torvalds#101
[  226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  226.541776] Workqueue: netns cleanup_net
[  226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000
[  226.541776] RIP: 0010:[<ffffffff81389ba9>]  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776] RSP: 0018:ffff88003730dbd0  EFLAGS: 00010a83
[  226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200
[  226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40
[  226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000
[  226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002
[  226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018
[  226.541776] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  226.541776] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0
[  226.541776] Stack:
[  226.541776]  ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28
[  226.541776]  ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000
[  226.541776]  ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40
[  226.541776] Call Trace:
[  226.541776]  [<ffffffff81389c5d>] list_del+0xd/0x30
[  226.541776]  [<ffffffffa0162a43>] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa01627c5>] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0162ab7>] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0117de3>] destroy_conntrack+0x83/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d87>] ? destroy_conntrack+0x27/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d60>] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack]
[  226.541776]  [<ffffffff81630142>] nf_conntrack_destroy+0x72/0x180
[  226.541776]  [<ffffffff816300d5>] ? nf_conntrack_destroy+0x5/0x180
[  226.541776]  [<ffffffffa011ef80>] ? kill_l3proto+0x20/0x20 [nf_conntrack]
[  226.541776]  [<ffffffffa011847e>] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack]
[  226.541776]  [<ffffffffa011f74b>] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack]
[  226.541776]  [<ffffffffa0162409>] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffff815edf89>] ops_exit_list.isra.1+0x39/0x60
[  226.541776]  [<ffffffff815eecc0>] cleanup_net+0x100/0x1d0
[  226.541776]  [<ffffffff810a608a>] process_one_work+0x1ea/0x4f0
[  226.541776]  [<ffffffff810a6028>] ? process_one_work+0x188/0x4f0
[  226.541776]  [<ffffffff810a64ab>] worker_thread+0x11b/0x3a0
[  226.541776]  [<ffffffff810a6390>] ? process_one_work+0x4f0/0x4f0
[  226.541776]  [<ffffffff810af42d>] kthread+0xed/0x110
[  226.541776]  [<ffffffff8173d4dc>] ? _raw_spin_unlock_irq+0x2c/0x40
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776]  [<ffffffff8174747c>] ret_from_fork+0x7c/0xb0
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
42 08
[  226.541776] RIP  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776]  RSP <ffff88003730dbd0>
[  226.612193] ---[ end trace 985ae23ddfcc357c ]---

Cc: Pablo Neira Ayuso <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Jozsef Kadlecsik <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Andrey Vagin <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
apxii pushed a commit to apxii/linux that referenced this pull request May 6, 2015
Added support for the DVB devices: DVBSky USB, TechoTrend CT2 4400 and TechnoTrend CT2 4650
apxii pushed a commit to apxii/linux that referenced this pull request May 6, 2015
…echnoTrend CT2 4650

See PR torvalds#101 for more information

Change-Id: I245214a6d2c37f6e7d365461e564d5a7374ea6c0
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Apr 4, 2016
In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Apr 27, 2016
In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request May 11, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request May 11, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
heftig referenced this pull request in zen-kernel/zen-kernel May 11, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request May 13, 2016
On Fri, May 13, 2016 at 07:23:34AM +0200, Marc Haber wrote:
> How do I apply this?

I'm attaching it.

$ patch -p1 --dry-run -i /tmp/01-mm-thp-calculate_the_mapcount_correctly_for_thp_pages_during_wp_faults.patch
checking file include/linux/mm.h
checking file include/linux/swap.h
checking file mm/huge_memory.c
checking file mm/memory.c
checking file mm/swapfile.c
$ patch -p1 -i /tmp/01-mm-thp-calculate_the_mapcount_correctly_for_thp_pages_during_wp_faults.patch
patching file include/linux/mm.h
patching file include/linux/swap.h
patching file mm/huge_memory.c
patching file mm/memory.c
patching file mm/swapfile.c

The --dry-run is to check whether it applies first.

That's on 4.6-rc7+ here.

HTH.

--
Regards/Gruss,
    Boris.

ECO tip torvalds#101: Trim your mails when you reply.

From [email protected] Tue May 10 21:21:32 2016
From: Andrea Arcangeli <[email protected]>
To: Andrew Morton <[email protected]>, [email protected],
 [email protected]
Cc: Alex Williamson <[email protected]>, "Kirill A. Shutemov"
 <[email protected]>
Subject: [PATCH 1/1] mm: thp: calculate the mapcount correctly for THP pages
 during WP faults
Date:	Tue, 10 May 2016 21:21:22 +0200
Message-Id: <[email protected]>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8
Status: RO

This will provide fully accuracy to the mapcount calculation in the
write protect faults, so page pinning will not get broken by false
positive copy-on-writes.

total_mapcount() isn't the right calculation needed in
reuse_swap_page(), so this introduces a page_trans_huge_mapcount()
that is effectively the full accurate return value for page_mapcount()
if dealing with Transparent Hugepages, however we only use the
page_trans_huge_mapcount() during COW faults where it strictly needed,
due to its higher runtime cost.

This also provide at practical zero cost the total_mapcount
information which is needed to know if we can still relocate the page
anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we
can reuse the page no matter if it's a pte or a pmd_trans_huge
triggering the fault, but we can only relocate the page anon_vma to
the local vma->anon_vma if we're sure it's only this "vma" mapping the
whole THP physical range.

Kirill A. Shutemov discovered the problem with moving the page
anon_vma to the local vma->anon_vma in a previous version of this
patch and another problem in the way page_move_anon_rmap() was called.

Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a
previous version, because reuse_swap_page must be a macro to call
page_trans_huge_mapcount from swap.h, so this uses a macro again
instead of an inline function. With this change at least it's a less
dangerous usage than it was before, because "page" is used only once
now, while with the previous code reuse_swap_page(page++) would have
called page_mapcount on page+1 and it would have increased page twice
instead of just once.

Reviewed-by: "Kirill A. Shutemov" <[email protected]>
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Ebru Akagunduz <[email protected]>
Cc: Geliang Tang <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Xie XiuQi <[email protected]>
Cc: linux-mm <[email protected]>
Cc: lkml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Noltari pushed a commit to Noltari/linux that referenced this pull request May 23, 2016
[ Upstream commit 07d2390 ]

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request May 23, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request May 23, 2016
[ Upstream commit 07d2390 ]

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
[ Upstream commit 07d2390 ]

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
[ Upstream commit 07d2390 ]

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <[email protected]>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request May 27, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Gregor Boirie <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jul 10, 2016
On 07/08/16 09:13, Petr Mladek wrote:
> On Thu 2016-07-07 20:27:13, Topi Miettinen wrote:
>> On 07/07/16 09:16, Petr Mladek wrote:
>>> On Sun 2016-07-03 15:08:07, Topi Miettinen wrote:
>>>> The attached patch would make any uses of capabilities generate audit
>>>> messages. It works for simple tests as you can see from the commit
>>>> message, but unfortunately the call to audit_cgroup_list() deadlocks the
>>>> system when booting a full blown OS. There's no deadlock when the call
>>>> is removed.
>>>>
>>>> I guess that in some cases, cgroup_mutex and/or css_set_lock could be
>>>> already held earlier before entering audit_cgroup_list(). Holding the
>>>> locks is however required by task_cgroup_from_root(). Is there any way
>>>> to avoid this? For example, only print some kind of cgroup ID numbers
>>>> (are there unique and stable IDs, available without locks?) for those
>>>> cgroups where the task is registered in the audit message?
>>>
>>> I am not sure if anyone know what really happens here. I suggest to
>>> enable lockdep. It might detect possible deadlock even before it
>>> really happens, see Documentation/locking/lockdep-design.txt
>>>
>>> It can be enabled by
>>>
>>>    CONFIG_PROVE_LOCKING=y
>>>
>>> It depends on
>>>
>>>     CONFIG_DEBUG_KERNEL=y
>>>
>>> and maybe some more options, see lib/Kconfig.debug
>>
>> Thanks a lot! I caught this stack dump:
>>
>> starting version 230
>> [    3.416647] ------------[ cut here ]------------
>> [    3.417310] WARNING: CPU: 0 PID: 95 at
>> /home/topi/d/linux.git/kernel/locking/lockdep.c:2871
>> lockdep_trace_alloc+0xb4/0xc0
>> [    3.417605] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
>> [    3.417923] Modules linked in:
>> [    3.418288] CPU: 0 PID: 95 Comm: systemd-udevd Not tainted 4.7.0-rc5+ torvalds#97
>> [    3.418444] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS Debian-1.8.2-1 04/01/2014
>> [    3.418726]  0000000000000086 000000007970f3b0 ffff88000016fb00
>> ffffffff813c9c45
>> [    3.418993]  ffff88000016fb50 0000000000000000 ffff88000016fb40
>> ffffffff81091e9b
>> [    3.419176]  00000b3705e2c798 0000000000000046 0000000000000410
>> 00000000ffffffff
>> [    3.419374] Call Trace:
>> [    3.419511]  [<ffffffff813c9c45>] dump_stack+0x67/0x92
>> [    3.419644]  [<ffffffff81091e9b>] __warn+0xcb/0xf0
>> [    3.419745]  [<ffffffff81091f1f>] warn_slowpath_fmt+0x5f/0x80
>> [    3.419868]  [<ffffffff810e9a84>] lockdep_trace_alloc+0xb4/0xc0
>> [    3.419988]  [<ffffffff8120dc42>] kmem_cache_alloc_node+0x42/0x600
>> [    3.420156]  [<ffffffff8110432d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
>> [    3.420170]  [<ffffffff8163183b>] __alloc_skb+0x5b/0x1d0
>> [    3.420170]  [<ffffffff81144f6b>] audit_log_start+0x29b/0x480
>> [    3.420170]  [<ffffffff810a2925>] ? __lock_task_sighand+0x95/0x270
>> [    3.420170]  [<ffffffff81145cc9>] audit_log_cap_use+0x39/0xf0
>> [    3.420170]  [<ffffffff8109cd75>] ns_capable+0x45/0x70
>> [    3.420170]  [<ffffffff8109cdb7>] capable+0x17/0x20
>> [    3.420170]  [<ffffffff812a2f50>] oom_score_adj_write+0x150/0x2f0
>> [    3.420170]  [<ffffffff81230997>] __vfs_write+0x37/0x160
>> [    3.420170]  [<ffffffff810e33b7>] ? update_fast_ctr+0x17/0x30
>> [    3.420170]  [<ffffffff810e3449>] ? percpu_down_read+0x49/0x90
>> [    3.420170]  [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
>> [    3.420170]  [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
>> [    3.420170]  [<ffffffff81231048>] vfs_write+0xb8/0x1b0
>> [    3.420170]  [<ffffffff812533c6>] ? __fget_light+0x66/0x90
>> [    3.420170]  [<ffffffff81232078>] SyS_write+0x58/0xc0
>> [    3.420170]  [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300
>> [    3.420170]  [<ffffffff81849c9a>] entry_SYSCALL64_slow_path+0x25/0x25
>> [    3.420170] ---[ end trace fb586899fb556a5e ]---
>> [    3.447922] random: systemd-udevd urandom read with 3 bits of entropy
>> available
>> [    4.014078] clocksource: Switched to clocksource tsc
>> Begin: Loading essential drivers ... done.
>>
>> This is with qemu and the boot continues normally. With real computer,
>> there's no such output and system just seems to freeze.
>>
>> Could it be possible that the deadlock happens because there's some IO
>> towards /sys/fs/cgroup, which causes a capability check and that in turn
>> causes locking problems when we try to print cgroup list?
>
> The above warning is printed by the code from
> kernel/locking/lockdep.c:2871
>
> static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags)
> {
> [...]
> 	/* We're only interested __GFP_FS allocations for now */
> 	if (!(gfp_mask & __GFP_FS))
> 		return;
>
> 	/*
> 	 * Oi! Can't be having __GFP_FS allocations with IRQs disabled.
> 	 */
> 	if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)))
> 		return;
>
>
> The backtrace shows that your new audit_log_cap_use() is called
> from vfs_write(). You might try to use audit_log_start() with
> GFP_NOFS instead of GFP_KERNEL.
>
> Note that this is rather intuitive advice. I still need to learn a lot
> about memory management and kernel in general to be more sure about
> a correct solution.
>
> Best Regards,
> Petr
>

With the attached patch, the system boots without deadlock. Locking
problems still remain:

[    3.652221] ======================================================
[    3.652221] [ INFO: possible circular locking dependency detected ]
[    3.652221] 4.7.0-rc5+ torvalds#101 Not tainted
[    3.652221] -------------------------------------------------------
[    3.652221] systemd/1 is trying to acquire lock:
[    3.652221]  (tasklist_lock){.+.+..}, at: [<ffffffff81137ddd>]
cgroup_mount+0x7ed/0xbc0
[    3.652221]
               but task is already holding lock:
[    3.652221]  (css_set_lock){......}, at: [<ffffffff81137c59>]
cgroup_mount+0x669/0xbc0
[    3.652221]
               which lock already depends on the new lock.

[    3.652221]
               the existing dependency chain (in reverse order) is:
[    3.652221]
               -> #3 (css_set_lock){......}:
[    3.652221]        [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0
[    3.652221]        [<ffffffff8184e137>] _raw_spin_lock_irq+0x37/0x50
[    3.652221]        [<ffffffff811374be>] cgroup_setup_root+0x19e/0x2d0
[    3.652221]        [<ffffffff821911fc>] cgroup_init+0xec/0x41d
[    3.652221]        [<ffffffff82171f68>] start_kernel+0x40c/0x465
[    3.652221]        [<ffffffff82171294>]
x86_64_start_reservations+0x2f/0x31
[    3.652221]        [<ffffffff8217140e>] x86_64_start_kernel+0x178/0x18b
[    3.652221]
               -> #2 (cgroup_mutex){+.+...}:
[    3.652221]        [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0
[    3.652221]        [<ffffffff8184af5f>] mutex_lock_nested+0x5f/0x350
[    3.652221]        [<ffffffff8113962a>] audit_cgroup_list+0x4a/0x2f0
[    3.652221]        [<ffffffff81145d19>] audit_log_cap_use+0xd9/0xf0
[    3.652221]        [<ffffffff8109cd75>] ns_capable+0x45/0x70
[    3.652221]        [<ffffffff8109cdb7>] capable+0x17/0x20
[    3.652221]        [<ffffffff812a2f00>] oom_score_adj_write+0x150/0x2f0
[    3.652221]        [<ffffffff81230947>] __vfs_write+0x37/0x160
[    3.652221]        [<ffffffff81230ff8>] vfs_write+0xb8/0x1b0
[    3.652221]        [<ffffffff81232028>] SyS_write+0x58/0xc0
[    3.652221]        [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300
[    3.652221]        [<ffffffff8184ea1a>] return_from_SYSCALL_64+0x0/0x7a
[    3.652221]
               -> #1 (&(&sighand->siglock)->rlock){+.+...}:
[    3.652221]        [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0
[    3.652221]        [<ffffffff8184dfc1>] _raw_spin_lock+0x31/0x40
[    3.652221]        [<ffffffff810901d9>]
copy_process.part.34+0x10f9/0x1b40
[    3.652221]        [<ffffffff81090e23>] _do_fork+0xf3/0x6b0
[    3.652221]        [<ffffffff81091409>] kernel_thread+0x29/0x30
[    3.652221]        [<ffffffff810b71d7>] kthreadd+0x187/0x1e0
[    3.652221]        [<ffffffff8184eb7f>] ret_from_fork+0x1f/0x40
[    3.652221]
               -> #0 (tasklist_lock){.+.+..}:
[    3.652221]        [<ffffffff810e8dfb>] __lock_acquire+0x13cb/0x1440
[    3.652221]        [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0
[    3.652221]        [<ffffffff8184e3f4>] _raw_read_lock+0x34/0x50
[    3.652221]        [<ffffffff81137ddd>] cgroup_mount+0x7ed/0xbc0
[    3.652221]        [<ffffffff81234d98>] mount_fs+0x38/0x170
[    3.652221]        [<ffffffff8125626b>] vfs_kern_mount+0x6b/0x150
[    3.652221]        [<ffffffff81258f8c>] do_mount+0x24c/0xe30
[    3.652221]        [<ffffffff81259ea5>] SyS_mount+0x95/0xe0
[    3.652221]        [<ffffffff8184e965>]
entry_SYSCALL_64_fastpath+0x18/0xa8
[    3.652221]
               other info that might help us debug this:

[    3.652221] Chain exists of:
                 tasklist_lock --> cgroup_mutex --> css_set_lock

[    3.652221]  Possible unsafe locking scenario:

[    3.652221]        CPU0                    CPU1
[    3.652221]        ----                    ----
[    3.652221]   lock(css_set_lock);
[    3.652221]                                lock(cgroup_mutex);
[    3.652221]                                lock(css_set_lock);
[    3.652221]   lock(tasklist_lock);
[    3.652221]
                *** DEADLOCK ***

[    3.652221] 1 lock held by systemd/1:
[    3.652221]  #0:  (css_set_lock){......}, at: [<ffffffff81137c59>]
cgroup_mount+0x669/0xbc0
[    3.652221]
               stack backtrace:
[    3.652221] CPU: 0 PID: 1 Comm: systemd Not tainted 4.7.0-rc5+ torvalds#101
[    3.652221] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Debian-1.8.2-1 04/01/2014
[    3.652221]  0000000000000086 0000000024b7c1ed ffff880006d13bb0
ffffffff813c9bf5
[    3.652221]  ffffffff829dbd20 ffffffff829cf2a0 ffff880006d13bf0
ffffffff810e60a3
[    3.652221]  ffff880006d13c30 ffff880006d067b0 ffff880006d06040
0000000000000001
[    3.652221] Call Trace:
[    3.652221]  [<ffffffff813c9bf5>] dump_stack+0x67/0x92
[    3.652221]  [<ffffffff810e60a3>] print_circular_bug+0x1e3/0x250
[    3.652221]  [<ffffffff810e8dfb>] __lock_acquire+0x13cb/0x1440
[    3.652221]  [<ffffffff81210bcd>] ? __kmalloc_track_caller+0x2bd/0x630
[    3.652221]  [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0
[    3.652221]  [<ffffffff81137ddd>] ? cgroup_mount+0x7ed/0xbc0
[    3.652221]  [<ffffffff8184e3f4>] _raw_read_lock+0x34/0x50
[    3.652221]  [<ffffffff81137ddd>] ? cgroup_mount+0x7ed/0xbc0
[    3.652221]  [<ffffffff81137ddd>] cgroup_mount+0x7ed/0xbc0
[    3.652221]  [<ffffffff810e5637>] ? lockdep_init_map+0x57/0x1f0
[    3.652221]  [<ffffffff81234d98>] mount_fs+0x38/0x170
[    3.652221]  [<ffffffff8125626b>] vfs_kern_mount+0x6b/0x150
[    3.652221]  [<ffffffff81258f8c>] do_mount+0x24c/0xe30
[    3.652221]  [<ffffffff812105bb>] ? kmem_cache_alloc_trace+0x28b/0x5e0
[    3.652221]  [<ffffffff811cc176>] ? strndup_user+0x46/0x80
[    3.652221]  [<ffffffff81259ea5>] SyS_mount+0x95/0xe0
[    3.652221]  [<ffffffff8184e965>] entry_SYSCALL_64_fastpath+0x18/0xa8

Rate limiting would not be a bad idea, there were 329 audit log entries
about capability use.

-Topi

From b857ec7d436f6fbb8c33e8d6a16d2f16175517d8 Mon Sep 17 00:00:00 2001
From: Topi Miettinen <[email protected]>
Date: Sun, 10 Jul 2016 11:45:30 +0300
Subject: [PATCH] cgroup: check options and capabilities before locking to
 avoid a deadlock

Signed-off-by: Topi Miettinen <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 19, 2016
WARNING: please, no spaces at the start of a line
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,$

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                  ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                        ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                               ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                                    ^

WARNING: ENOSYS means 'invalid syscall nr' and nothing else
torvalds#115: FILE: kernel/sysctl.c:2899:
+	return -ENOSYS;

total: 4 errors, 2 warnings, 74 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/sysctl-handle-error-writing-uint_max-to-u32-fields.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 19, 2016
WARNING: please, no spaces at the start of a line
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,$

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                  ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                        ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                               ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                                    ^

WARNING: ENOSYS means 'invalid syscall nr' and nothing else
torvalds#115: FILE: kernel/sysctl.c:2899:
+	return -ENOSYS;

total: 4 errors, 2 warnings, 74 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/sysctl-handle-error-writing-uint_max-to-u32-fields.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
koct9i pushed a commit to koct9i/linux that referenced this pull request Aug 21, 2016
WARNING: please, no spaces at the start of a line
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,$

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                  ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                        ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                               ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                                    ^

WARNING: ENOSYS means 'invalid syscall nr' and nothing else
torvalds#115: FILE: kernel/sysctl.c:2899:
+	return -ENOSYS;

total: 4 errors, 2 warnings, 74 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/sysctl-handle-error-writing-uint_max-to-u32-fields.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 22, 2016
WARNING: please, no spaces at the start of a line
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,$

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                  ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                        ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                               ^

ERROR: space required after that ',' (ctx:VxV)
torvalds#101: FILE: kernel/sysctl.c:2297:
+    return do_proc_dointvec(table,write,buffer,lenp,ppos,
                                                    ^

WARNING: ENOSYS means 'invalid syscall nr' and nothing else
torvalds#115: FILE: kernel/sysctl.c:2899:
+	return -ENOSYS;

total: 4 errors, 2 warnings, 74 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/sysctl-handle-error-writing-uint_max-to-u32-fields.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
zhijianli88 pushed a commit to zhijianli88/linux that referenced this pull request Dec 6, 2022
For leaf dir, In most cases, there should be as many bestfree slots
as the dir data blocks that can fit under i_size (except for [1]).

Root cause is we don't examin the number bestfree slots, when the slots
number less than dir data blocks, if we need to allocate new dir data
block and update the bestfree array, we will use the dir block number as
index to assign bestfree array, while we did not check the leaf buf
boundary which may cause UAF or other memory access problem. This issue
can also triggered with test cases xfs/473 from fstests.

According to Dave Chinner & Darrick's suggestion, adding buffer verifier
to detect this abnormal situation in time.
Simplify the testcase for fstest xfs/554 [1]

The error log is shown as follows:
==================================================================
BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0
Write of size 2 at addr ffff88810168b000 by task touch/1552
CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ torvalds#101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report.cold+0xf6/0x691
 kasan_report+0xa8/0x120
 xfs_dir2_leaf_addname+0x1995/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000
 </TASK>

The buggy address belongs to the physical page:
page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x10168b
flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000
raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78
XDD3[S5........x
XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file
fs/xfs/libxfs/xfs_dir2_data.c.  Caller
xfs_dir2_data_use_free+0x28a/0xeb0
CPU: 5 PID: 1552 Comm: touch Tainted: G    B              6.0.0-rc3+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 xfs_corruption_error+0x132/0x150
 xfs_dir2_data_use_free+0x198/0xeb0
 xfs_dir2_leaf_addname+0xa59/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000
 </TASK>
XFS (sdb): Corruption detected. Unmount and run xfs_repair

[1] https://lore.kernel.org/all/[email protected]/
Reviewed-by: Hou Tao <[email protected]>
Signed-off-by: Guo Xuenan <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Jan 30, 2023
Ilya Leoshkevich says:

====================

v2: https://lore.kernel.org/bpf/[email protected]/#t
v2 -> v3:
- Make __arch_prepare_bpf_trampoline static.
  (Reported-by: kernel test robot <[email protected]>)
- Support both old- and new- style map definitions in sk_assign. (Alexei)
- Trim DENYLIST.s390x. (Alexei)
- Adjust s390x vmlinux path in vmtest.sh.
- Drop merged fixes.

v1: https://lore.kernel.org/bpf/[email protected]/#t
v1 -> v2:
- Fix core_read_macros, sk_assign, test_profiler, test_bpffs (24/31;
  I'm not quite happy with the fix, but don't have better ideas),
  and xdp_synproxy. (Andrii)
- Prettify liburandom_read and verify_pkcs7_sig fixes. (Andrii)
- Fix bpf_usdt_arg using barrier_var(); prettify barrier_var(). (Andrii)
- Change BPF_MAX_TRAMP_LINKS to enum and query it using BTF. (Andrii)
- Improve bpf_jit_supports_kfunc_call() description. (Alexei)
- Always check sign_extend() return value.
- Cc: Alexander Gordeev.

Hi,

This series implements poke, trampoline, kfunc, and mixing subprogs
and tailcalls on s390x.

The following failures still remain:

torvalds#82      get_stack_raw_tp:FAIL
get_stack_print_output:FAIL:user_stack corrupted user stack
Known issue:
We cannot reliably unwind userspace on s390x without DWARF.

torvalds#101     ksyms_module:FAIL
address of kernel function bpf_testmod_test_mod_kfunc is out of range
Known issue:
Kernel and modules are too far away from each other on s390x.

torvalds#190     stacktrace_build_id:FAIL
Known issue:
We cannot reliably unwind userspace on s390x without DWARF.

torvalds#281     xdp_metadata:FAIL
See patch 6.

None of these seem to be due to the new changes.

Best regards,
Ilya
====================

Signed-off-by: Alexei Starovoitov <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Mar 17, 2023
Currently, test_progs outputs all stdout/stderr as it runs, and when it
is done, prints a summary.

It is non-trivial for tooling to parse that output and extract meaningful
information from it.

This change adds a new option, `--json-summary`/`-J` that let the caller
specify a file where `test_progs{,-no_alu32}` can write a summary of the
run in a json format that can later be parsed by tooling.

Currently, it creates a summary section with successes/skipped/failures
followed by a list of failed tests and subtests.

A test contains the following fields:
- name: the name of the test
- number: the number of the test
- message: the log message that was printed by the test.
- failed: A boolean indicating whether the test failed or not. Currently
we only output failed tests, but in the future, successful tests could
be added.
- subtests: A list of subtests associated with this test.

A subtest contains the following fields:
- name: same as above
- number: sanme as above
- message: the log message that was printed by the subtest.
- failed: same as above but for the subtest

An example run and json content below:
```
$ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print
$1","}' | tr -d '\n') -j -J /tmp/test_progs.json
$ jq < /tmp/test_progs.json | head -n 30
{
  "success": 29,
  "success_subtest": 23,
  "skipped": 3,
  "failed": 28,
  "results": [
    {
      "name": "bpf_cookie",
      "number": 10,
      "message": "test_bpf_cookie:PASS:skel_open 0 nsec\n",
      "failed": true,
      "subtests": [
        {
          "name": "multi_kprobe_link_api",
          "number": 2,
          "message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0
nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not
resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed
to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
          "failed": true
        },
        {
          "name": "multi_kprobe_attach_api",
          "number": 3,
          "message": "libbpf: extern 'bpf_testmod_fentry_test1'
(strong): not resolved\nlibbpf: failed to load object
'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
          "failed": true
        },
        {
          "name": "lsm",
          "number": 8,
          "message": "lsm_subtest:PASS:lsm.link_create 0
nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual
0 != expected -1\n",
          "failed": true
        }
```

The file can then be used to print a summary of the test run and list of
failing tests/subtests:

```
$ jq -r < /tmp/test_progs.json '"Success:
\(.success)/\(.success_subtest), Skipped: \(.skipped), Failed:
\(.failed)"'

Success: 29/23, Skipped: 3, Failed: 28
$ jq -r < /tmp/test_progs.json '.results | map([
    if .failed then "#\(.number) \(.name)" else empty end,
    (
        . as {name: $tname, number: $tnum} | .subtests | map(
            if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)"
else empty end
        )
    )
]) | flatten | .[]' | head -n 20
 torvalds#10 bpf_cookie
 torvalds#10/2 bpf_cookie/multi_kprobe_link_api
 torvalds#10/3 bpf_cookie/multi_kprobe_attach_api
 torvalds#10/8 bpf_cookie/lsm
 torvalds#15 bpf_mod_race
 torvalds#15/1 bpf_mod_race/ksym (used_btfs UAF)
 torvalds#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF)
 torvalds#36 cgroup_hierarchical_stats
 torvalds#61 deny_namespace
 torvalds#61/1 deny_namespace/unpriv_userns_create_no_bpf
 torvalds#73 fexit_stress
 torvalds#83 get_func_ip_test
 torvalds#99 kfunc_dynptr_param
 torvalds#99/1 kfunc_dynptr_param/dynptr_data_null
 torvalds#99/4 kfunc_dynptr_param/dynptr_data_null
 torvalds#100 kprobe_multi_bench_attach
 torvalds#100/1 kprobe_multi_bench_attach/kernel
 torvalds#100/2 kprobe_multi_bench_attach/modules
 torvalds#101 kprobe_multi_test
 torvalds#101/1 kprobe_multi_test/skel_api
```

Signed-off-by: Manu Bretelle <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Mar 17, 2023
Currently, test_progs outputs all stdout/stderr as it runs, and when it
is done, prints a summary.

It is non-trivial for tooling to parse that output and extract meaningful
information from it.

This change adds a new option, `--json-summary`/`-J` that let the caller
specify a file where `test_progs{,-no_alu32}` can write a summary of the
run in a json format that can later be parsed by tooling.

Currently, it creates a summary section with successes/skipped/failures
followed by a list of failed tests and subtests.

A test contains the following fields:
- name: the name of the test
- number: the number of the test
- message: the log message that was printed by the test.
- failed: A boolean indicating whether the test failed or not. Currently
we only output failed tests, but in the future, successful tests could
be added.
- subtests: A list of subtests associated with this test.

A subtest contains the following fields:
- name: same as above
- number: sanme as above
- message: the log message that was printed by the subtest.
- failed: same as above but for the subtest

An example run and json content below:
```
$ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print
$1","}' | tr -d '\n') -j -J /tmp/test_progs.json
$ jq < /tmp/test_progs.json | head -n 30
{
  "success": 29,
  "success_subtest": 23,
  "skipped": 3,
  "failed": 28,
  "results": [
    {
      "name": "bpf_cookie",
      "number": 10,
      "message": "test_bpf_cookie:PASS:skel_open 0 nsec\n",
      "failed": true,
      "subtests": [
        {
          "name": "multi_kprobe_link_api",
          "number": 2,
          "message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0 nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n",
          "failed": true
        },
        {
          "name": "multi_kprobe_attach_api",
          "number": 3,
          "message": "libbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n",
          "failed": true
        },
        {
          "name": "lsm",
          "number": 8,
          "message": "lsm_subtest:PASS:lsm.link_create 0 nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual 0 != expected -1\n",
          "failed": true
        }
```

The file can then be used to print a summary of the test run and list of
failing tests/subtests:

```
$ jq -r < /tmp/test_progs.json '"Success: \(.success)/\(.success_subtest), Skipped: \(.skipped), Failed: \(.failed)"'

Success: 29/23, Skipped: 3, Failed: 28
$ jq -r < /tmp/test_progs.json '.results | map([
    if .failed then "#\(.number) \(.name)" else empty end,
    (
        . as {name: $tname, number: $tnum} | .subtests | map(
            if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)" else empty end
        )
    )
]) | flatten | .[]' | head -n 20
 torvalds#10 bpf_cookie
 torvalds#10/2 bpf_cookie/multi_kprobe_link_api
 torvalds#10/3 bpf_cookie/multi_kprobe_attach_api
 torvalds#10/8 bpf_cookie/lsm
 torvalds#15 bpf_mod_race
 torvalds#15/1 bpf_mod_race/ksym (used_btfs UAF)
 torvalds#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF)
 torvalds#36 cgroup_hierarchical_stats
 torvalds#61 deny_namespace
 torvalds#61/1 deny_namespace/unpriv_userns_create_no_bpf
 torvalds#73 fexit_stress
 torvalds#83 get_func_ip_test
 torvalds#99 kfunc_dynptr_param
 torvalds#99/1 kfunc_dynptr_param/dynptr_data_null
 torvalds#99/4 kfunc_dynptr_param/dynptr_data_null
 torvalds#100 kprobe_multi_bench_attach
 torvalds#100/1 kprobe_multi_bench_attach/kernel
 torvalds#100/2 kprobe_multi_bench_attach/modules
 torvalds#101 kprobe_multi_test
 torvalds#101/1 kprobe_multi_test/skel_api
```

Signed-off-by: Manu Bretelle <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 28, 2023
[ Upstream commit 13cf24e ]

For leaf dir, In most cases, there should be as many bestfree slots
as the dir data blocks that can fit under i_size (except for [1]).

Root cause is we don't examin the number bestfree slots, when the slots
number less than dir data blocks, if we need to allocate new dir data
block and update the bestfree array, we will use the dir block number as
index to assign bestfree array, while we did not check the leaf buf
boundary which may cause UAF or other memory access problem. This issue
can also triggered with test cases xfs/473 from fstests.

According to Dave Chinner & Darrick's suggestion, adding buffer verifier
to detect this abnormal situation in time.
Simplify the testcase for fstest xfs/554 [1]

The error log is shown as follows:
==================================================================
BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0
Write of size 2 at addr ffff88810168b000 by task touch/1552
CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ torvalds#101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report.cold+0xf6/0x691
 kasan_report+0xa8/0x120
 xfs_dir2_leaf_addname+0x1995/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000
 </TASK>

The buggy address belongs to the physical page:
page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x10168b
flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000
raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78
XDD3[S5........x
XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file
fs/xfs/libxfs/xfs_dir2_data.c.  Caller
xfs_dir2_data_use_free+0x28a/0xeb0
CPU: 5 PID: 1552 Comm: touch Tainted: G    B              6.0.0-rc3+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 xfs_corruption_error+0x132/0x150
 xfs_dir2_data_use_free+0x198/0xeb0
 xfs_dir2_leaf_addname+0xa59/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000
 </TASK>
XFS (sdb): Corruption detected. Unmount and run xfs_repair

[1] https://lore.kernel.org/all/[email protected]/
Reviewed-by: Hou Tao <[email protected]>
Signed-off-by: Guo Xuenan <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
Signed-off-by: Leah Rumancik <[email protected]>
Acked-by: Chandan Babu R <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 4, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
prabhakarlad pushed a commit to prabhakarlad/linux that referenced this pull request Dec 4, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 8, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
prabhakarlad pushed a commit to prabhakarlad/linux that referenced this pull request Dec 11, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
prabhakarlad pushed a commit to prabhakarlad/linux that referenced this pull request Dec 19, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
prabhakarlad pushed a commit to prabhakarlad/linux that referenced this pull request Dec 25, 2023
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
gestionlin pushed a commit to gestionlin/linux that referenced this pull request Jan 7, 2024
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jan 8, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
cynthia0928 pushed a commit to andestech/linux that referenced this pull request Jan 9, 2024
…cise exception (torvalds#101)

To print error messages for imprecise exception, extract the field sdcause
in the CSR sdcause instead of reading the whole register.

Signed-off-by: Ben Zong-You Xie <[email protected]>

Reviewed-on: https://gitea.andestech.com/RD-SW/linux/pulls/101
Reviewed-by: Tim Shih-Ting OuYang <[email protected]>
Reviewed-by: Leo Yu-Chi Liang <[email protected]>
Reviewed-by: Dylan Dai-Rong Jhong <[email protected]>
Co-authored-by: Ben Zong-You Xie <[email protected]>
Co-committed-by: Ben Zong-You Xie <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 10, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
hjl-tools pushed a commit to hjl-tools/linux that referenced this pull request Jan 10, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 10, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 15, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 15, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 15, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 15, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024
sched_ext: fix race in scx_move_task() with exiting tasks
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Jan 19, 2024
Why:
    The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
    caused system hang.
    [  557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
    [  557.373718] [drm] PCIE GART of 512M enabled.
    [  557.373722] [drm] PTB located at 0x0000031FED700000
    [  557.373788] [drm] VRAM is lost due to GPU reset!
    [  557.373789] [drm] PSP is resuming...
    [  557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
    [  557.547067] [drm] PCI error: detected callback, state(1)!!
    [  557.547069] [drm] No support for XGMI hive yet...
    [  557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
    [  557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
    [  557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
    [  557.610492] [drm] PCI error: slot reset callback!!
    ...
    [  560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
    [  560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G           OE     5.15.0-91-generic torvalds#101-Ubuntu
    [  560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
    [  560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
    [  560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
    [  560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
    [  560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
    [  560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
    [  560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
    [  560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
    [  560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
    [  560.803889] FS:  0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
    [  560.812973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
    [  560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [  560.843444] PKRU: 55555554
    [  560.846480] Call Trace:
    [  560.849225]  <TASK>
    [  560.851580]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.856488]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.861379]  ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.867778]  ? show_regs.part.0+0x23/0x29
    [  560.872293]  ? __die_body.cold+0x8/0xd
    [  560.876502]  ? die_addr+0x3e/0x60
    [  560.880238]  ? exc_general_protection+0x1c5/0x410
    [  560.885532]  ? asm_exc_general_protection+0x27/0x30
    [  560.891025]  ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.898323]  amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.904520]  process_one_work+0x228/0x3d0
How:
    In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
    all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
IsaiahStapleton pushed a commit to IsaiahStapleton/linux that referenced this pull request Mar 8, 2024
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
Kaz205 pushed a commit to Kaz205/linux that referenced this pull request Apr 12, 2024
[ Upstream commit 601429c ]

Why:
    The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
    caused system hang.
    [  557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
    [  557.373718] [drm] PCIE GART of 512M enabled.
    [  557.373722] [drm] PTB located at 0x0000031FED700000
    [  557.373788] [drm] VRAM is lost due to GPU reset!
    [  557.373789] [drm] PSP is resuming...
    [  557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
    [  557.547067] [drm] PCI error: detected callback, state(1)!!
    [  557.547069] [drm] No support for XGMI hive yet...
    [  557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
    [  557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
    [  557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
    [  557.610492] [drm] PCI error: slot reset callback!!
    ...
    [  560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
    [  560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G           OE     5.15.0-91-generic torvalds#101-Ubuntu
    [  560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
    [  560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
    [  560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
    [  560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
    [  560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
    [  560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
    [  560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
    [  560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
    [  560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
    [  560.803889] FS:  0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
    [  560.812973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
    [  560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [  560.843444] PKRU: 55555554
    [  560.846480] Call Trace:
    [  560.849225]  <TASK>
    [  560.851580]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.856488]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.861379]  ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.867778]  ? show_regs.part.0+0x23/0x29
    [  560.872293]  ? __die_body.cold+0x8/0xd
    [  560.876502]  ? die_addr+0x3e/0x60
    [  560.880238]  ? exc_general_protection+0x1c5/0x410
    [  560.885532]  ? asm_exc_general_protection+0x27/0x30
    [  560.891025]  ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.898323]  amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.904520]  process_one_work+0x228/0x3d0
How:
    In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
    all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Apr 13, 2024
[ Upstream commit 601429c ]

Why:
    The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
    caused system hang.
    [  557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
    [  557.373718] [drm] PCIE GART of 512M enabled.
    [  557.373722] [drm] PTB located at 0x0000031FED700000
    [  557.373788] [drm] VRAM is lost due to GPU reset!
    [  557.373789] [drm] PSP is resuming...
    [  557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
    [  557.547067] [drm] PCI error: detected callback, state(1)!!
    [  557.547069] [drm] No support for XGMI hive yet...
    [  557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
    [  557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
    [  557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
    [  557.610492] [drm] PCI error: slot reset callback!!
    ...
    [  560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
    [  560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G           OE     5.15.0-91-generic torvalds#101-Ubuntu
    [  560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
    [  560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
    [  560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
    [  560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
    [  560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
    [  560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
    [  560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
    [  560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
    [  560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
    [  560.803889] FS:  0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
    [  560.812973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
    [  560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [  560.843444] PKRU: 55555554
    [  560.846480] Call Trace:
    [  560.849225]  <TASK>
    [  560.851580]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.856488]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.861379]  ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.867778]  ? show_regs.part.0+0x23/0x29
    [  560.872293]  ? __die_body.cold+0x8/0xd
    [  560.876502]  ? die_addr+0x3e/0x60
    [  560.880238]  ? exc_general_protection+0x1c5/0x410
    [  560.885532]  ? asm_exc_general_protection+0x27/0x30
    [  560.891025]  ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.898323]  amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.904520]  process_one_work+0x228/0x3d0
How:
    In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
    all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
paralin pushed a commit to skiffos/linux that referenced this pull request Apr 14, 2024
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Lad Prabhakar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
(cherry picked from commit 1036f69)
Signed-off-by: Cristian Ciocaltea <[email protected]>
john-cabaj pushed a commit to UbuntuAsahi/linux that referenced this pull request Jun 17, 2024
BugLink: https://bugs.launchpad.net/bugs/2065899

[ Upstream commit 601429c ]

Why:
    The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
    caused system hang.
    [  557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
    [  557.373718] [drm] PCIE GART of 512M enabled.
    [  557.373722] [drm] PTB located at 0x0000031FED700000
    [  557.373788] [drm] VRAM is lost due to GPU reset!
    [  557.373789] [drm] PSP is resuming...
    [  557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
    [  557.547067] [drm] PCI error: detected callback, state(1)!!
    [  557.547069] [drm] No support for XGMI hive yet...
    [  557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
    [  557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
    [  557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
    [  557.610492] [drm] PCI error: slot reset callback!!
    ...
    [  560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
    [  560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G           OE     5.15.0-91-generic torvalds#101-Ubuntu
    [  560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
    [  560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
    [  560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
    [  560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
    [  560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
    [  560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
    [  560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
    [  560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
    [  560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
    [  560.803889] FS:  0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
    [  560.812973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
    [  560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [  560.843444] PKRU: 55555554
    [  560.846480] Call Trace:
    [  560.849225]  <TASK>
    [  560.851580]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.856488]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.861379]  ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.867778]  ? show_regs.part.0+0x23/0x29
    [  560.872293]  ? __die_body.cold+0x8/0xd
    [  560.876502]  ? die_addr+0x3e/0x60
    [  560.880238]  ? exc_general_protection+0x1c5/0x410
    [  560.885532]  ? asm_exc_general_protection+0x27/0x30
    [  560.891025]  ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.898323]  amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.904520]  process_one_work+0x228/0x3d0
How:
    In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
    all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants