Kernel Panic "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010" because of "runc:[2:INIT]" #1857

s-r-engineer · 2018-08-06T04:53:11Z

Hi.

We worked on Centos 7.4-1708 (kernel 3.10.0-693.11.6) and Docker CE 17.12.0.ce-1.
After system crash we reinstalled OS to Centos 7.5-1804 (kernel 3.10.0-862.9.1) and Docker 18.06.0.ce-3.
Because of that system crash (human factor btw) we started to reindexing ElasticSearch shards (a lot of shards).
And in this circumstances we have had this bug:

SYS

KERNEL: /usr/lib/debug/lib/modules/3.10.0-862.9.1.el7.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 40
DATE: Thu Aug 2 20:42:21 2018
UPTIME: 13:00:49
LOAD AVERAGE: 5.42, 6.01, 6.93
TASKS: 1493
NODENAME: tech-node2-dcdp
RELEASE: 3.10.0-862.9.1.el7.x86_64
VERSION: #1 SMP Mon Jul 16 16:29:36 UTC 2018
MACHINE: x86_64 (2200 Mhz)
MEMORY: 126.6 GB
PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010"
PID: 191240
COMMAND: "runc:[2:INIT]"
TASK: ffff9780a738af70 [THREAD_INFO: ffff97743cb5c000]
CPU: 16
STATE: TASK_RUNNING (PANIC)

Backtrace

PID: 191240  TASK: ffff9780a738af70  CPU: 16  COMMAND: "runc:[2:INIT]"
 #0 [ffff97743cb5f9f8] machine_kexec at ffffffff9666178a
 #1 [ffff97743cb5fa58] __crash_kexec at ffffffff96713bf2
 #2 [ffff97743cb5fb28] crash_kexec at ffffffff96713ce0
 #3 [ffff97743cb5fb40] oops_end at ffffffff96d18738
 #4 [ffff97743cb5fb68] no_context at ffffffff96d0807e
 #5 [ffff97743cb5fbb8] __bad_area_nosemaphore at ffffffff96d08115
 #6 [ffff97743cb5fc08] bad_area_nosemaphore at ffffffff96d08286
 #7 [ffff97743cb5fc18] __do_page_fault at ffffffff96d1b6f0
 #8 [ffff97743cb5fc80] do_page_fault at ffffffff96d1b8e5
 #9 [ffff97743cb5fcb0] page_fault at ffffffff96d17758
    [exception RIP: strchr]
    RIP: ffffffff96954850  RSP: ffff97743cb5fd60  RFLAGS: 00010202
    RAX: 0000000000000001  RBX: ffff9774c5227000  RCX: 0000000000000000
    RDX: 0000000000000008  RSI: 000000000000002c  RDI: 0000000000000010
    RBP: ffff97743cb5fdb8   R8: 000038747d0249f0   R9: ffff9774c5227000
    R10: ffff977579eff800  R11: ffff97743cb5fc96  R12: ffff978514eea480
    R13: ffff9774c5227000  R14: 0000000000000001  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff97743cb5fd60] selinux_sb_show_options at ffffffff968dbbb5
#11 [ffff97743cb5fdc0] security_sb_show_options at ffffffff968d0c2c
#12 [ffff97743cb5fdd0] show_sb_opts at ffffffff9686077c
#13 [ffff97743cb5fdf8] show_mountinfo at ffffffff96860d3b
#14 [ffff97743cb5fe58] m_show at ffffffff9683c219
#15 [ffff97743cb5fe68] seq_read at ffffffff96841b80
#16 [ffff97743cb5fed8] vfs_read at ffffffff9681b2ff
#17 [ffff97743cb5ff08] sys_read at ffffffff9681c1cf
#18 [ffff97743cb5ff50] system_call_fastpath at ffffffff96d20795
    RIP: 0000558110c73710  RSP: 000000c4200a3c58  RFLAGS: 00000216
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffffffffff
    RDX: 0000000000001000  RSI: 000000c42011b000  RDI: 0000000000000004
    RBP: 000000c4200a3a68   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000202  R12: ffffffffffffffff
    R13: 0000000000000002  R14: 0000000000000001  R15: 0000000000000002
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

LOG

[46658.106388] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.106390]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.106391]   node 0: slabs: 164, objs: 10368, free: 0
[46658.106392]   node 1: slabs: 53, objs: 3392, free: 88
[46658.106414] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.106416]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.106417]   node 0: slabs: 164, objs: 10368, free: 0
[46658.106419]   node 1: slabs: 53, objs: 3392, free: 88
[46658.106453] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.106455]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.106456]   node 0: slabs: 164, objs: 10368, free: 0
[46658.106458]   node 1: slabs: 53, objs: 3392, free: 88
[46658.132214] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.132217]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.132219]   node 0: slabs: 164, objs: 10368, free: 0
[46658.132220]   node 1: slabs: 53, objs: 3392, free: 88
[46658.132611] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.132613]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.132614]   node 0: slabs: 164, objs: 10368, free: 0
[46658.132616]   node 1: slabs: 53, objs: 3392, free: 88
[46658.132635] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.132637]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.132638]   node 0: slabs: 164, objs: 10368, free: 0
[46658.132639]   node 1: slabs: 53, objs: 3392, free: 88
[46658.132657] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46658.132658]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46658.132660]   node 0: slabs: 164, objs: 10368, free: 0
[46658.132661]   node 1: slabs: 53, objs: 3392, free: 88
[46725.383816] ___slab_alloc: 1 callbacks suppressed
[46725.383820] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46725.383822]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46725.383824]   node 0: slabs: 169, objs: 10688, free: 0
[46725.383827]   node 1: slabs: 53, objs: 3392, free: 88
[46725.383867] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[46725.383869]   cache: kmalloc-128(4:9d0ace7f06cbe7f359d76e723f38758818620843970c59bb68e823c817fe1f35), object size: 128, buffer size: 128, default order: 1, min order: 0
[46725.383870]   node 0: slabs: 169, objs: 10688, free: 0
[46725.383872]   node 1: slabs: 53, objs: 3392, free: 88
[46739.898316] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[46739.898350] IP: [<ffffffff96954850>] strchr+0x0/0x40
[46739.898371] PGD 0 
[46739.898379] Oops: 0000 [#1] SMP 
[46739.898393] Modules linked in: xt_REDIRECT nf_nat_redirect ip_vs_rr xt_ipvs ip_vs xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xt_addrtype br_netfilter overlay(T) ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dcdbas iTCO_wdt vfat iTCO_vendor_support fat skx_edac intel_powerclamp coretemp intel_rapl xfs iosf_mbi kvm_intel kvm irqbypass libcrc32c crc32_pclmul
[46739.898664]  ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif pcspkr sg mei_me mei i2c_i801 lpc_ich shpchp ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_pad acpi_power_meter ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel ixgbe ahci drm libahci tg3 libata megaraid_sas mdio dca ptp i2c_core pps_core
[46739.898844] CPU: 16 PID: 191240 Comm: runc:[2:INIT] Kdump: loaded Tainted: G               ------------ T 3.10.0-862.9.1.el7.x86_64 #1
[46739.898878] Hardware name: Dell Inc. PowerEdge R440/0N28XX, BIOS 1.2.11 10/20/2017
[46739.898900] task: ffff9780a738af70 ti: ffff97743cb5c000 task.ti: ffff97743cb5c000
[46739.898922] RIP: 0010:[<ffffffff96954850>]  [<ffffffff96954850>] strchr+0x0/0x40
[46739.898948] RSP: 0018:ffff97743cb5fd60  EFLAGS: 00010202
[46739.898964] RAX: 0000000000000001 RBX: ffff9774c5227000 RCX: 0000000000000000
[46739.898984] RDX: 0000000000000008 RSI: 000000000000002c RDI: 0000000000000010
[46739.899004] RBP: ffff97743cb5fdb8 R08: 000038747d0249f0 R09: ffff9774c5227000
[46739.899025] R10: ffff977579eff800 R11: ffff97743cb5fc96 R12: ffff978514eea480
[46739.899044] R13: ffff9774c5227000 R14: 0000000000000001 R15: 0000000000000000
[46739.899065] FS:  00007fe7ad864740(0000) GS:ffff97757dc00000(0000) knlGS:0000000000000000
[46739.899088] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[46739.899105] CR2: 0000000000000010 CR3: 000000045c318000 CR4: 00000000007607e0
[46739.899143] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[46739.899165] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[46739.899186] PKRU: 55555554
[46739.899196] Call Trace:
[46739.899209]  [<ffffffff968dbbb5>] ? selinux_sb_show_options+0x135/0x470
[46739.899234]  [<ffffffff968d0c2c>] security_sb_show_options+0x1c/0x20
[46739.899267]  [<ffffffff9686077c>] show_sb_opts+0x5c/0x70
[46739.899284]  [<ffffffff96860d3b>] show_mountinfo+0x20b/0x2d0
[46739.899304]  [<ffffffff9683c219>] m_show+0x19/0x20
[46739.899319]  [<ffffffff96841b80>] seq_read+0x110/0x3f0
[46739.899337]  [<ffffffff9681b2ff>] vfs_read+0x9f/0x170
[46739.899353]  [<ffffffff9681c1cf>] SyS_read+0x7f/0xf0
[46739.899372]  [<ffffffff96d20795>] system_call_fastpath+0x1c/0x21
[46739.899390] Code: c1 75 18 48 83 c0 01 84 c9 74 05 48 39 d0 75 e3 31 c0 5d c3 0f 1f 80 00 00 00 00 44 38 c1 19 c0 83 c8 01 5d c3 66 0f 1f 44 00 00 <0fb6 07 55 48 89 e5 40 38 f0 74 1b 84 c0 89 f2 75 0a eb 1c 0f 
[46739.900319] RIP  [<ffffffff96954850>] strchr+0x0/0x40
[46739.901106]  RSP <ffff97743cb5fd60>
[46739.901846] CR2: 0000000000000010

Situation is very similar to this in case of reproducing the problem.
After all we just downgraded our kernel and docker to last known working configuration (kernel 3.10.0-693.11.6 and Docker CE 17.12.0.ce-1) and all works fine.
#1725

The text was updated successfully, but these errors were encountered:

cyphar · 2018-08-06T06:04:08Z

It looks like something in SELinux is triggering this -- in particular it looks like runc does a cat /proc/mountinfo which SELinux then fails to fill the mount options for. Looking into the code itself, it looks like SELinux is checking whether there is a comma in its internal mount options to decide whether it should quote it (from security/selinux/hooks.c):

static void selinux_write_opts(struct seq_file *m,
			       struct security_mnt_opts *opts)
{
	int i;
	char *prefix;

	for (i = 0; i < opts->num_mnt_opts; i++) {
		char *has_comma;

		if (opts->mnt_opts[i])
			has_comma = strchr(opts->mnt_opts[i], ','); /* <<< panic is here <<< */
		else
			has_comma = NULL;

		switch (opts->mnt_opts_flags[i]) {
		case CONTEXT_MNT:
			prefix = CONTEXT_STR;
			break;
		case FSCONTEXT_MNT:
			prefix = FSCONTEXT_STR;
			break;
		case ROOTCONTEXT_MNT:
			prefix = ROOTCONTEXT_STR;
			break;
		case DEFCONTEXT_MNT:
			prefix = DEFCONTEXT_STR;
			break;
		case SBLABEL_MNT:
			seq_putc(m, ',');
			seq_puts(m, LABELSUPP_STR);
			continue;
		default:
			BUG();
			return;
		};
		/* we need a comma before each option */
		seq_putc(m, ',');
		seq_puts(m, prefix);
		if (has_comma)
			seq_putc(m, '\"');
		seq_escape(m, opts->mnt_opts[i], "\"\n\\");
		if (has_comma)
			seq_putc(m, '\"');
	}
}

Where the line marked is where the panic appears to be.

/cc @rhatdan

s-r-engineer · 2018-08-06T06:33:56Z

@cyphar
Yeap. We tried configuration with disabled SELinux. All the same.

s-r-engineer · 2018-08-06T06:37:40Z

But to be honest that run (without SELinux) we couldn't catch. We have had bad dump and I not completely sure what exactly happens.

cyphar · 2018-08-06T07:26:01Z

@PPerminov

We tried configuration with disabled SELinux. All the same.

How did you disable it? setenforce 0 or by actually switching LSMs and rebooting? I ask because it's possible that the same codepath would be exercised even if you disable SELinux with setenforce (as far as I know the LSM hooks aren't de-registered at runtime).

s-r-engineer · 2018-08-06T08:22:29Z

@cyphar It was changed in config and rebooted.

s-r-engineer · 2018-08-15T05:50:54Z

@cyphar
Hi
Centos Bug Report

benamazing · 2019-07-23T04:55:11Z

Hi, is the problem solved? I encountered the same bug as well, and the kernel version is also 3.10.0-862.9.1.el7.x86_64

AkihiroSuda · 2020-03-09T09:48:46Z

This should be reported to the (distro) kernel maintainers.

If only RHEL 7 users are hitting this, this should be reported to Red Hat.

s-r-engineer mentioned this issue Aug 6, 2018

Enabling kmem accounting can break applications on CentOS7 #1725

Closed

AkihiroSuda closed this as completed Mar 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kernel Panic "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010" because of "runc:[2:INIT]" #1857

Kernel Panic "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010" because of "runc:[2:INIT]" #1857

s-r-engineer commented Aug 6, 2018 •

edited

Loading

cyphar commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

cyphar commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

s-r-engineer commented Aug 15, 2018

benamazing commented Jul 23, 2019

AkihiroSuda commented Mar 9, 2020

Kernel Panic "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010" because of "runc:[2:INIT]" #1857

Kernel Panic "BUG: unable to handle kernel NULL pointer dereference at 0000000000000010" because of "runc:[2:INIT]" #1857

Comments

s-r-engineer commented Aug 6, 2018 • edited Loading

cyphar commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

cyphar commented Aug 6, 2018

s-r-engineer commented Aug 6, 2018

s-r-engineer commented Aug 15, 2018

benamazing commented Jul 23, 2019

AkihiroSuda commented Mar 9, 2020

s-r-engineer commented Aug 6, 2018 •

edited

Loading