panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

mmaybee · 2023-09-15T15:19:26Z

System information

Type	Version/Name
Distribution Name	Ubuntu
Distribution Version	20.4
Kernel Version	5.15.0-67
Architecture	x86_64
OpenZFS Version	zfs-2.2.99

Describe the problem you're observing

After upgrading a system which did not have the head_errlog feature (introduced in 0409d33) and with an existing scrub IO error recorded, to a version which did have this feature, the system entered a panic loop during a scrub.

When a pool transitions to a state where the head_errlog feature is enabled, any persistent errlog state is converted to a new format. This involves deleting the old persistent errlog objects (errlog_last and errlog_scrub) and replacing them with new ones. While the internal state is updated (spa_errlog_last and spa_errlog_scrub), the persistent pointers to these objects in the MOS directory object are not updated. These are currently only updated in spa_errlog_sync() when it is called during sync phase (and there is "something to do").

Describe how to reproduce the problem

The following sequence will reproduce the issue:

Start with a pool which does not have the head_errlog feature (this was introduced in release 6.0.15.0).

$ zpool create -o feature@head_errlog=disabled test_pool sdb

A data error occurs in the pool at some point and is discovered via a scrub. This will result in the errlog_last entry in the pools MOS directory to have a non-zero value (an object Id for a zap object that records the error block). Note that on pool import the entries from the MOS directory object are used to populate fields in the spa (e.g. errlog_last is used to initialize spa_errlog_last).

$ dd if=/dev/urandom of=/test_pool/file bs=128K count=1000
$ zinject -b 36:2:0:0:1 test_pool
$ zpool scrub test_pool
$ sudo zdb -dddd test 1
Dataset mos [META], ID 0, cr_txg 4, 141K, 61 objects, rootbp DVA[0]=<0:c0016200:200> DVA[1]=<0:e0016200:200> DVA[2]=<0:100022800:200> [L0 DMU objset] fletcher4 lz4 unencrypted LE contiguous unique triple size=1000L/200P birth=89L/89P fill=61 cksum=0000000961588255:000003cffd0c0ccd:0000c9dbf729c1f0:001c559859613442

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         1    1   128K    16K  13.5K     512    32K  100.00  object directory
	dnode flags: USED_BYTES 
	dnode maxblkid: 1
	Fat ZAP stats:
		Pointer table:
			1024 elements
			zt_blk: 0
			zt_numblks: 0
			zt_shift: 10
			zt_blks_copied: 0
			zt_nextblk: 0
		ZAP entries: 19
		Leaf blocks: 1
		Total blocks: 2
		zap_block_type: 0x8000000000000001
			zap_magic: 0x2f52ab2ab
		zap_salt: 0x1df1dddb
		Leafs with 2^n pointers:
			  9:      1 *
		Blocks with n*5 entries:
			  3:      1 *
		Blocks n/10 full:
			  2:      1 *
		Entries with n chunks:
			  3:     13 *************
			  4:      3 ***
			  5:      1 *
			  6:      1 *
			  7:      0 
			  8:      0 
			  9:      1 *
		Buckets with n entries:
			  0:    494 ****************************************
			  1:     17 **
			  2:      1 *

		creation_version = 5000 
		feature_descriptions = 53 
		sync_bplist = 62 
		org.illumos:checksum_salt = 8b469083295342722ca32f7963150eacd1153da3fa0471e12d59130d73b853d2
		scan = 1 2 0 0 49 1 49 1694731576 1694731576 131278336 131278336 44544 0 1 1 3 0 0 0 -1 0 0 0 0 
		free_bpobj = 41 
		feature_enabled_txg = 63 
		deflate = 1 
		errlog_scrub = 0 
		root_dataset = 32 
		com.delphix:log_spacemap_zap = 69 
		config = 61 
		error_scrub = 0 0 0 0 0 0 0 0 0 
		history = 60 
		features_for_write = 52 
		features_for_read = 51 
		com.delphix:bookmark_v2_recalculated = 1 
		com.delphix:vdev_zap_map = 64 
		errlog_last = 89

Upgrade the pool to a later version (e.g., 7.0.0.0.0). This will cause spa_errlog_upgrade() to be called, which will transfer the entries from the old object and delete the old object (replacing the value of spa_errlog_last). Note that this function does not update the MOS directory object, so the directory will continue to reference the old (now deleted) object.

$ zpool set feature@head_errlog=enabled test_pool
$ sdb
sdb> spa
ADDR               NAME
------------------------------------------------------------
0xffff8968924e4000 rpool
0xffff89690d298000 test_pool
sdb> addr 0xffff89690d298000 | cast spa_t * | member spa_errlog_last
(uint64_t)129

Before another scrub, reboot the system. This will cause the spa to be initialized with, now stale, data from the MOS directory object. At this point the spa will now reference a deleted object from spa_errlog_last and the MOS directory object continues to reference the deleted object as well.

$ reboot
...
$ sdb
sdb> spa
ADDR               NAME
------------------------------------------------------------
0xffff9cc08c598000 rpool
0xffff9cc096e6c000 test_pool
sdb> addr 0xffff9cc096e6c000 | cast spa_t * | member spa_errlog_last
(uint64_t)89

Start a new scrub. When we attempt to rotate the new scrub data (from spa_errlog_scrub) to the spa_errlog_last we will detect a non-zero value in spa_errlog_last and attempt to free that (already freed) object. This will trigger the panic loop.

$ zpool scrub test_pool

Include any warning/errors/backtraces from the system logs

Panic stack:

[   34.473834] VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2)
[   34.475480] PANIC at spa_errlog.c:1139:delete_errlog()
[   34.476432] Kernel panic - not syncing: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2)
[   34.478437] CPU: 1 PID: 1059 Comm: txg_sync Tainted: P           OE     5.4.0-135-dx2023011005-db2b50090-generic #152
[   34.480332] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[   34.482249] Call Trace:
[   34.482712]  dump_stack+0x6d/0x8b
[   34.483327]  panic+0x101/0x2e3
[   34.483894]  spl_panic+0xcf/0x102 [spl]
[   34.487189]  delete_errlog+0x124/0x130 [zfs]
[   34.496503]  spa_errlog_sync+0x256/0x260 [zfs]
[   34.499310]  spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
[   34.500415]  spa_sync+0x2f7/0x670 [zfs]
[   34.501181]  txg_sync_thread+0x22d/0x2d0 [zfs]
[   34.502988]  thread_generic_wrapper+0x83/0xa0 [spl]
[   34.503889]  kthread+0x104/0x140
[   34.506175]  ret_from_fork+0x1f/0x40

The text was updated successfully, but these errors were encountered:

mmaybee · 2023-09-15T15:20:48Z

@gamanakis please take a look at this issue

gamanakis · 2023-09-15T18:16:03Z

Thank you for catching this, and for the useful analysis. Let me see how we can fix it.

gamanakis · 2023-09-15T20:46:13Z

@mmaybee I opened #15279. Tested it locally and it seems to work.

gamanakis · 2023-09-15T20:46:57Z

@behlendorf we should include this in zfs-2.2-release.

mmaybee · 2023-09-15T21:55:04Z

Yes, I agree, this should go into 2.2. The recovery from this can be quite cumbersome (as seen in #14643). I ended up installing a modified ZFS kernel to recover the one instance we encountered.

spa_upgrade_errlog() does not update the MOS directory when the head_errlog feature is enabled. In this case if spa_errlog_sync() is not called, the MOS dir references the old errlog_last and errlog_sync objects. Thus when doing a scrub a panic will occur: Call Trace: dump_stack+0x6d/0x8b panic+0x101/0x2e3 spl_panic+0xcf/0x102 [spl] delete_errlog+0x124/0x130 [zfs] spa_errlog_sync+0x256/0x260 [zfs] spa_sync_iterate_to_convergence+0xe5/0x250 [zfs] spa_sync+0x2f7/0x670 [zfs] txg_sync_thread+0x22d/0x2d0 [zfs] thread_generic_wrapper+0x83/0xa0 [spl] kthread+0x104/0x140 ret_from_fork+0x1f/0x40 Fix this by updating the related MOS directory objects in spa_upgrade_errlog(). Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #15279 Closes #15277

spa_upgrade_errlog() does not update the MOS directory when the head_errlog feature is enabled. In this case if spa_errlog_sync() is not called, the MOS dir references the old errlog_last and errlog_sync objects. Thus when doing a scrub a panic will occur: Call Trace: dump_stack+0x6d/0x8b panic+0x101/0x2e3 spl_panic+0xcf/0x102 [spl] delete_errlog+0x124/0x130 [zfs] spa_errlog_sync+0x256/0x260 [zfs] spa_sync_iterate_to_convergence+0xe5/0x250 [zfs] spa_sync+0x2f7/0x670 [zfs] txg_sync_thread+0x22d/0x2d0 [zfs] thread_generic_wrapper+0x83/0xa0 [spl] kthread+0x104/0x140 ret_from_fork+0x1f/0x40 Fix this by updating the related MOS directory objects in spa_upgrade_errlog(). Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#15279 Closes openzfs#15277

spa_upgrade_errlog() does not update the MOS directory when the head_errlog feature is enabled. In this case if spa_errlog_sync() is not called, the MOS dir references the old errlog_last and errlog_sync objects. Thus when doing a scrub a panic will occur: Call Trace: dump_stack+0x6d/0x8b panic+0x101/0x2e3 spl_panic+0xcf/0x102 [spl] delete_errlog+0x124/0x130 [zfs] spa_errlog_sync+0x256/0x260 [zfs] spa_sync_iterate_to_convergence+0xe5/0x250 [zfs] spa_sync+0x2f7/0x670 [zfs] txg_sync_thread+0x22d/0x2d0 [zfs] thread_generic_wrapper+0x83/0xa0 [spl] kthread+0x104/0x140 ret_from_fork+0x1f/0x40 Fix this by updating the related MOS directory objects in spa_upgrade_errlog(). Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #15279 Closes #15277

spa_upgrade_errlog() does not update the MOS directory when the head_errlog feature is enabled. In this case if spa_errlog_sync() is not called, the MOS dir references the old errlog_last and errlog_sync objects. Thus when doing a scrub a panic will occur: Call Trace: dump_stack+0x6d/0x8b panic+0x101/0x2e3 spl_panic+0xcf/0x102 [spl] delete_errlog+0x124/0x130 [zfs] spa_errlog_sync+0x256/0x260 [zfs] spa_sync_iterate_to_convergence+0xe5/0x250 [zfs] spa_sync+0x2f7/0x670 [zfs] txg_sync_thread+0x22d/0x2d0 [zfs] thread_generic_wrapper+0x83/0xa0 [spl] kthread+0x104/0x140 ret_from_fork+0x1f/0x40 Fix this by updating the related MOS directory objects in spa_upgrade_errlog(). Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#15279 Closes openzfs#15277

mmaybee added the Type: Defect Incorrect behavior (e.g. crash, hang) label Sep 15, 2023

gamanakis mentioned this issue Sep 15, 2023

Update the MOS directory on spa_upgrade_errlog() #15279

Merged

13 tasks

gamanakis mentioned this issue Sep 15, 2023

[2.2] Update the MOS directory on spa_upgrade_errlog() #15280

Merged

13 tasks

behlendorf closed this as completed in #15279 Sep 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

mmaybee commented Sep 15, 2023

mmaybee commented Sep 15, 2023

gamanakis commented Sep 15, 2023

gamanakis commented Sep 15, 2023

gamanakis commented Sep 15, 2023

mmaybee commented Sep 15, 2023

panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

Comments

mmaybee commented Sep 15, 2023

System information

Describe the problem you're observing

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

mmaybee commented Sep 15, 2023

gamanakis commented Sep 15, 2023

gamanakis commented Sep 15, 2023

gamanakis commented Sep 15, 2023

mmaybee commented Sep 15, 2023