Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic loop: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2) #15277

Closed
mmaybee opened this issue Sep 15, 2023 · 5 comments · Fixed by #15279
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@mmaybee
Copy link
Contributor

mmaybee commented Sep 15, 2023

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 20.4
Kernel Version 5.15.0-67
Architecture x86_64
OpenZFS Version zfs-2.2.99

Describe the problem you're observing

After upgrading a system which did not have the head_errlog feature (introduced in 0409d33) and with an existing scrub IO error recorded, to a version which did have this feature, the system entered a panic loop during a scrub.

When a pool transitions to a state where the head_errlog feature is enabled, any persistent errlog state is converted to a new format. This involves deleting the old persistent errlog objects (errlog_last and errlog_scrub) and replacing them with new ones. While the internal state is updated (spa_errlog_last and spa_errlog_scrub), the persistent pointers to these objects in the MOS directory object are not updated. These are currently only updated in spa_errlog_sync() when it is called during sync phase (and there is "something to do").

Describe how to reproduce the problem

The following sequence will reproduce the issue:

Start with a pool which does not have the head_errlog feature (this was introduced in release 6.0.15.0).

$ zpool create -o feature@head_errlog=disabled test_pool sdb

A data error occurs in the pool at some point and is discovered via a scrub. This will result in the errlog_last entry in the pools MOS directory to have a non-zero value (an object Id for a zap object that records the error block). Note that on pool import the entries from the MOS directory object are used to populate fields in the spa (e.g. errlog_last is used to initialize spa_errlog_last).

$ dd if=/dev/urandom of=/test_pool/file bs=128K count=1000
$ zinject -b 36:2:0:0:1 test_pool
$ zpool scrub test_pool
$ sudo zdb -dddd test 1
Dataset mos [META], ID 0, cr_txg 4, 141K, 61 objects, rootbp DVA[0]=<0:c0016200:200> DVA[1]=<0:e0016200:200> DVA[2]=<0:100022800:200> [L0 DMU objset] fletcher4 lz4 unencrypted LE contiguous unique triple size=1000L/200P birth=89L/89P fill=61 cksum=0000000961588255:000003cffd0c0ccd:0000c9dbf729c1f0:001c559859613442

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         1    1   128K    16K  13.5K     512    32K  100.00  object directory
	dnode flags: USED_BYTES 
	dnode maxblkid: 1
	Fat ZAP stats:
		Pointer table:
			1024 elements
			zt_blk: 0
			zt_numblks: 0
			zt_shift: 10
			zt_blks_copied: 0
			zt_nextblk: 0
		ZAP entries: 19
		Leaf blocks: 1
		Total blocks: 2
		zap_block_type: 0x8000000000000001
			zap_magic: 0x2f52ab2ab
		zap_salt: 0x1df1dddb
		Leafs with 2^n pointers:
			  9:      1 *
		Blocks with n*5 entries:
			  3:      1 *
		Blocks n/10 full:
			  2:      1 *
		Entries with n chunks:
			  3:     13 *************
			  4:      3 ***
			  5:      1 *
			  6:      1 *
			  7:      0 
			  8:      0 
			  9:      1 *
		Buckets with n entries:
			  0:    494 ****************************************
			  1:     17 **
			  2:      1 *

		creation_version = 5000 
		feature_descriptions = 53 
		sync_bplist = 62 
		org.illumos:checksum_salt = 8b469083295342722ca32f7963150eacd1153da3fa0471e12d59130d73b853d2
		scan = 1 2 0 0 49 1 49 1694731576 1694731576 131278336 131278336 44544 0 1 1 3 0 0 0 -1 0 0 0 0 
		free_bpobj = 41 
		feature_enabled_txg = 63 
		deflate = 1 
		errlog_scrub = 0 
		root_dataset = 32 
		com.delphix:log_spacemap_zap = 69 
		config = 61 
		error_scrub = 0 0 0 0 0 0 0 0 0 
		history = 60 
		features_for_write = 52 
		features_for_read = 51 
		com.delphix:bookmark_v2_recalculated = 1 
		com.delphix:vdev_zap_map = 64 
		errlog_last = 89 

Upgrade the pool to a later version (e.g., 7.0.0.0.0). This will cause spa_errlog_upgrade() to be called, which will transfer the entries from the old object and delete the old object (replacing the value of spa_errlog_last). Note that this function does not update the MOS directory object, so the directory will continue to reference the old (now deleted) object.

$ zpool set feature@head_errlog=enabled test_pool
$ sdb
sdb> spa
ADDR               NAME
------------------------------------------------------------
0xffff8968924e4000 rpool
0xffff89690d298000 test_pool
sdb> addr 0xffff89690d298000 | cast spa_t * | member spa_errlog_last
(uint64_t)129

Before another scrub, reboot the system. This will cause the spa to be initialized with, now stale, data from the MOS directory object. At this point the spa will now reference a deleted object from spa_errlog_last and the MOS directory object continues to reference the deleted object as well.

$ reboot
...
$ sdb
sdb> spa
ADDR               NAME
------------------------------------------------------------
0xffff9cc08c598000 rpool
0xffff9cc096e6c000 test_pool
sdb> addr 0xffff9cc096e6c000 | cast spa_t * | member spa_errlog_last
(uint64_t)89

Start a new scrub. When we attempt to rotate the new scrub data (from spa_errlog_scrub) to the spa_errlog_last we will detect a non-zero value in spa_errlog_last and attempt to free that (already freed) object. This will trigger the panic loop.

$ zpool scrub test_pool

Include any warning/errors/backtraces from the system logs

Panic stack:

[   34.473834] VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2)
[   34.475480] PANIC at spa_errlog.c:1139:delete_errlog()
[   34.476432] Kernel panic - not syncing: VERIFY0(0 == dmu_object_free(spa->spa_meta_objset, spa_err_obj, tx)) failed (0 == 2)
[   34.478437] CPU: 1 PID: 1059 Comm: txg_sync Tainted: P           OE     5.4.0-135-dx2023011005-db2b50090-generic #152
[   34.480332] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[   34.482249] Call Trace:
[   34.482712]  dump_stack+0x6d/0x8b
[   34.483327]  panic+0x101/0x2e3
[   34.483894]  spl_panic+0xcf/0x102 [spl]
[   34.487189]  delete_errlog+0x124/0x130 [zfs]
[   34.496503]  spa_errlog_sync+0x256/0x260 [zfs]
[   34.499310]  spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
[   34.500415]  spa_sync+0x2f7/0x670 [zfs]
[   34.501181]  txg_sync_thread+0x22d/0x2d0 [zfs]
[   34.502988]  thread_generic_wrapper+0x83/0xa0 [spl]
[   34.503889]  kthread+0x104/0x140
[   34.506175]  ret_from_fork+0x1f/0x40
@mmaybee mmaybee added the Type: Defect Incorrect behavior (e.g. crash, hang) label Sep 15, 2023
@mmaybee
Copy link
Contributor Author

mmaybee commented Sep 15, 2023

@gamanakis please take a look at this issue

@gamanakis
Copy link
Contributor

Thank you for catching this, and for the useful analysis. Let me see how we can fix it.

@gamanakis
Copy link
Contributor

@mmaybee I opened #15279. Tested it locally and it seems to work.

@gamanakis
Copy link
Contributor

@behlendorf we should include this in zfs-2.2-release.

@mmaybee
Copy link
Contributor Author

mmaybee commented Sep 15, 2023

Yes, I agree, this should go into 2.2. The recovery from this can be quite cumbersome (as seen in #14643). I ended up installing a modified ZFS kernel to recover the one instance we encountered.

behlendorf pushed a commit that referenced this issue Sep 19, 2023
spa_upgrade_errlog() does not update the MOS directory when the
head_errlog feature is enabled. In this case if spa_errlog_sync() is not
called, the MOS dir references the old errlog_last and errlog_sync
objects. Thus when doing a scrub a panic will occur:

Call Trace:
 dump_stack+0x6d/0x8b
 panic+0x101/0x2e3
 spl_panic+0xcf/0x102 [spl]
 delete_errlog+0x124/0x130 [zfs]
 spa_errlog_sync+0x256/0x260 [zfs]
 spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
 spa_sync+0x2f7/0x670 [zfs]
 txg_sync_thread+0x22d/0x2d0 [zfs]
 thread_generic_wrapper+0x83/0xa0 [spl]
 kthread+0x104/0x140
 ret_from_fork+0x1f/0x40

Fix this by updating the related MOS directory objects in
spa_upgrade_errlog().

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #15279 
Closes #15277
gamanakis added a commit to gamanakis/zfs that referenced this issue Sep 19, 2023
spa_upgrade_errlog() does not update the MOS directory when the
head_errlog feature is enabled. In this case if spa_errlog_sync() is not
called, the MOS dir references the old errlog_last and errlog_sync
objects. Thus when doing a scrub a panic will occur:

Call Trace:
 dump_stack+0x6d/0x8b
 panic+0x101/0x2e3
 spl_panic+0xcf/0x102 [spl]
 delete_errlog+0x124/0x130 [zfs]
 spa_errlog_sync+0x256/0x260 [zfs]
 spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
 spa_sync+0x2f7/0x670 [zfs]
 txg_sync_thread+0x22d/0x2d0 [zfs]
 thread_generic_wrapper+0x83/0xa0 [spl]
 kthread+0x104/0x140
 ret_from_fork+0x1f/0x40

Fix this by updating the related MOS directory objects in
spa_upgrade_errlog().

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#15279 
Closes openzfs#15277
behlendorf pushed a commit that referenced this issue Sep 19, 2023
spa_upgrade_errlog() does not update the MOS directory when the
head_errlog feature is enabled. In this case if spa_errlog_sync() is not
called, the MOS dir references the old errlog_last and errlog_sync
objects. Thus when doing a scrub a panic will occur:

Call Trace:
 dump_stack+0x6d/0x8b
 panic+0x101/0x2e3
 spl_panic+0xcf/0x102 [spl]
 delete_errlog+0x124/0x130 [zfs]
 spa_errlog_sync+0x256/0x260 [zfs]
 spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
 spa_sync+0x2f7/0x670 [zfs]
 txg_sync_thread+0x22d/0x2d0 [zfs]
 thread_generic_wrapper+0x83/0xa0 [spl]
 kthread+0x104/0x140
 ret_from_fork+0x1f/0x40

Fix this by updating the related MOS directory objects in
spa_upgrade_errlog().

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #15279 
Closes #15277
lundman pushed a commit to openzfsonwindows/openzfs that referenced this issue Dec 12, 2023
spa_upgrade_errlog() does not update the MOS directory when the
head_errlog feature is enabled. In this case if spa_errlog_sync() is not
called, the MOS dir references the old errlog_last and errlog_sync
objects. Thus when doing a scrub a panic will occur:

Call Trace:
 dump_stack+0x6d/0x8b
 panic+0x101/0x2e3
 spl_panic+0xcf/0x102 [spl]
 delete_errlog+0x124/0x130 [zfs]
 spa_errlog_sync+0x256/0x260 [zfs]
 spa_sync_iterate_to_convergence+0xe5/0x250 [zfs]
 spa_sync+0x2f7/0x670 [zfs]
 txg_sync_thread+0x22d/0x2d0 [zfs]
 thread_generic_wrapper+0x83/0xa0 [spl]
 kthread+0x104/0x140
 ret_from_fork+0x1f/0x40

Fix this by updating the related MOS directory objects in
spa_upgrade_errlog().

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#15279 
Closes openzfs#15277
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants