zfs: Fix a deadlock between page busy and the teardown lock
When rolling back a dataset, ZFS has to purge file data resident in the
system page cache.  To do this, it loops over all vnodes for the
mountpoint and calls vn_pages_remove() to purge pages associated with
the vnode's VM object.  Each page is thus exclusively busied while the
dataset's teardown write lock is held.

When handling a page fault on a mapped ZFS file, FreeBSD's page fault
handler busies newly allocated pages and then uses VOP_GETPAGES to fill
them.  The ZFS getpages VOP acquires the teardown read lock with vnode
pages already busied.  This represents a lock order reversal which can
lead to deadlock.

To break the deadlock, observe that zfs_rezget() need only purge those
pages marked valid, and that pages busied by the page fault handler are,
by definition, invalid.  Furthermore, ZFS pages always transition from
invalid to valid with the teardown lock held, and ZFS never creates
partially valid pages.  Thus, zfs_rezget() can use the new
vn_pages_remove_valid() to skip over pages busied by the fault handler.
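The skip-invalid-pages argument above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_page`, `model_remove_all_would_block()` and `model_remove_valid()` are hypothetical names invented for this illustration and are not the kernel's `vm_page` or the actual `vn_pages_remove()` / `vn_pages_remove_valid()` implementations. It shows only why a scan that ignores invalid pages never has to wait on a page busied by the fault handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VM page; NOT the kernel's struct vm_page. */
struct model_page {
	bool valid;	/* contents have been filled in */
	bool busy;	/* exclusively busied by another thread */
	bool resident;	/* still attached to the VM object */
};

/*
 * Model of the old behaviour: every resident page must be busied before
 * removal, so any page already busied by someone else forces a wait.
 * Returns true if the scan would have to block.
 */
static bool
model_remove_all_would_block(const struct model_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].resident && pages[i].busy)
			return (true);
	}
	return (false);
}

/*
 * Model of the new behaviour: invalid pages are skipped outright, so a
 * page busied by the fault handler (invalid by definition) is never
 * waited on.  Returns the number of pages removed, or -1 if a valid
 * page is busy and the scan would have to wait for it.
 */
static int
model_remove_valid(struct model_page *pages, size_t n)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].resident || !pages[i].valid)
			continue;	/* skip invalid pages entirely */
		if (pages[i].busy)
			return (-1);
		pages[i].resident = false;
		removed++;
	}
	return (removed);
}
```

In this model, an object holding two valid idle pages and one invalid page busied by a fault in progress would block the remove-all scan, while the valid-only scan removes the two valid pages and leaves the in-flight page untouched.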

PR:		258208
Tested by:	pho
Reviewed by:	avg, sef, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32931

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #12828
markjdb authored and behlendorf committed Dec 12, 2021
1 parent d172264 commit cdf7467
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions module/os/freebsd/zfs/zfs_znode.c
@@ -1079,9 +1079,18 @@ zfs_rezget(znode_t *zp)
 	 * the vnode in case of error, but currently we cannot do that
 	 * because of the LOR between the vnode lock and z_teardown_lock.
 	 * So, instead, we have to "doom" the znode in the illumos style.
+	 *
+	 * Ignore invalid pages during the scan.  This is to avoid deadlocks
+	 * between page busying and the teardown lock, as pages are busied prior
+	 * to a VOP_GETPAGES operation, which acquires the teardown read lock.
+	 * Such pages will be invalid and can safely be skipped here.
 	 */
 	vp = ZTOV(zp);
+#if __FreeBSD_version >= 1400042
+	vn_pages_remove_valid(vp, 0, 0);
+#else
 	vn_pages_remove(vp, 0, 0);
+#endif

 	ZFS_OBJ_HOLD_ENTER(zfsvfs, obj_num);
