Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mutex_enter_interruptible() for interruptible sleeping IOCTLs #15360

Merged
merged 1 commit into from
Oct 26, 2023

Conversation

bertschinger
Copy link
Contributor

Motivation and Context

This is to resolve a usability issue with running zpool status during a long-running zpool import. zpool status sleeps uninterruptibly on the mutex, so the sysadmin gets a unkillable hanging shell, which is a bad user experience. In our environment it's common for zpool import to take 5+ minutes, so we encounter this frequently.

It's not just a problem for interactive use -- we use Pacemaker with the ZFS ocf resource agents, and these run zpool status periodically to perform health checks. Pacemaker continues to run these health checks while a pool is importing, and this can result in a very high number of hung zpool status tasks stuck in D state. This is mainly an annoyance as they clutter up ps output, etc., it isn't a performance issue, but being able to kill these processes would be nice.

There are further usability improvements possible here, for example, an option could be added to zpool status that makes it return immediately if it would block due to a concurrent zpool import. Possibly zpool import could set a flag and zpool status could check the flag before calling mutex_enter(). Mutex_tryenter() wouldn't work in this situation because it couldn't distinguish two concurrent status calls, for example. This is less crucial than making zpool status interruptible in my opinion, but just suggesting it as a further idea in case others like it.

There are other IOCTLs that zpool status calls (ZFS_IOC_POOL_STATS, ZFS_IOC_POOL_GET_PROPS) that should in principle be changed to use mutex_enter_interruptible(), since a zpool status could start and complete the first IOCTL (ZFS_IOC_POOL_CONFIGS), then a zpool import could start, then zpool status would sleep uninterruptibly. However, getting this first IOCTL converted should cover the 99% common case. If the response to this patch is positive I can submit a follow-up that converts the other 2 IOCTLs.

Description

Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

How Has This Been Tested?

Tested on Linux 6.5.0-rc6 and FreeBSD 13.2-RELEASE-p3.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

include/os/linux/spl/sys/mutex.h Show resolved Hide resolved
module/zfs/spa_config.c Outdated Show resolved Hide resolved
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Oct 7, 2023
@behlendorf behlendorf self-requested a review October 7, 2023 16:17
@mjguzik
Copy link
Contributor

mjguzik commented Oct 9, 2023

Sounds like the real fix would whack dependency on this lock. ultimately no amount of pending imports should block zpool status to begin with.

Has anyone looked into this?

@bertschinger
Copy link
Contributor Author

Sounds like the real fix would whack dependency on this lock. ultimately no amount of pending imports should block zpool status to begin with.

This would also resolve my pain point, although it sounds like a more significant change since from what I can tell the spa_namespace_lock is pretty tightly integrated into the status command. Can you and/or others more familiar with the ZFS code than me comment on the complexity/feasibility of this? If that's the right way to go, I'd be willing to take a stab at it.

I think there are design questions with making zpool status return while an import is in progress. Should it print "no pools available", or something like "Import of pool XYZ in progress"? I would prefer the latter. What if one pool is already imported and another import is in progress?

Finally, even if zpool status can be changed to not depend on the lock, are there other potential users of this new interruptible function if it becomes available? For example, if I run zpool import A in one shell and zpool import B in another, I may wish to interrupt the second import if things take too long. I'm just brainstorming in case there are other cases where this could be helpful...

@tonyhutter
Copy link
Contributor

we use Pacemaker with the ZFS ocf resource agents, and these run zpool status periodically to perform health checks

Not to dissuade you from this PR, but we do have a lockless kstat for querying pool status (f0ed6c7):

$ cat /proc/spl/kstat/zfs/[pool]/state 
ONLINE

@bertschinger
Copy link
Contributor Author

we do have a lockless kstat for querying pool status

This is great and definitely helpful for us. It looks like the newer versions of the Resource Agents scripts use this feature, so this will help improve the Pacemaker HA use case. Thank you!

I do still think this PR has value since the new proc file doesn't fix the behavior of zpool status itself, even if it allows it to be worked around in some (mostly non-interactive) cases. Also, relating to the previous suggestion to make zpool status itself lockless, maybe a "brief" or "non-blocking" option could be added to zpool status (or zpool list) that uses this proc file, so that this interface is usable via the existing command line tools. That might help discoverability of this feature, since if it were in the standard command man pages we might have already learned about it.

Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

Signed-off-by: Thomas Bertschinger <[email protected]>
@behlendorf
Copy link
Contributor

Sounds like the real fix would whack dependency on this lock. ultimately no amount of pending imports should block zpool status to begin with.

There's been some investigation in to converting this to a reader/writer lock and reducing its scope. That's a better long term fix, but it's almost a much more invasive change. This is a nice complement to that work since even when it's done it's nice to be able to interrupt the ioctl.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Oct 25, 2023
@behlendorf behlendorf merged commit 97a0b5b into openzfs:master Oct 26, 2023
23 of 25 checks passed
defaziogiancarlo pushed a commit to LLNL/zfs that referenced this pull request Nov 17, 2023
Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Bertschinger <[email protected]>
Closes openzfs#15360
ixhamza pushed a commit to truenas/zfs that referenced this pull request Nov 20, 2023
Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Bertschinger <[email protected]>
Closes openzfs#15360
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Bertschinger <[email protected]>
Closes openzfs#15360
defaziogiancarlo pushed a commit to LLNL/zfs that referenced this pull request Apr 18, 2024
Many long-running ZFS ioctls lock the spa_namespace_lock, forcing
concurrent ioctls to sleep for the mutex. Previously, the only
option is to call mutex_enter() which sleeps uninterruptibly. This
is a usability issue for sysadmins, for example, if the admin runs
`zpool status` while a slow `zpool import` is ongoing, the admin's
shell will be locked in uninterruptible sleep for a long time.

This patch resolves this admin usability issue by introducing
mutex_enter_interruptible() which sleeps interruptibly while waiting
to acquire a lock. It is implemented for both Linux and FreeBSD.

The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to
use this new macro so that the command can be interrupted if it is
issued during a concurrent `zpool import` (or other long-running
operation).

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Bertschinger <[email protected]>
Closes openzfs#15360
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants