zfs list sometimes hangs with SPL panic in zfs_ioc_pool_stats #3405
Comments
I should note that I have other crontab entries that make snapshots every 5 minutes, so it might be some kind of race condition between |
Possibly related to #3335. |
EFAULT from |
Got it again after 64 hours of uptime. This time it got blocked in |
I think I managed to make it reproducible. Running the following script in a Debian Jessie VM (4 virtual CPUs) with zfs-0.6.4.1 triggers the panic within seconds:
#!/bin/bash
set -e

# Create a sparse ~3 GiB file to back a throwaway pool.
dd if=/dev/zero of=/tmp/disk bs=1 count=1 seek="$((3 * 1024 * 1024 * 1024))"
zpool create racetest /tmp/disk

# Run "zfs list" in an endless background loop.
spawn_list_thread() {
    local ID="$1"
    while :
    do
        echo "LIST $ID"
        zfs list -H -o name -t filesystem >/dev/null
    done &
}

# Create a filesystem, then snapshot it in an endless background loop.
spawn_snapshot_thread() {
    local ID="$1"
    zfs create "racetest/$ID"
    while :
    do
        echo "SNAPSHOT $ID"
        zfs snapshot "racetest/${ID}@$(date '+%s')_$RANDOM"
    done &
}

# Spawn 8 zfs list threads, and 64 zfs snapshot threads
for I in $(seq 1 8)
do
    spawn_list_thread "$I"
done
for I in $(seq 1 64)
do
    spawn_snapshot_thread "$I"
done
wait
Now I can start bisecting the thing. |
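When bisecting with a reproducer like the one above, it can help to bound each run so that a build that does not panic still terminates. The following is only a sketch, not part of the original report: the helper names `spawn_loop`, `stop_loops`, and `run_bounded` are hypothetical, and it assumes that stopping the looping subshells is enough (a `zfs` process that is already stuck in the kernel will not go away this way).

```shell
#!/bin/bash
# Hypothetical helpers (not from the original report): run commands in
# endless background loops for a bounded time, then stop every loop.

PIDS=()

# spawn_loop CMD [ARGS...]: run the command repeatedly in a background subshell.
spawn_loop() {
    (
        while :
        do
            "$@" >/dev/null 2>&1 || true
        done
    ) &
    PIDS+=("$!")
}

# stop_loops: terminate every loop started by spawn_loop and reap it.
stop_loops() {
    local pid
    for pid in "${PIDS[@]}"
    do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    PIDS=()
}

# run_bounded SECONDS: let the loops run for the given time, then stop them.
run_bounded() {
    sleep "$1"
    stop_loops
}
```

For example, `spawn_loop zfs list -H -o name -t filesystem` followed by `run_bounded 60` would exercise the list path for about a minute. Because `spawn_loop` reuses the same arguments on every iteration, a snapshot loop would need a small wrapper function that generates a fresh snapshot name each time it is called.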
Okay, this is interesting. Starting from zfs-0.6.4.1, reverting the offending commit from #3335 makes the issue impossible to reproduce using the above script. By that I mean, the following fixes the issue:
However, @nedbass's fix in #3339 doesn't seem to work in my case. By that I mean, the following does NOT fix the issue:
The error and stack trace are still exactly the same with 2209580. |
I got this with
when running zfs destroy and zfs list concurrently
|
@dechamps thanks for confirming where this was accidentally introduced. Looks like we overlooked something and we'll definitely want to get this resolved in the next point release. Thanks for the reproducer. |
Closing; this was fixed in 0.6.4.2. |
A week ago, I upgraded from zfs-0.6.3 to zfs-0.6.4.1. I have a crontab entry that runs the following every minute as part of a longer script:
At first everything went just fine, but then after ~48 hours of uptime (so after ~3000 invocations), the command hung with the following in dmesg:
The zfs list process then became stuck and unkillable. I had to reboot the system to make it go away. It's worth noting that this didn't seem to affect anything else, though; in fact, I was able to run the same command just fine even while another zfs list process was stuck.
I suspect this is a regression from 0.6.3 to 0.6.4.1, because this absolutely never happened before I upgraded.
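A stuck, unkillable process like the one described above typically shows up in `ps` with a state starting with `D` (uninterruptible sleep). As a rough sketch for spotting this from a cron job or monitoring script, the hypothetical helpers below (`find_stuck` and `list_stuck` are illustrative names, not part of ZFS) filter the process table by command name and state. It assumes a `ps` that accepts the `stat` and `comm` output fields, as procps does.

```shell
#!/bin/bash
# Hypothetical helpers (not from the original report).

# find_stuck NAME: read "PID STAT COMMAND" lines on stdin and print the PIDs
# of processes called NAME whose state starts with D (uninterruptible sleep).
find_stuck() {
    awk -v name="$1" '$2 ~ /^D/ && $3 == name { print $1 }'
}

# list_stuck NAME: apply find_stuck to the live process table.
list_stuck() {
    ps -e -o pid= -o stat= -o comm= | find_stuck "$1"
}
```

Running `list_stuck zfs` prints one PID per line for every `zfs` process currently in state `D`. A process can be in that state briefly during normal I/O, so a single hit is not proof of a hang; a PID that keeps appearing across several checks is the more useful signal.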