I have a curious situation involving zfs_prefetch_disable=1, but prefetch counters are non-zero. The reason for the intrigue re: prefetch is that some Ubuntu 22.04 hosts are running into #14516 and #14120 (want #11980), which was causing the OOM killer to run about and kill database processes:
root@ip-172-31-84-153:/proc/spl/kstat/zfs# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
root@ip-172-31-84-153:/proc/spl/kstat/zfs# zfs version
zfs-2.1.5-1ubuntu6~22.04.4
zfs-kmod-2.2.0-0ubuntu1~23.10.2
root@ip-172-31-84-153:/proc/spl/kstat/zfs# uname -r
6.5.0-1020-aws
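As a sanity check, the tunable can be read back at runtime to confirm it is actually applied; a minimal sketch using the standard module-parameter path:

# Confirm the tunable is applied on the running module (should print 1)
cat /sys/module/zfs/parameters/zfs_prefetch_disable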
Also, we think we were hitting #14686, which was fixed in #14692 (and is why we're upgrading to 24.04):
But, on the 24.04 systems, we still see significant prefetch activity:
The workload is interesting because this feature store runs on top of CockroachDB. When one feature finishes, the workload moves to the next feature, and the DB needs to fault in ~100% net-new data. This appears to trigger the prefetch activity, which pushes the host past its ARC max, leading to the unnecessary OOMs that prompted the upgrade from 22.04 to 24.04.
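The prefetch activity and ARC pressure described above can be watched directly in arcstats; a minimal sketch, assuming the standard counter names:

# Prefetch hit/miss counters (non-zero here is what prompted this issue)
grep -E '^prefetch_(data|metadata)_(hits|misses)' /proc/spl/kstat/zfs/arcstats
# Current ARC size vs. configured maximum
grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats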
#15214 looks promising, except we have prefetch disabled but still see behavior similar to what's reported there regarding arc_anon usage (though not necessarily the pegged core).
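The arc_anon growth can be sampled from the same kstat while the workload runs; a small sketch, again assuming the standard field name:

# Sample anonymous (in-flight, not yet evictable) ARC buffers every few seconds
while true; do grep '^anon_size ' /proc/spl/kstat/zfs/arcstats; sleep 5; done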
But why is prefetch activity happening in the first place? Is it logbias=throughput or something else?
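To rule logbias in or out, the relevant properties can be read per dataset; a minimal sketch, where tank/cockroach is a hypothetical dataset name standing in for the dataset backing the DB:

# Check logbias and related properties on the dataset backing CockroachDB
zfs get logbias,sync,primarycache tank/cockroach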
zfs_prefetch_disable=1 disables only speculative prefetches, based on tracked activity. Some prefetches ZFS executes internally, based on hard-coded logic. Many of those are accounted as prescient prefetch, which means ZFS reliably knows that the data will be needed soon, but in some cases prefetches can be predictive. IIRC, marking prefetches as prescient could take some love. If you have a workload where predictive or especially prescient prefetch is unreasonable, it would be good to analyze it.
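One way to see the distinction described here is to compare the speculative prefetcher's own kstat (which zfs_prefetch_disable gates) against the ARC-level prefetch counters (which also account for ZFS's internal prefetches); a hedged sketch, assuming the standard kstat paths:

# Speculative (dmu_zfetch) prefetcher stats; these are expected to stay
# essentially flat with zfs_prefetch_disable=1
cat /proc/spl/kstat/zfs/zfetchstats
# ARC-level prefetch counters; these can still grow because internal,
# hard-coded prefetches are issued regardless of the tunable
grep -E '^prefetch_' /proc/spl/kstat/zfs/arcstats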