Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System-wide speculative prefetch limit. #14516

Merged
merged 1 commit into from
Mar 1, 2023
Merged

Conversation

amotin
Copy link
Member

@amotin amotin commented Feb 21, 2023

With some pathological access patterns it is possible to make ZFS accumulate almost unlimited amount of speculative prefetch ZIOs. Combined with linear ABD allocations in RAIDZ code, it appears to be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch requests and blocking prefetch distance doubling per stream hits if the number of active requests is higher that ~6% of ARC size.

How Has This Been Tested?

On FreeBSD system with 8GB RAM and 5-disk RAIDZ2 pool massive multi-threaded strided read over many files was found to cause memory exhaustion. With the patch number of prefetch requests never grows above the set limit.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Feb 22, 2023
module/zfs/dmu_zfetch.c Show resolved Hide resolved
module/zfs/dmu_zfetch.c Show resolved Hide resolved
With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 1, 2023
@behlendorf behlendorf merged commit 5f42d1d into openzfs:master Mar 1, 2023
amotin added a commit to amotin/zfs that referenced this pull request Mar 2, 2023
With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#14516
amotin added a commit to amotin/zfs that referenced this pull request Mar 2, 2023
With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#14516
behlendorf pushed a commit that referenced this pull request Mar 2, 2023
With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes #14516
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#14516
@amotin amotin deleted the pref_mem branch October 9, 2023 13:23
behlendorf added a commit that referenced this pull request Oct 13, 2023
New Features
- Block cloning (#13392)
- Linux container support (#14070, #14097, #12263)
- Scrub error log (#12812, #12355)
- BLAKE3 checksums (#12918)
- Corrective "zfs receive"
- Vdev and zpool user properties

Performance
- Fully adaptive ARC (#14359)
- SHA2 checksums (#13741)
- Edon-R checksums (#13618)
- Zstd early abort (#13244)
- Prefetch improvements (#14603, #14516, #14402, #14243, #13452)
- General optimization (#14121, #14123, #14039, #13680, #13613,
  #13606, #13576, #13553, #12789, #14925, #14948)

Signed-off-by: Brian Behlendorf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants