Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BRT: Change brt_pending_tree sorting order #15954

Merged
merged 1 commit into from
Mar 21, 2024

Conversation

amotin
Copy link
Member

@amotin amotin commented Mar 2, 2024

It does not look important how exactly brt_pending_tree is sorted. When cloning large file, it is quite likely that all of its blocks have identical physical birth times, so comparing them first does not provide useful entropy, while accesses additional cache line. In most cases combination of vdev and offset provides unique result and physical birth time comparison is not even needed. Meanwhile, when traversing the tree inside brt_pending_apply(), it can be beneficial for dbuf cache and CPU cache hits to group processing by vdev and so by the per-VDEV BRT ZAPs.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

It does not look important how exactly brt_pending_tree is sorted.
When cloning large file, it is quite likely that all of its blocks
have identical physical birth times, so comparing them first does
not provide useful entropy, while accesses additional cache line.
In most cases combination of vdev and offset provides unique result
and physical birth time comparison is not even needed.  Meanwhile,
when traversing the tree inside brt_pending_apply(), it can be
beneficial for dbuf cache and CPU cache hits to group processing
by vdev and so by the per-VDEV BRT ZAPs.

Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Mar 2, 2024
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 21, 2024
@behlendorf behlendorf merged commit 2c01cae into openzfs:master Mar 21, 2024
20 of 26 checks passed
@pjd
Copy link
Contributor

pjd commented Mar 22, 2024

So why not to compare offsets first? That contains the most entropy and would be optimal for single-disk pools.

@amotin amotin deleted the brt_pending_compare branch March 22, 2024 16:12
@amotin
Copy link
Member Author

amotin commented Mar 22, 2024

So why not to compare offsets first? That contains the most entropy and would be optimal for single-disk pools.

@pjd vdev and offset in block pointer reside in the same cache line, so while there may indeed be an extra extra comparison, it does not cost that much, unlike physical birth, that is 72 bytes away. But comparing vdev first makes tree sorted on it first, that allows brt_pending_apply() to modify ZAPs for one VDEV at a time, improving caching efficiency and so performance there.

amotin added a commit to amotin/zfs that referenced this pull request Apr 17, 2024
It does not look important how exactly brt_pending_tree is sorted.
When cloning large file, it is quite likely that all of its blocks
have identical physical birth times, so comparing them first does
not provide useful entropy, while accesses additional cache line.
In most cases combination of vdev and offset provides unique result
and physical birth time comparison is not even needed.  Meanwhile,
when traversing the tree inside brt_pending_apply(), it can be
beneficial for dbuf cache and CPU cache hits to group processing
by vdev and so by the per-VDEV BRT ZAPs.

Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#15954
behlendorf pushed a commit that referenced this pull request Apr 19, 2024
It does not look important how exactly brt_pending_tree is sorted.
When cloning large file, it is quite likely that all of its blocks
have identical physical birth times, so comparing them first does
not provide useful entropy, while accesses additional cache line.
In most cases combination of vdev and offset provides unique result
and physical birth time comparison is not even needed.  Meanwhile,
when traversing the tree inside brt_pending_apply(), it can be
beneficial for dbuf cache and CPU cache hits to group processing
by vdev and so by the per-VDEV BRT ZAPs.

Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes #15954
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
It does not look important how exactly brt_pending_tree is sorted.
When cloning large file, it is quite likely that all of its blocks
have identical physical birth times, so comparing them first does
not provide useful entropy, while accesses additional cache line.
In most cases combination of vdev and offset provides unique result
and physical birth time comparison is not even needed.  Meanwhile,
when traversing the tree inside brt_pending_apply(), it can be
beneficial for dbuf cache and CPU cache hits to group processing
by vdev and so by the per-VDEV BRT ZAPs.

Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#15954
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants