Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix occasional rsend test crashes #15273

Merged
merged 1 commit into from
Sep 20, 2023
Merged

Conversation

pcd1193182
Copy link
Contributor

Motivation and Context

We have occasional crashes in the rsend tests. Debugging revealed that this is because the send_worker thread is getting EINTR from splice(). This happens when a non-fatal signal is received during the syscall. We should retry the syscall, rather than exiting failure.

Description

Tweak the loop to only break if the splice is finished or we receive a non-EINTR error.

How Has This Been Tested?

Ran the rsend tests in a loop for 24 hours without issues. Usually the issue reproduced in less than that time.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@pcd1193182 pcd1193182 added the Status: Code Review Needed Ready for review and testing label Sep 15, 2023
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for running this down.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 19, 2023
@behlendorf
Copy link
Contributor

@nabijaczleweli would you mind reviewing this proposed fix.

@pcd1193182 pcd1193182 force-pushed the rsend_tests branch 2 times, most recently from f23119f to ca73f1b Compare September 19, 2023 21:21
@behlendorf behlendorf merged commit 6d9bc3e into openzfs:master Sep 20, 2023
19 checks passed
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 20, 2023
We have occasional crashes in the rsend tests. Debugging revealed 
that this is because the send_worker thread is getting EINTR from 
splice(). This happens when a non-fatal signal is received during 
the syscall. We should retry the syscall, rather than exiting failure.
Tweak the loop to only break if the splice is finished or we receive 
a non-EINTR error.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ahelenia Ziemiańska <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15273
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 21, 2023
We have occasional crashes in the rsend tests. Debugging revealed 
that this is because the send_worker thread is getting EINTR from 
splice(). This happens when a non-fatal signal is received during 
the syscall. We should retry the syscall, rather than exiting failure.
Tweak the loop to only break if the splice is finished or we receive 
a non-EINTR error.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ahelenia Ziemiańska <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15273
behlendorf pushed a commit that referenced this pull request Sep 22, 2023
We have occasional crashes in the rsend tests. Debugging revealed 
that this is because the send_worker thread is getting EINTR from 
splice(). This happens when a non-fatal signal is received during 
the syscall. We should retry the syscall, rather than exiting failure.
Tweak the loop to only break if the splice is finished or we receive 
a non-EINTR error.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ahelenia Ziemiańska <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #15273
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
We have occasional crashes in the rsend tests. Debugging revealed 
that this is because the send_worker thread is getting EINTR from 
splice(). This happens when a non-fatal signal is received during 
the syscall. We should retry the syscall, rather than exiting failure.
Tweak the loop to only break if the splice is finished or we receive 
a non-EINTR error.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ahelenia Ziemiańska <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15273
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants