AMLII-1842 Fix a potential deadlock on fork #836

Merged
merged 2 commits into from
Aug 12, 2024

vickenty
Contributor

What does this PR do?

Fix a potential deadlock on fork

Description of the Change

If a fork happens while a thread is sending metrics and holds one of the locks, that lock remains locked in the child process, where no thread exists to release it. The child then deadlocks, either in the post_fork handler (when the post_fork hook tries to close the socket) or later (when user code tries to send a metric).

Work around the issue by resetting the socket and buffer locks in the child process. If those were locked in the parent at the time of the fork, the internal client state may be inconsistent, so we reset it as well.

With the config lock, we cannot reset the state to a known good state. To avoid problems when fork is called while a thread is modifying the client configuration, the config lock is held across the fork. Both the child and the parent can safely unlock it afterwards.
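
The approach described above can be sketched in Python with `os.register_at_fork`. This is a minimal illustration under stated assumptions, not the code from this PR: the class `ForkSafeClient`, its attribute names, and the use of plain `threading.Lock` objects are all hypothetical, and the real client's locks and state may look different.

```python
import os
import threading


class ForkSafeClient:
    """Hypothetical stand-in for a metrics client (not the library's class)."""

    def __init__(self):
        # Assumed names: locks guarding the socket, the buffer and the config.
        self._socket_lock = threading.Lock()
        self._buffer_lock = threading.Lock()
        self._config_lock = threading.Lock()
        self._socket = None
        self._buffer = []
        # Note: registered hooks keep this instance alive for the life of
        # the process, which is acceptable for a sketch.
        os.register_at_fork(
            before=self._before_fork,
            after_in_parent=self._after_fork_in_parent,
            after_in_child=self._after_fork_in_child,
        )

    def _before_fork(self):
        # Hold the config lock across the fork so that no other thread can
        # be in the middle of a configuration change when the fork happens.
        self._config_lock.acquire()

    def _after_fork_in_parent(self):
        # The parent simply releases the lock it took in _before_fork.
        self._config_lock.release()

    def _after_fork_in_child(self):
        # The thread that held these locks in the parent does not exist in
        # the child, so replace the locks instead of waiting on them.
        self._socket_lock = threading.Lock()
        self._buffer_lock = threading.Lock()
        # The state guarded by those locks may be inconsistent; reset it.
        self._socket = None
        self._buffer = []
        # The config lock was taken by the forking thread, which is the only
        # thread in the child, so it can be released safely here.
        self._config_lock.release()
```

The socket and buffer locks are replaced rather than released because the child cannot tell whether the state they protect was mid-update. The config lock is treated differently because there is no known good configuration to fall back to, so the sketch prevents the inconsistent state from arising in the first place.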

Alternate Designs

Possible Drawbacks

Verification Process

Additional Notes

Release Notes

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If Applicable, issue must have kind/ and severity/ labels attached at least.

@vickenty vickenty force-pushed the vickenty/fork branch 2 times, most recently from af7b9f1 to 7afa4bc Compare June 24, 2024 16:23
@vickenty vickenty added kind/bug Bug related issue changelog/Fixed Fixed features results into a bug fix version bump labels Jun 24, 2024
@vickenty vickenty marked this pull request as ready for review June 24, 2024 16:55
@vickenty vickenty requested review from a team as code owners June 24, 2024 16:55
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
Note that the issue will not be automatically closed, but this notification will remind us to investigate why there's been inactivity.

@github-actions github-actions bot added the stale Stale - Bot reminder label Jul 25, 2024
@vickenty vickenty merged commit cabf231 into master Aug 12, 2024
11 checks passed
@vickenty vickenty deleted the vickenty/fork branch August 12, 2024 12:26
Labels
changelog/Fixed Fixed features results into a bug fix version bump kind/bug Bug related issue resource/dogstatsd stale Stale - Bot reminder
3 participants