Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce system for garbage collecting connection managers #851

Merged
merged 1 commit into from
Sep 20, 2019

Conversation

brandur
Copy link
Contributor

@brandur brandur commented Sep 19, 2019

Introduces a system for garbage collecting connection managers in an
attempt to solve #850.

Previously, the number of connection managers (and by extension the
number of connections that they were holding) would stay stable if a
program used a stable number of threads. However, if threads were used
disposably, the number of active connection managers out there could
continue to grow unchecked, and each of those could be holding one or
more dead connections which are no longer open, but still holding a file
descriptor waiting to be unlinked in disposed of by Ruby's GC.

This PR introduces a connection manager garbage collector that runs
periodically whenever a new connection manager is created. Connection
managers get a timestamp to indicate when they were last used, and the
GC runs through each one and prunes any that haven't seen use within a
certain threshold (currently, 300 seconds). This should have the effect
of removing connection managers as they're not needed anymore, and thus
resolving the socket leakage seen in #850.

I had to make a couple implementation tweaks to get this working
correctly. Namely:

  • The StripeClient class now tracks thread contexts instead of
    connection managers. This is so that when we're disposing of a
    connection manager, we can set default_connection_manager on its
    parent thread context to nil so that it's not still tracking a
    connection manager that we're trying to get rid of.

  • StripeClient instances can still be instantiated as before, but no
    longer internalize a reference to their own connection manager,
    instead falling back to the one in the current thread context. The
    rationale is that when trying to dispose of a connection manager, we'd
    also have to dispose of its reference in any outstanding
    StripeClient instances that might still be tracking it, and that
    starts to get a little unwieldy. I've left #connection_manager in
    place for backwards compatibility, but marked it as deprecated.

r? @ob-stripe
cc @stripe/api-libraries

Copy link
Contributor

@ob-stripe ob-stripe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As usual, thanks for leaving good comments 👍

Introduces a system for garbage collecting connection managers in an
attempt to solve #850.

Previously, the number of connection managers (and by extension the
number of connections that they were holding) would stay stable if a
program used a stable number of threads. However, if threads were used
disposably, the number of active connection managers out there could
continue to grow unchecked, and each of those could be holding one or
more dead connections which are no longer open, but still holding a file
descriptor waiting to be unlinked in disposed of by Ruby's GC.

This PR introduces a connection manager garbage collector that runs
periodically whenever a new connection manager is created. Connection
managers get a timestamp to indicate when they were last used, and the
GC runs through each one and prunes any that haven't seen use within a
certain threshold (currently, 120 seconds). This should have the effect
of removing connection managers as they're not needed anymore, and thus
resolving the socket leakage seen in #850.

I had to make a couple implementation tweaks to get this working
correctly. Namely:

* The `StripeClient` class now tracks thread contexts instead of
  connection managers. This is so that when we're disposing of a
  connection manager, we can set `default_connection_manager` on its
  parent thread context to `nil` so that it's not still tracking a
  connection manager that we're trying to get rid of.

* `StripeClient` instances can still be instantiated as before, but no
  longer internalize a reference to their own connection manager,
  instead falling back to the one in the current thread context. The
  rationale is that when trying to dispose of a connection manager, we'd
  also have to dispose of its reference in any outstanding
  `StripeClient` instances that might still be tracking it, and that
  starts to get a little unwieldy. I've left `#connection_manager` in
  place for backwards compatibility, but marked it as deprecated.
@brandur-stripe
Copy link
Contributor

Thank you OB!

I made one more minor tweak in reducing the expiry age of a connection manager from 300 to 120 seconds. I think it makes sense to try and prune them a little more aggressively in case a lot of threads are being created and churned through, and it's not a big deal if a thread that's not used very often has to occasionally recreate its connection manager.

@brandur-stripe brandur-stripe merged commit cbf4403 into master Sep 20, 2019
@brandur-stripe brandur-stripe deleted the brandur-connection-manager-gc branch September 20, 2019 06:43
@brandur-stripe
Copy link
Contributor

Released as 5.2.0.

@jodosha
Copy link

jodosha commented Sep 23, 2019

Hello folks, before to upgrade to 5.2.0, I had a look at the changes introduced with this new version.
There is something that caught my attention: the way you try to calculate time diff.

Time.now (wall time) isn't stable, and it can be unpredictably change (even backwards in time) during an operation is performed. You should use a monotonic clock instead.

Here's a detailed explanation of the problem: https://blog.dnsimple.com/2018/03/elapsed-time-with-ruby-the-right-way/

I hope that my feedback was helpful. Thanks for Stripe and this gem. 💚

@ob-stripe
Copy link
Contributor

Hey @jodosha, thanks! This is definitely useful feedback. I'll let @brandur-stripe have the final say, but it sounds to me like we should be using a monotonic clock as you pointed out.

@brandur-stripe
Copy link
Contributor

Thanks @jodosha! I took a crack at moving us to monotonic time in #857.

@jodosha
Copy link

jodosha commented Oct 3, 2019

Thank you all for considering my feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants