Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GODRIVER-3156 Detect and discard closed idle connections. #1815

Open
wants to merge 4 commits into
base: v1
Choose a base branch
from

Conversation

matthewdale
Copy link
Collaborator

GODRIVER-3156

Summary

  • When checking if a connection is closed (either during check out or during the background maintenance routine), try to read from the connection to check if it's still alive.
    • Only check for the liveness of the connection if it's been idle for more than 10 seconds to prevent adding latency to most check-out operations.
  • Change connection.idleDeadline to connection.idleStart, allowing us to calculate different idle deadlines for different purposes.

Background & Motivation

Currently the Go driver never attempts to detect connections that were closed by the other side when checking out connections. As a result, it's possible to return a connection that's been closed by the other side where any read or write operations will immediately fail. Instead, we should check if the connection has been closed when checking out a connection.

@mongodb-drivers-pr-bot mongodb-drivers-pr-bot bot added the priority-3-low Low Priority PR for Review label Sep 16, 2024
@@ -512,24 +513,53 @@ func (c *connection) close() error {
}

func (c *connection) closed() bool {
return atomic.LoadInt64(&c.state) == connDisconnected
if atomic.LoadInt64(&c.state) == connDisconnected {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding a comment to closed noting that the method checks both the client-side connection state as well as the server-side.

closed will return true if the connection has been closed client-side or by the server. If the connection has been closed by the server, this method will close the connection client-side to avoid returning a perished connection back to the pool.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea! This recommendation is also making me reconsider adding the logic to connection.closed, which previously would return quickly and do no system calls. It may be better to add a new method on connection or pool that is used in connectionPerished to prevent unintended side effects.

// If the connection has been idle for less than 10 seconds, skip the liveness
// check.
idleStart, ok := c.idleStart.Load().(time.Time)
if !ok || idleStart.Add(10*time.Second).After(time.Now()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the significance of 10 seconds?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reference was the PyMongo driver, which checks connection liveness when it's idle for >1 second. I wanted to mitigate the risk that checking liveness too often would cause performance problems (since a liveness check takes at least 1ms), so I increased that threshold to 10s.

Address: address.Address(addr.String()),
}, WithDialer(func(Dialer) Dialer { return d }))
err := p.ready()
noerr(t, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using the require package instead of noerr.

@blink1073 blink1073 added priority-1-high High Priority PR for Review and removed priority-3-low Low Priority PR for Review labels Sep 27, 2024
@blink1073
Copy link
Member

This looks relevant: x/mongo/driver/topology/pool_test.go:1230:4: undefined: noerr

Copy link
Contributor

API Change Report

No changes found!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority-1-high High Priority PR for Review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants