Bug Report: throttler: gRPC logs indicate attempts to contact a decommissioned IP address #14164

Closed
shlomi-noach opened this issue Oct 3, 2023 · 0 comments · Fixed by #14165

Overview of the Issue

Since the introduction of gRPC-based throttler checks, in a k8s environment and given a specific rollout flow, we've seen errors (logged as warnings) coming from the grpc-go package, like so:

component.go:41] [core] [Channel #652 SubChannel #653] grpc: addrConn.createTransport failed to connect to {Addr: "10.10.10.20:15999", ServerName: "10.10.10.20:15999", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.10.10.20:15999: i/o timeout"  

The errors are coming from the primary tablet. The IP address is that of a decommissioned replica. grpc-go seems to still "remember" that IP and keeps polling it routinely.

Observations:

  • Disabling & enabling the throttler resolves the issue.
  • If the throttler starts disabled and is only enabled after the old tablet was decommissioned, the errors do not show.
  • The error trace does not originate directly from throttler code, but rather from grpc-go's routine keepalive or cleanup code.
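
This report does not state the root cause. One mechanism that would be consistent with the observations above is a per-tablet client connection that is cached and never closed once the tablet leaves the topology. The Go sketch below illustrates that assumption only; it is not the throttler's code and not the change made in #14165. `connCache`, `put`, `prune`, `fakeConn` and the address `10.10.10.21:15999` are hypothetical names introduced for illustration. Values are `io.Closer` so the example runs without a gRPC dependency.

```go
package main

import (
	"fmt"
	"io"
	"sort"
)

// connCache is a hypothetical per-address cache of client connections.
// Values are io.Closer so the sketch runs without a gRPC dependency; in real
// code they could be *grpc.ClientConn, which also has a Close() error method.
type connCache struct {
	conns map[string]io.Closer
}

func newConnCache() *connCache {
	return &connCache{conns: make(map[string]io.Closer)}
}

// put registers a connection for addr, closing any connection it replaces.
func (c *connCache) put(addr string, conn io.Closer) {
	if old, ok := c.conns[addr]; ok {
		old.Close()
	}
	c.conns[addr] = conn
}

// prune closes and forgets every connection whose address is not in live,
// and returns the removed addresses in sorted order.
func (c *connCache) prune(live map[string]bool) []string {
	var removed []string
	for addr, conn := range c.conns {
		if live[addr] {
			continue
		}
		conn.Close() // assumption: closing is what stops further dial attempts
		delete(c.conns, addr)
		removed = append(removed, addr)
	}
	sort.Strings(removed)
	return removed
}

// fakeConn stands in for a real connection and records whether it was closed.
type fakeConn struct{ closed bool }

func (f *fakeConn) Close() error { f.closed = true; return nil }

func main() {
	cache := newConnCache()
	cache.put("10.10.10.20:15999", &fakeConn{}) // replica that is later decommissioned
	cache.put("10.10.10.21:15999", &fakeConn{}) // hypothetical replica that stays

	// Only the second replica is still known; the first entry is closed and dropped.
	removed := cache.prune(map[string]bool{"10.10.10.21:15999": true})
	fmt.Println(removed) // prints [10.10.10.20:15999]
}
```

The sketch is not safe for concurrent use; a real cache shared between goroutines would need a mutex around `conns`. Under this assumption, disabling and re-enabling the throttler would have the same effect as `prune`, because the cached connections would be discarded and rebuilt from the current set of tablets.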

The issue is already identified, PR to follow.

cc @vmg @deepthi

Reproduction Steps

Binary Version

v18

Operating System and Environment details

-

Log Fragments

No response
