Overview of the Issue
While chaos testing at HubSpot, we discovered that if the MySQL process for a vttablet replica was killed and restarted, the vttablet would never reconnect. After some investigation, we found this was due to the way healthcheck.go verifies health. The healthcheck is powered by the heartbeat reporter, which fetches the latest reading from the heartbeat reader. The reader caches either the last value or an error. When MySQL becomes unreachable, the QueryService shuts down and stops the reader. If the last query the reader ran returned an error, the reader keeps returning that error until it gets a new reading. But the healthcheck uses this cached value to decide whether the QueryService can be restarted, and the reader won’t get a new value until it is restarted, which won’t happen until the QueryService is restarted. So the service never recovers.
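To make the circular dependency concrete, here is a minimal sketch in Go. It is a simplified model of the behavior described above, not the actual Vitess code: the `reader` and `tablet` types, the `poll`, `status`, and `tick` methods, and the `simulate` helper are all hypothetical names introduced for illustration, and the real shutdown and restart paths are assumed to be more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// reader is a hypothetical stand-in for the heartbeat reader: it caches the
// outcome of its most recent read and only refreshes while it is running.
type reader struct {
	running bool
	lastErr error
}

// poll simulates one read attempt. A stopped reader does nothing, so
// whatever it cached last (value or error) is never replaced.
func (r *reader) poll(mysqlUp bool) {
	if !r.running {
		return
	}
	if mysqlUp {
		r.lastErr = nil
	} else {
		r.lastErr = errors.New("mysql unreachable")
	}
}

// status returns the cached result, which is what the health check consumes.
func (r *reader) status() error { return r.lastErr }

// tablet is a hypothetical model tying the query service to the reader.
type tablet struct {
	serving bool
	hb      reader
}

// tick models one health check cycle.
func (t *tablet) tick(mysqlUp bool) {
	t.hb.poll(mysqlUp)
	if t.serving && !mysqlUp {
		// MySQL is unreachable: the query service shuts down and stops the reader.
		t.serving = false
		t.hb.running = false
		return
	}
	if !t.serving && t.hb.status() == nil {
		// The service is only restarted when the cached heartbeat looks healthy.
		t.serving = true
		t.hb.running = true
	}
}

// simulate kills MySQL once, brings it back, and reports whether the
// tablet is serving again after several further health check cycles.
func simulate() bool {
	t := &tablet{serving: true, hb: reader{running: true}}
	t.tick(true)  // healthy steady state
	t.tick(false) // MySQL killed: error cached, service and reader stopped
	for i := 0; i < 5; i++ {
		t.tick(true) // MySQL is back, but the stopped reader never polls again
	}
	return t.serving
}

func main() {
	fmt.Println("serving after MySQL restart:", simulate()) // prints "serving after MySQL restart: false"
}
```

In this model the restart condition reads only the cached error, and the cache can only be refreshed by a running reader, so no number of additional cycles brings the tablet back once MySQL has returned.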
Reproduction Steps
Deploy a vttablet replica with heartbeat enabled, then kill the MySQL process. After MySQL restarts, the vttablet never reconnects.