Background
In TiDB, there is currently no proper way to choose which follower peer or learner peer to read from, which can cause read hotspots.
Previously, we chose the TiFlash peer at random, which balances load in expectation.
However, TiDB cannot tell whether a pure learner store (i.e. TiFlash) is down. So if a TiFlash node crashes during a query, TiDB may try to read from the crashed node again during backoff, which, due to some design issues, causes a long wait in MPP queries. pingcap/tidb#23589 fixed this problem in a brute-force way and introduced the hotspot issue in MPP mode.
We should find a way to solve both the load-balancing issue and the backoff issue.
Related work
PD is planning a mechanism to collect load information (e.g. read flow, QPS, etc.) for follower peers, to help schedule hotspots introduced by stale read. It requires TiDB to choose follower peers randomly.
A keep-alive mechanism is needed for backoff.