
Refine the load balance strategy of choosing TiFlash replica to read #1807

Open
leiysky opened this issue Apr 22, 2021 · 1 comment · Fixed by pingcap/tidb#26130
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@leiysky
Contributor

leiysky commented Apr 22, 2021

Background

In TiDB, there isn't a proper way to choose a follower peer or learner peer to read from, which may cause read-request hotspots.

Previously, we used a random approach to choose the TiFlash peer, which achieves load balance in terms of probability.
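
For illustration, here is a minimal sketch of that random strategy (not TiDB's actual code; `Store` and `pickTiFlashPeer` are hypothetical names). Picking uniformly at random spreads read load evenly across replicas in expectation:

```go
package replica

import (
	"errors"
	"math/rand"
)

// Store is a placeholder for a TiFlash store/peer that can serve a read.
type Store struct {
	ID   uint64
	Addr string
}

// pickTiFlashPeer chooses one replica uniformly at random, so every replica
// receives roughly the same share of read requests in expectation.
func pickTiFlashPeer(candidates []Store) (Store, error) {
	if len(candidates) == 0 {
		return Store{}, errors.New("no TiFlash replica available")
	}
	return candidates[rand.Intn(len(candidates))], nil
}
```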

However, TiDB has no way to tell whether a pure learner store (i.e. TiFlash) is down. Therefore, if a TiFlash node crashes during a query, TiDB may keep trying to read from the crashed node while backing off, which causes a long wait in MPP queries due to some design issues. pingcap/tidb#23589 fixed this problem in a brute-force way and introduced the hotspot issue in MPP mode.

We should find a way to solve both the load balance issue and the backoff issue.

Related work

PD is planning to design a mechanism that collects load information (e.g. read flow, QPS) of follower peers to help schedule hotspots introduced by stale reads. It requires TiDB to choose follower peers randomly.

A keep-alive mechanism is needed to support backoff, so that TiDB can tell whether a TiFlash store is still reachable.
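
As a rough sketch of that idea (the names `Prober` and `PickLive`, and the TCP-dial probe, are assumptions rather than TiDB or TiFlash APIs), a keep-alive loop could record which stores answered a recent probe, so that random selection and backoff only consider stores believed to be alive:

```go
package replica

import (
	"math/rand"
	"net"
	"sync"
	"time"
)

// Prober tracks which store addresses answered the most recent liveness probe.
type Prober struct {
	mu    sync.Mutex
	alive map[string]bool
}

// NewProber starts a background loop that probes every address at the given
// interval. A plain TCP dial stands in for a real health-check RPC.
func NewProber(addrs []string, interval time.Duration) *Prober {
	p := &Prober{alive: make(map[string]bool)}
	go func() {
		for {
			for _, addr := range addrs {
				conn, err := net.DialTimeout("tcp", addr, time.Second)
				p.mu.Lock()
				p.alive[addr] = err == nil
				p.mu.Unlock()
				if conn != nil {
					conn.Close()
				}
			}
			time.Sleep(interval)
		}
	}()
	return p
}

// PickLive picks a random address among stores that passed the last probe,
// falling back to any address if none is currently marked alive.
func (p *Prober) PickLive(addrs []string) string {
	if len(addrs) == 0 {
		return ""
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	var live []string
	for _, a := range addrs {
		if p.alive[a] {
			live = append(live, a)
		}
	}
	if len(live) == 0 {
		return addrs[rand.Intn(len(addrs))]
	}
	return live[rand.Intn(len(live))]
}
```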
