fix(replication): potential deadlock when switching master frequently #2516

git-hulk · 2024-09-01T14:12:48Z

This closes #2512.

Currently, the replication thread waits for the workers' exclusive guard before closing the db. However, it only stops the workers from running new commands after it has acquired that guard, which can cause a deadlock if the master is switched at the same time.

The following steps show how it can happen:

- T0: client A sends `slaveof MASTER_IP0 MASTER_PORT0`; the replication thread is started and waits for the exclusive guard.
- T1: client B sends `slaveof MASTER_IP1 MASTER_PORT1`, and `AddMaster` tries to stop the previous replication thread, which is still waiting for the exclusive guard. But the exclusive guard is held by the current thread (the one handling client B's command), so neither side can make progress.

The workaround is straightforward: stop the workers from running new commands by setting `is_loading_` to true before acquiring the lock in the replication thread.


sonarcloud · 2024-09-02T06:04:11Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
61.1% Coverage on New Code
1.2% Duplication on New Code

See analysis details on SonarCloud

git-hulk changed the title ~~Fix potential deadlock when switching master frequently~~ fix(replication): potential deadlock when switching master frequently Sep 1, 2024

git-hulk force-pushed the fix/potential-deadlock-in-replication branch from 6a24294 to f0724b5 Compare September 1, 2024 14:15

git-hulk requested review from caipengbo, PragmaTwice and mapleFU September 2, 2024 03:51

Merge branch 'unstable' into fix/potential-deadlock-in-replication

d92560b

caipengbo approved these changes Sep 2, 2024


git-hulk merged commit ab41cbb into apache:unstable Sep 2, 2024
32 checks passed
