Skip to content

Commit

Permalink
fix(replication): potential deadlock when switching master frequently (
Browse files Browse the repository at this point in the history
…apache#2516)

This closes apache#2512.

Currently, the replication thread will wait for the worker's exclusive guard stop before closing db.
But it now stops the worker from running new commands after acquiring the worker's exclusive guard,
and it might cause deadlock if switches at the same time.

The following steps will show how it may happen:

- T0: client A sent `slaveof MASTER_IP0 MASTER_PORT0`, then the replication thread was started and waiting for the exclusive guard.

- T1: client B sent `slaveof MASTER_IP1 MASTER_PORT1` and `AddMaster` will stop the previous replication thread, which is waiting for the exclusive guard. But the exclusive guard is acquired by the current thread.

The workaround is also straightforward, just stop workers from running new commands by enabling `is_loading_` to 
true before acquiring the lock in the replication thread.
  • Loading branch information
git-hulk authored and AntiTopQuark committed Sep 2, 2024
1 parent 8b41ab6 commit b2665a4
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions src/server/server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1359,14 +1359,16 @@ void Server::PrepareRestoreDB() {
// accessing, data migration task should be stopped before restoring DB
WaitNoMigrateProcessing();

// Workers will disallow to run commands which may access DB, so we should
// enable this flag to stop workers from running new commands. And wait for
// the exclusive guard to be released to guarantee no worker is running.
is_loading_ = true;

// To guarantee work threads don't access DB, we should release 'ExclusivityGuard'
// ASAP to avoid user can't receive responses for long time, because the following
// 'CloseDB' may cost much time to acquire DB mutex.
LOG(INFO) << "[server] Waiting workers for finishing executing commands...";
{
auto exclusivity = WorkExclusivityGuard();
is_loading_ = true;
}
{ auto exclusivity = WorkExclusivityGuard(); }

// Cron thread, compaction checker thread, full synchronization thread
// may always run in the background, we need to close db, so they don't actually work.
Expand Down

0 comments on commit b2665a4

Please sign in to comment.