Slave executing the flushdb command extracted from the binlog may cause master-slave inconsistency #2708

cheniujh · 2024-06-07T12:13:26Z

Is this a regression?

Yes

Description

After applying the binlog, a Pika slave distributes the corresponding WriteDB tasks to multiple worker threads, choosing the thread by hashing the key.
flushdb is written to the binlog but has no key, so it is always dispatched to the same fixed thread for execution (ApplyDB).
Consider this case. The master's write order, which is also the binlog order, is:

set key1 a1
set key1 a2
flushdb
set key1 a3

The master's final state is that key1 exists with the value a3. On the slave, however, the work may be split as follows:

thread1 executes: set key1 a1; set key1 a2; set key1 a3;
thread2 executes: flushdb

If, because of thread scheduling, thread1 finishes all three set operations before thread2 runs flushdb, then flushdb executes last and the slave database ends up empty. The expected result is that the slave contains key1 with the value a3, so the slave and the master are now inconsistent.
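The scenario above can be reproduced deterministically with a small simulation. This is only a sketch and not Pika code: the byte-sum hash, the two-worker setup, the fixed `FLUSHDB_WORKER` index, and all function names are assumptions made for illustration. The `replay_with_barrier` function shows one possible mitigation (treating flushdb as a barrier that waits for all earlier tasks); it is not necessarily the approach taken in the actual fix.

```python
# Binlog order as written by the master.
BINLOG = [
    ("set", "key1", "a1"),
    ("set", "key1", "a2"),
    ("flushdb",),
    ("set", "key1", "a3"),
]

NUM_WORKERS = 2
FLUSHDB_WORKER = 1  # assumption: keyless commands always land on one fixed worker


def worker_for(cmd):
    """Pick a worker by key hash (byte sum as a stand-in hash)."""
    if cmd[0] == "flushdb":
        return FLUSHDB_WORKER
    return sum(cmd[1].encode()) % NUM_WORKERS


def apply(db, cmd):
    """Apply a single command to an in-memory dict acting as the database."""
    if cmd[0] == "set":
        db[cmd[1]] = cmd[2]
    elif cmd[0] == "flushdb":
        db.clear()


def replay_in_order(binlog):
    """What the master does: apply every command strictly in binlog order."""
    db = {}
    for cmd in binlog:
        apply(db, cmd)
    return db


def run_sharded(db, cmds, worker_order):
    """Split cmds into per-worker queues, then run each queue to completion
    in worker_order. This models one possible thread schedule."""
    queues = {w: [] for w in range(NUM_WORKERS)}
    for cmd in cmds:
        queues[worker_for(cmd)].append(cmd)
    for w in worker_order:
        for cmd in queues[w]:
            apply(db, cmd)


def replay_sharded(binlog, worker_order):
    """The slave behaviour described in this issue."""
    db = {}
    run_sharded(db, binlog, worker_order)
    return db


def replay_with_barrier(binlog, worker_order):
    """Hypothetical mitigation: drain all earlier tasks, run flushdb alone,
    then continue dispatching the commands that follow it."""
    db = {}
    segment = []
    for cmd in binlog:
        if cmd[0] == "flushdb":
            run_sharded(db, segment, worker_order)  # finish everything before flushdb
            segment = []
            apply(db, cmd)
        else:
            segment.append(cmd)
    run_sharded(db, segment, worker_order)
    return db


if __name__ == "__main__":
    print("master:            ", replay_in_order(BINLOG))
    print("slave (sets first):", replay_sharded(BINLOG, [0, 1]))
    print("slave with barrier:", replay_with_barrier(BINLOG, [0, 1]))
```

With worker order `[0, 1]` (all sets finish before flushdb runs) the sharded replay ends with an empty database, while the in-order replay and the barrier variant both end with key1 = a3. With worker order `[1, 0]` the sharded replay happens to produce the right result, which is why the bug only shows up under certain schedules.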

Please provide a link to a minimal reproduction of the bug

No response

Screenshots or videos

No response

Please provide the version you discovered this bug in (check about page for version information)

No response

Anything else?

No response

cheniujh added the ☢️ Bug Something isn't working label Jun 7, 2024

cheniujh self-assigned this Jun 7, 2024

cheniujh mentioned this issue Jul 17, 2024

fix: flushdb may cause master-slave inconsistency #2808

Merged

AlexStocks closed this as completed Jul 26, 2024
