This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(dup): preserve data consistency during replica learn #355

Merged · 24 commits · Dec 26, 2019

Conversation

neverchanje (Contributor) commented Dec 10, 2019

We need to ensure the unduplicated mutation logs are included during replica learn ("unduplicated" meaning mutations that haven't been confirmed by the meta-server as duplicated) in order to prevent data inconsistency between clusters.

The design docs here might be helpful to understand the mechanisms: https://pegasus-kv.github.io/2019/06/09/duplication-design.html#%E6%97%A5%E5%BF%97%E5%AE%8C%E6%95%B4%E6%80%A7

The normal procedure of learning without duplication

First, the learnee calculates learn_start_decree, i.e., where learning should begin, then replicates the data in [learn_start_decree, the latest committed decree] to the learner.

100                   900
|       | | | | | | | |
        |
        flushed=500

As logs in [100, 500] have been flushed to rocksdb's SSTable, the learnee (primary) can skip copying those logs. Assume learn_start_decree=200: the learnee copies the rocksdb checkpoint first, then copies logs in (500, 900]. Assume learn_start_decree=600: the learnee only copies logs in [600, 900].

How does the learnee get learn_start_decree?
For now, learn_start_decree = learner's committed decree + 1. For example, if the learner bootstraps from scratch, learn_start_decree is 1, which means replicating all data.
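The two rules above can be sketched as follows. This is a minimal, illustrative sketch with hypothetical names, not the actual rDSN API: `learn_start_decree_no_dup` and `first_log_decree_to_copy` are made up here to show the arithmetic.

```cpp
#include <cstdint>

using decree = int64_t;

// Without duplication, learning starts right after the learner's
// committed state (bootstrapping from scratch means committed = 0,
// so learning starts at decree 1).
decree learn_start_decree_no_dup(decree learner_committed_decree) {
    return learner_committed_decree + 1;
}

// The learnee skips logs already flushed into rocksdb's SSTable:
// if the requested start falls at or below the flushed decree, the
// checkpoint is shipped first and only logs after `flushed` are copied.
decree first_log_decree_to_copy(decree learn_start_decree, decree flushed) {
    return learn_start_decree <= flushed ? flushed + 1 : learn_start_decree;
}
```

With the diagram's numbers (flushed=500), learn_start_decree=200 yields log copying from 501 onward (after the checkpoint), while learn_start_decree=600 copies logs from 600.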

Learning procedure with duplication

When duplication is enabled, the procedure needs changes, because the original learn_start_decree = learner's committed decree + 1 may skip the unduplicated logs: duplication may lag far behind 2PC. More specifically, the confirmed_decree may be much smaller than the learner's committed_decree.

To fix this problem, learn_start_decree should cover not only "the data (logs/rdb) which the learner doesn't have" but also "the unduplicated logs that the learner doesn't have". See replica::get_learn_start_decree.

learnee:
100                   900
| | | | | | | | | | | |
    |
    confirmed_decree=300

learner:
|   rdb   |
          |
          committed_decree=500, no private log

learn_start_decree_no_dup = 501
learn_start_decree_for_dup = 301

Assume the learner bootstrapped from a pure rdb, with decree=500 and no private log. Originally only (500, 900] would be learned; with duplication enabled, logs (300, 500] must also be included. So finally logs (300, 900] are learned.
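The example above can be sketched as the following decision, a hypothetical simplification of the idea behind replica::get_learn_start_decree (the function name is real; the signature and helper names here are illustrative):

```cpp
#include <algorithm>
#include <cstdint>

using decree = int64_t;

// With duplication enabled, the learn start must also cover logs the
// meta-server has not yet confirmed as duplicated, so we take the
// smaller of the two candidate starting points.
decree get_learn_start_decree(decree learner_committed_decree,
                              decree confirmed_decree,
                              bool duplicating) {
    decree no_dup = learner_committed_decree + 1;   // normal rule
    if (!duplicating) {
        return no_dup;
    }
    decree for_dup = confirmed_decree + 1;          // cover unduplicated logs
    return std::min(no_dup, for_dup);
}
```

With the diagram's numbers (committed_decree=500, confirmed_decree=300), this yields 301 when duplicating and 501 otherwise.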

get_learn_start_decree

To copy the "unduplicated logs that the learner doesn't have", the learnee needs to know which logs the learner already has. We add max_gced_decree to learn_request for this reason.

If the learner's max_gced_decree <= min_confirmed_decree + 1, the learner still retains the unduplicated logs, so the learnee can perform the learn as normal.
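This check can be sketched as below; again a hypothetical helper, not the actual rDSN API, just restating the condition from the paragraph above:

```cpp
#include <cstdint>

using decree = int64_t;

// If the learner's max GC'd decree is still at or below the confirmed
// point (+1), the unduplicated log range has not been garbage-collected
// on the learner, so the learnee need not widen the learn range.
bool learner_retains_undup_logs(decree learner_max_gced_decree,
                                decree min_confirmed_decree) {
    return learner_max_gced_decree <= min_confirmed_decree + 1;
}
```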

get_max_gced_decree_for_learn

On the learner side, there are two log directories during the learning process: 'plog/', the normal private-log path, and 'learn/', which holds the log files learned from the learnee. The max_gced_decree is calculated over the compound of both directories, via get_max_gced_decree_for_learn.

There's a problem: how can we get the max_gced_decree under learn/? To solve this we introduce first_learn_start_decree, which is the learn_start_decree of the first round of learning, stored in potential_secondary_context. It can stand in for "the max_gced_decree under learn/".

We do not use previous_log_max_decrees to determine the max_gced_decree: to ensure data safety we cannot trust the files under learn/, because they may be stale.
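A sketch of how the two directories might be combined, under the assumption that logs in learn/ begin at first_learn_start_decree (so everything before it counts as GC'd from the learner's perspective). The function name echoes get_max_gced_decree_for_learn, but the signature and sentinel value here are illustrative, not the actual rDSN code:

```cpp
#include <algorithm>
#include <cstdint>

using decree = int64_t;

constexpr decree invalid_decree = -1;  // no learn round has started yet

// The learner's effective max_gced_decree spans both 'plog/' and 'learn/'.
// For 'learn/' we cannot trust the files themselves (they may be stale),
// so first_learn_start_decree - 1 stands in for its max GC'd decree.
decree max_gced_decree_for_learn(decree plog_max_gced_decree,
                                 decree first_learn_start_decree) {
    if (first_learn_start_decree == invalid_decree) {
        return plog_max_gced_decree;
    }
    return std::min(plog_max_gced_decree, first_learn_start_decree - 1);
}
```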

Downside

The downside of this change is the increased number of logs copied during learning with duplication enabled, which may lead to slower rebalance.
TODO: we can add an option for duplication to trade off data consistency against performance.

hycdong (Contributor) commented Dec 17, 2019

The downside of this change is the increased logs to be copied during learning, which may lead to
slower rebalance.

The increased number of learned logs described above applies to every learn process, right? Could we make only the replicas that are actively doing hot backup learn the extra files, e.g. by letting confirmed_decree be a special decree when hot backup is off?

neverchanje (Contributor, Author) replied

The downside of this change is the increased logs to be copied during learning, which may lead to
slower rebalance.

The increased number of learned logs described above applies to every learn process, right? Could we make only the replicas that are actively doing hot backup learn the extra files, e.g. by letting confirmed_decree be a special decree when hot backup is off?

The doc was wrong; it should say "learning with duplication". When hot backup (duplication) is not enabled, the learn procedure is unchanged from before.
