-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: add learner #8605
raft: add learner #8605
Conversation
@xiang90 @siddontang PTAL |
sorry for the delay. i will try to give this a review this week. |
raft/raftpb/raft.proto
Outdated
message ConfState { | ||
repeated uint64 nodes = 1; | ||
repeated uint64 nodes = 1; | ||
repeated Server servers = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this simpler by just having
repeated uint64 nodes = 1;
repeated uint64 learners= 2;
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may also have stagings after addVoter feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do not like the idea of staging server. i would rather let the application itself promote the leader to a node explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically, the less raft layer needs to care about the better.
can we rename Nonvoter to Learners? also i would like to simplify the code path and minimize code change by having an explicit list of learners instead of a mixed list of Servers (which can be peer or non-voter). |
498d3fe
to
b159733
Compare
the learner should always stay as follower state, and never try to promote or vote, right? where do we prevent that from happening? |
the promotable will always false for learner:
learner should be able to vote. because if we addNode a learner, the voters may think that node is voter, but that node may think its still learner.
|
The same thing happens with normal add node case. There are two motivations for adding learner:
What you are trying to do is to enhance 2. I would like to know why 2 itself is not strong enough. Do you have a specific use case or failure mode in mind? |
In our case, we have 3 nodes in one raft group. If we want replace one node, we will add a new node first, then remove the old one. |
Assume we have 3 nodes, A, B and C, and want to replace A with D, so now we can:
The corner case here is after 2, we have only B, C and if any one of them fails, the cluster can't work. If the learner D can vote, this is not a problem, but here we have another corner case, after 1, we have A, B, C, D, A and D are on the same node, so if the node crashes, the cluster can't work too. We have the same problem when we do:
I don't think the learner mechanism can fix this corner case, maybe only Raft Membership Change can do. So I suggest keeping things simple and letting learner don't be able to vote. |
I would say keep it as simple as possible for now. if the corner case does happen in reality or if we have a more concrete real world use case and failure case, we can improve it. |
Just to be more clear, after adding the learner concept, we already largely improve the reliability of membership change process. We will only promote a learner to a member if it can catch up. |
mis := make(uint64Slice, 0, len(r.prs)) | ||
for id := range r.prs { | ||
mis = append(mis, r.prs[id].Match) | ||
mis := make(uint64Slice, 0, r.voterCount()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seem we can still use len(r.prs)
to reduce the extra learner check in voterCount
.
PTAL @xiang90 |
@siddontang i would like to take another look after @javaforfun disables the vote of the learner. |
Hi @xiang90 Seem the learner can't vote now, see https://github.com/coreos/etcd/pull/8605/files#diff-a8629a0eb6da1c6c3805111f1f1af85eR376 |
@xiang90 If learner can't vote, how to prevent this problem? #8605 (comment) |
@javaforfun we are not going to solve the problem you brought up for now. we should focus on solving:
|
I agree with @xiang90, here we don't need a 100% solution for ConfChange. |
PTAL @xiang90 |
raft/raft.go
Outdated
@@ -116,6 +116,8 @@ type Config struct { | |||
// used for testing right now. | |||
peers []uint64 | |||
|
|||
learners []uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc string?
raft/raft.go
Outdated
peers := c.peers | ||
if len(cs.Nodes) > 0 { | ||
if len(peers) > 0 { | ||
voters := c.peers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use peers instead of voters. i do not want to create a new term for now.
raft/raft.go
Outdated
for _, p := range peers { | ||
r.prs[p] = &Progress{Next: 1, ins: newInflights(r.maxInflight)} | ||
for _, n := range voters { | ||
r.prs[n] = &Progress{Next: 1, ins: newInflights(r.maxInflight), isLearner: false} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this change. bool is false by default.
raft/raftpb/raft.proto
Outdated
} | ||
|
||
enum ConfChangeType { | ||
ConfChangeAddNode = 0; | ||
ConfChangeRemoveNode = 1; | ||
ConfChangeUpdateNode = 2; | ||
ConfChangeAddLearner = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConfChangeAddLearnerNode
raft/raft.go
Outdated
if _, has := r.prs[n]; has { | ||
panic(fmt.Sprintf("cannot specify both Voter and Learner for node: %x", n)) | ||
} | ||
r.prs[n] = &Progress{Next: 1, ins: newInflights(r.maxInflight), isLearner: true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it is easier to have a separate prs map for learners.
return true | ||
} | ||
|
||
func (r *raft) restoreNode(nodes []uint64, isLearner bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably restoreNode can take both nodes and learnerNodes as args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
raft/raft.go
Outdated
} | ||
|
||
// promotable indicates whether state machine can be promoted to leader, | ||
// which is true when its own id is in progress list. | ||
// which is true when its own id is in progress list and its not learner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not a learner
raft/raft.go
Outdated
} | ||
|
||
// promotable indicates whether state machine can be promoted to leader, | ||
// which is true when its own id is in progress list. | ||
// which is true when its own id is in progress list and its not learner. | ||
func (r *raft) promotable() bool { | ||
_, ok := r.prs[r.id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if r.isLearer {
return false
}
...
is cleaner
raft/raft.go
Outdated
r.addLearnerNode(id, true) | ||
} | ||
|
||
func (r *raft) addLearnerNode(id uint64, isLearner bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addNodeOrLearnerNode.
or probably just duplicate code a little bit, and remove this func.
return | ||
} | ||
if isLearner { | ||
// can only change Learner to Voter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log here.
raft/raft.go
Outdated
} | ||
if isLearner { | ||
// can only change Learner to Voter | ||
r.logger.Infof("%x ignore addLearner for %x [%s]", r.id, id, r.prs[id]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to log the reason why addLearner is ignored.
return count | ||
} | ||
|
||
func (r *raft) quorum() int { return r.voterCount()/2 + 1 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called frequently, maybe now we can use a variable to cache the quorum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
raft/raft_test.go
Outdated
@@ -3278,7 +3388,7 @@ func entsWithConfig(configFunc func(*Config), terms ...uint64) *raft { | |||
for i, term := range terms { | |||
storage.Append([]pb.Entry{{Index: uint64(i + 1), Term: term}}) | |||
} | |||
cfg := newTestConfig(1, []uint64{}, 5, 1, storage) | |||
cfg := newTestConfig(1, []uint64{1, 2, 3, 4, 5}, 5, 1, storage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is changed?
raft/raft_test.go
Outdated
@@ -3293,7 +3403,7 @@ func entsWithConfig(configFunc func(*Config), terms ...uint64) *raft { | |||
func votedWithConfig(configFunc func(*Config), vote, term uint64) *raft { | |||
storage := NewMemoryStorage() | |||
storage.SetHardState(pb.HardState{Vote: vote, Term: term}) | |||
cfg := newTestConfig(1, []uint64{}, 5, 1, storage) | |||
cfg := newTestConfig(1, []uint64{1, 2, 3, 4, 5}, 5, 1, storage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is changed?
raft/raftpb/raft.proto
Outdated
@@ -76,13 +76,15 @@ message HardState { | |||
} | |||
|
|||
message ConfState { | |||
repeated uint64 nodes = 1; | |||
repeated uint64 nodes = 1; // Voters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this comment. voter
is meaningless without any context. or we need to explain exactly what is nodes.
raft/raftpb/raft.proto
Outdated
@@ -76,13 +76,15 @@ message HardState { | |||
} | |||
|
|||
message ConfState { | |||
repeated uint64 nodes = 1; | |||
repeated uint64 nodes = 1; // Voters | |||
repeated uint64 learners = 2; // Nonvoters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here. we need to explain what it nonvoters.
we may need to handle the prs before merging this PR. also the prs in status struct, https://github.com/coreos/etcd/blob/master/raft/status.go#L30 type Status struct {
ID uint64
pb.HardState
SoftState
Applied uint64
Progress map[uint64]Progress
LeadTransferee uint64
} |
Seem this can handle leaner more easily, but may introduce complexity in other places like in /cc @xiang90 |
right. someone should give it a try at least to see if it is simpler. |
closed in favor of #8751 |
Partly implement #8568, only add Nonvoter support.
for protobuf, maybe we can add a version for pb.ConfState.