Skip to content

Commit

Permalink
kvserver: use background context when applying snapshots
Browse files Browse the repository at this point in the history
Following a3fd4fb, context errors are now propagated back up the
stack when receiving snapshots. However, this triggered a
`maybeFatalOnRaftReadyErr` assertion which crashed the node.

`handleRaftReadyRaftMuLocked` (which is called directly when applying
snapshots) does not appear prepared to safely handle arbitrary context
cancellation, as it is typically called with the Raft scheduler's
background context. Furthermore, by the time we call it we have already
received the entire snapshot, so it does not seem useful to abort
snapshot application just because the caller goes away.

This patch therefore uses a new background context for applying
snapshots, disconnected from the client's context, once the entire
snapshot has been received.

Release note: None
  • Loading branch information
erikgrinaker committed Dec 7, 2021
1 parent 39923c0 commit 541adf9
Show file tree
Hide file tree
Showing 2 changed files with 8 additions and 2 deletions.
2 changes: 1 addition & 1 deletion pkg/kv/kvserver/store_raft.go
Original file line number Diff line number Diff line change
Expand Up @@ -336,14 +336,14 @@ func (s *Store) processRaftSnapshotRequest(
var expl string
var err error
stats, expl, err = r.handleRaftReadyRaftMuLocked(ctx, inSnap)
maybeFatalOnRaftReadyErr(ctx, expl, err)
if !stats.snap.applied {
// This line would be hit if a snapshot was sent when it isn't necessary
// (i.e. follower was able to catch up via the log in the interim) or when
// multiple snapshots raced (as is possible when raft leadership changes
// and both the old and new leaders send snapshots).
log.Infof(ctx, "ignored stale snapshot at index %d", snapHeader.RaftMessageRequest.Message.Snapshot.Metadata.Index)
}
maybeFatalOnRaftReadyErr(ctx, expl, err)
return nil
})
}
Expand Down
8 changes: 7 additions & 1 deletion pkg/kv/kvserver/store_snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -660,7 +660,13 @@ func (s *Store) receiveSnapshot(
return err
}
inSnap.placeholder = placeholder
if err := s.processRaftSnapshotRequest(ctx, header, inSnap); err != nil {

// Use a background context for applying the snapshot, as handleRaftReady is
// not prepared to deal with arbitrary context cancellation. Also, we've
// already received the entire snapshot here, so there's no point in
// abandoning application half-way through if the caller goes away.
applyCtx := s.AnnotateCtx(context.Background())
if err := s.processRaftSnapshotRequest(applyCtx, header, inSnap); err != nil {
return sendSnapshotError(stream, errors.Wrap(err.GoError(), "failed to apply snapshot"))
}
return stream.Send(&SnapshotResponse{Status: SnapshotResponse_APPLIED})
Expand Down

0 comments on commit 541adf9

Please sign in to comment.