etcdserver: Use panic instead of fatal on no space left error
When etcd is embedded via the embed package, the storage prefix in use
can fill up. When that happens, this code path fails with
`etcdserver: create wal error: no space left on device` and logs it at
fatal level. Unlike a panic, a fatal log call terminates the process with
os.Exit(1), so deferred functions never run and recover() cannot
intercept it. The program embedding the etcd server is therefore killed
abruptly, which prevents it from cleaning up safely and from reporting a
proper error message. Depending on what the calling program is, this can
cause corruption and data loss.

This patch switches the fatal to a panic. Ideally this would be a
regular error propagated up to the caller of StartEtcd, but in the
meantime the panic can at least be caught with recover().
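
As a rough illustration of what catching the panic could look like in an
embedding program, here is a minimal sketch. The names startServer and
safeStart are hypothetical and not part of etcd. The sketch also assumes
the panic is raised on the same goroutine that calls the start function,
since recover() cannot catch panics from other goroutines.

```go
package main

import (
	"fmt"
)

// startServer is a hypothetical stand-in for a call that may panic,
// for example starting an embedded server whose WAL directory is full.
func startServer() {
	panic("create wal error: no space left on device")
}

// safeStart runs start and converts any panic raised on this goroutine
// into an ordinary error that the caller can handle.
func safeStart(start func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("server start panicked: %v", r)
		}
	}()
	start()
	return nil
}

func main() {
	if err := safeStart(startServer); err != nil {
		// The embedding program still gets to run its cleanup here.
		fmt.Println("cleaning up after:", err)
		return
	}
	fmt.Println("server started")
}
```

With a fatal log call in place of the panic, the deferred function in
safeStart would never run, because os.Exit(1) ends the process at once.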

This fixes the most common fatal that I've experienced, but there are
surely more that need looking into. If possible, the errors should be
threaded down into the code path so that embedding etcd can be more
robust.

Fixes: etcd-io#10588
purpleidea committed Mar 27, 2019
1 parent 9c2b88d commit 368f70a
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions etcdserver/raft.go
@@ -427,9 +427,9 @@ func startNode(cfg ServerConfig, cl *membership.RaftCluster, ids []types.ID) (id
 	)
 	if w, err = wal.Create(cfg.Logger, cfg.WALDir(), metadata); err != nil {
 		if cfg.Logger != nil {
-			cfg.Logger.Fatal("failed to create WAL", zap.Error(err))
+			cfg.Logger.Panic("failed to create WAL", zap.Error(err))
 		} else {
-			plog.Fatalf("create wal error: %v", err)
+			plog.Panicf("create wal error: %v", err)
 		}
 	}
 	peers := make([]raft.Peer, len(ids))
