- There is a problem in the TLA+ specification and the Golang implementation of Egalitarian Paxos.
- A counter-example in TLA+ is provided (under
tla+/CounterExample.cfg
). - The counter-example can be executed with the following command:
docker run --rm -ti 0track/epaxos-counter-example:latest
. - The algorithm reaches a state in which processes disagree on the dependencies of a command; this breaks safety.
- A detailed explanation of the problem is available here.
-
How was the bug discovered?
When auditing the recovery mechanism of EPaxos. -
Is there a fix?
Each process needs to maintain the last ballot at which it voted. This requires an additional ballot variable in the algorithm. Such an approach is implemented in the following repository. The correctness of the resulting algorithm has not been established yet.