Modeling the Raft Distributed Consensus Protocol in LNT

Hugues Evrard
(Google)

Consensus protocols are crucial for reliable distributed systems because they let these systems cope with network and server failures. For decades, most consensus protocols have been designed as variations of the seminal Paxos, yet in 2014 Raft was presented as a new, "understandable" protocol, meant to be easier to implement than the notoriously subtle Paxos family. Raft has since been used in various industrial projects, e.g., HashiCorp's Consul or etcd (used by Google's Kubernetes). The correctness of Raft is established via a manual proof, based on a TLA+ specification of the protocol. This paper reports our experience in modeling Raft in the LNT process algebra. We found a couple of issues with the original TLA+ specification of Raft, which has been corrected since. More generally, this exercise offers a great opportunity to discuss how to best use the features of the LNT formal language and the associated CADP verification toolbox to model distributed protocols, including network and server failures.

Invited Paper in Ansgar Fehnker and Hubert Garavel: Proceedings of the 4th Workshop on Models for Formal Analysis of Real Systems (MARS 2020), Dublin, Ireland, April 26, 2020, Electronic Proceedings in Theoretical Computer Science 316, pp. 15–39.
Published: 26th April 2020.

ArXived at: https://dx.doi.org/10.4204/EPTCS.316.2