Marc's Blog

About Me

My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in brewing, cooking and skiing.

I'm currently an engineer at Amazon Web Services (AWS) in Seattle, working on computing without computers and block storage without disks.
All opinions are my own.

Related Links

Amazon Elastic Block Store is Hiring! @MarcJBrooker on Twitter

Viewstamped Replication: The Less-Famous Consensus Protocol

The first practical consensus protocol may be the least famous.

There's no doubt that Paxos is the most famous distributed consensus protocol. Distributed systems reading lists (such as this one by Dan Creswell and this one by Christopher Meiklejohn) nearly all include at least one paper describing it. That's as it should be. There's no doubt Paxos has been extremely influential in industry, and has formed the basis of many extremely successful systems. Lamport's Paxos Made Simple is very readable, and papers like Google's Paxos Made Live have helped raise the visibility of good Paxos implementation techniques.

Raft, on the other hand, is the most fashionable distributed consensus protocol. It seems like everybody's implementing it right now. I don't know if it's as understandable as it claims to be, but it's definitely in vogue. While I'm not a big fan of technology fads, I find it difficult to be upset about this one. Anything that encourages people to use well-proven distributed algorithms instead of crafting their own is good in my book.

By comparison to Paxos and Raft, one distributed consensus protocol seems frequently overlooked: Oki and Liskov's Viewstamped Replication. Introduced in May 1988 in Brian Oki's PhD thesis, Viewstamped Replication predates the first publication of Paxos by about a year. If you're looking for intrigue you may be disappointed: both Lamport and Liskov claim the inventions were independent. First, from Viewstamped Replication Revisited:

VR was originally developed in the 1980s, at about the same time as Paxos, but without knowledge of that work.

and from Keith Marzullo's comments in the 1998 reprinting of The Part-Time Parliament:

The author was also apparently unaware that the view management protocol by Oki and Liskov seems to be equivalent to the Paxon protocol.

In many ways, the Paxos protocol as described in The Part-Time Parliament and the Viewstamped Replication protocol are surprisingly different. Paxos' Synod protocol is the basic building block of consensus, and is used more-or-less directly for data replication. On the other hand, in VR, normal requests are merely stamped with a view number on their way through the primary, and are sent to all the replicas in parallel. The similarities start to become apparent when looking at how that view number is chosen: VR's view change protocol. In fact, the view change protocol describe in Section 4.2 of Viewstamped Replication Revisited bears a striking resemblance to the Paxos Synod protocol, especially when compared to the description in Section 2.2 of Paxos Made Simple.

It's easy to believe that these two protocols are, in fact, the same. That doesn't appear to be the case. A new paper by van Renesse et. al., titled Vive La Difference: Paxos vs. Viewstamped Replication vs. Zab, looks at Paxos and VR through the lenses of refinement and abstraction, and finds they are not exactly equivalent due to design decisions in the way they refine a model the paper calls Multi-Consensus. One of the key differences is active (Paxos) vs. passive (VR) replication:

Passive vs. Active Replication: In active replication, at least f + 1 replicas each must execute operations. In passive replication, only the sequencer executes operations, but it has to propagate state updates to the backups.

I don't have an answer for why Paxos is so much more famous than Viewstamped Replication. The first publication of viewstamped replication was more readable, though less entertaining, than the first publication of Paxos. Implemented out of the paper, VR likely has better performance properties than Paxos, for similar implementation effort and complexity. Barbara Liskov is more widely known among programmers and computer scientists than Leslie Lamport, thanks to the Liskov substitution principle. I can't think of a good explanation at all.

Whatever the cause, both Viewstamped Replication and Viewstamped Replication Revisited are well worth including on your distributed systems reading list.