Linearizable Replicated State Machines with Lattice Agreement
2018, arXiv (Cornell University)
Abstract
This paper studies the lattice agreement problem in asynchronous systems and explores its application to building linearizable replicated state machines (RSM). First, we propose an algorithm that solves the lattice agreement problem in $O(\log f)$ asynchronous rounds, where $f$ is the number of crash failures that the system can tolerate. This is an exponential improvement over the previous best upper bound. Second, Faleiro et al. [Faleiro et al. PODC, 2012] have shown that a combination of conflict-free data types and lattice agreement protocols can be used to implement a linearizable RSM. They give a Paxos-style lattice agreement protocol, which can be adapted to implement a linearizable RSM and guarantees that a command can be learned in at most $O(n)$ message delays, where $n$ is the number of proposers. Later, Xiong et al. [Xiong et al. DISC, 2018] give a lattice agreement protocol that improves the $O(n)$ guarantee to $O(f)$. However, neither protocol is practical for building a linearizable RSM. Thus, in the second part of the paper, we first give an improved protocol based on the one proposed by Xiong et al. Then, we implement a simple linearizable RSM using our improved protocol and compare our implementation with an open-source Java implementation of Paxos. Results show that better performance can be obtained by using lattice agreement based protocols to implement a linearizable RSM than by using traditional consensus-based protocols.
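The abstract does not restate the problem, so the following is only a minimal sketch of the standard lattice agreement specification from the literature, assuming the lattice is a powerset ordered by inclusion with set union as join. Each process proposes a value and decides a value such that its decision is at least its own proposal (downward validity), at most the join of all proposals (upward validity), and comparable with every other decision (comparability). The function name `check_lattice_agreement` and the sample data are hypothetical and are not taken from the paper; the code checks an outcome against the specification and does not implement any of the protocols discussed.

```python
from functools import reduce
from itertools import combinations


def join(a: frozenset, b: frozenset) -> frozenset:
    """Join in the powerset lattice (assumed here) is set union."""
    return a | b


def check_lattice_agreement(proposals: dict, decisions: dict) -> bool:
    """Check the three standard lattice agreement properties.

    proposals: process id -> proposed frozenset
    decisions: process id -> decided frozenset (processes that decided)
    """
    top = reduce(join, proposals.values(), frozenset())
    for pid, decided in decisions.items():
        # Downward validity: the decision includes the process's own proposal.
        if not proposals[pid] <= decided:
            return False
        # Upward validity: the decision contains only proposed elements.
        if not decided <= top:
            return False
    # Comparability: every pair of decisions is ordered by inclusion.
    return all(a <= b or b <= a for a, b in combinations(decisions.values(), 2))


# Hypothetical run with three processes, each proposing one command.
proposals = {1: frozenset({"a"}), 2: frozenset({"b"}), 3: frozenset({"c"})}

# Decisions that form a chain: {a} <= {a, b} <= {a, b, c}.
chain = {
    1: frozenset({"a"}),
    2: frozenset({"a", "b"}),
    3: frozenset({"a", "b", "c"}),
}

# Decisions {a} and {b} are incomparable, so this outcome is not allowed.
incomparable = {1: frozenset({"a"}), 2: frozenset({"b"})}

chain_ok = check_lattice_agreement(proposals, chain)
incomparable_ok = check_lattice_agreement(proposals, incomparable)
```

Unlike consensus, the processes need not decide the same value; they only need to decide values that form a chain, which is why the problem is solvable in asynchronous systems with crash failures.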
References (26)
- Leslie Lamport, The part-time parliament, ACM Transactions on Computer Systems (TOCS), v.16 n.2, p.133-169, May 1998
- Lamport, L. Paxos made simple. ACM SIGACT News 32, 4 (Dec. 2001), 18-25.
- M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. Convergent and commutative replicated data types. Bulletin of the European Association for Theoretical Computer Science (EATCS), (104):67-88, 2011.
- M. Shapiro, "Conflict-Free Replicated Data Types", Proc. 13th Int'l Conf. Stabilization, Safety, and Security of Distributed Systems (SSS '11), ACM, pp. 386-400, 2011.
- L. Lamport, "Fast Paxos", Technical Report MSR-TR-2005-112, 2005.
- Jose M. Faleiro, Sriram Rajamani, Kaushik Rajan, G. Ramalingam, Kapil Vaswani, Generalized lattice agreement, Proceedings of the 2012 ACM symposium on Principles of distributed computing, July 16-18, 2012, Madeira, Portugal.
- Hagit Attiya, Maurice Herlihy, Ophir Rachman, Efficient Atomic Snapshots Using Lattice Agreement (Extended Abstract), Proceedings of the 6th International Workshop on Distributed Algorithms, p.35-53, November 02-04, 1992.
- Fischer, M. J.; Lynch, N. A.; Paterson, M. S. (1985). "Impossibility of distributed consensus with one faulty process". Journal of the ACM. 32 (2): 374-382.
- Seth Gilbert and Nancy Lynch, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services", ACM SIGACT News, Volume 33 Issue 2 (2002), pg. 51-59.
- Xiong Zheng, Changyong Hu and Vijay K. Garg. Lattice Agreement in Message Passing Systems. http://arxiv.org/abs/1807.11557.
- L. Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM, vol. 21, no. 7, pp. 558-565, 1978.
- M. P. Herlihy and J. M. Wing, Linearizability: a correctness condition for concurrent objects, ACM Trans. Program. Lang. Syst., vol. 12, pp. 463-492, July 1990.
- H. Attiya and O. Rachman. Atomic snapshots in O(n log n) operations. SICOMP, 31(2):642-664, Oct. 2001.
- C. E. Bezerra, F. Pedone, and R. Van Renesse. Scalable state-machine replication. In Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International Conference on, pages 331-342. IEEE, 2014.
- Yanhua Mao, Flavio P. Junqueira, Keith Marzullo, Mencius: building efficient replicated state machines for WANs, Proceedings of the 8th USENIX conference on Operating systems design and implementation, p.369-384, December 08-10, 2008, San Diego, California.
- J. Kończak, N. Santos, T. Żurkowski, P. T. Wojciechowski, and A. Schiper. JPaxos: state machine replication based on the Paxos protocol. Technical report, EPFL, 2011.
- M. Biely, Z. Milosevic, N. Santos, and A. Schiper, S-Paxos: Offloading the leader for high throughput state machine replication, in SRDS, 2012.
- Shapiro, Marc and Preguiça, Nuno and Baquero, Carlos and Zawirski, Marek. Convergent and commutative replicated data types. Bulletin of the European Association for Theoretical Computer Science, 104, 67-88, 2011.
- Schneider, Fred B. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys (CSUR), 22, 4, 299-319, 1990.
- Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt, Nir Shavit, Atomic snapshots of shared memory, Journal of the ACM (JACM), v.40 n.4, p.873-890, Sept. 1993.
- Dolev, Danny and Strong, H Raymond. "Authenticated algorithms for Byzantine agreement", SIAM Journal on Computing, v.12 n.4, p.656-666, 1983.
- H. Attiya, M. Herlihy and O. Rachman, Atomic Snapshots Using Lattice Agreement, Distributed Computing, V. 8, n.3, p.121-132, November 1992.
- Maurice P. Herlihy and Jeannette M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12:463-492, July 1990.
- Chandra, Tushar; Griesemer, Robert; Redstone, Joshua (2007). "Paxos Made Live: An Engineering Perspective". PODC '07: 26th ACM Symposium on Principles of Distributed Computing.
- Lamport, Leslie, Massa, Mike. "Cheap Paxos". Proceedings of the International Conference on Dependable Systems and Networks, 2004.
- Martin Biely, Zarko Milosevic, Nuno Santos, Andre Schiper, S-Paxos: Offloading the Leader for High Throughput State Machine Replication, Proceedings of the 2012 IEEE 31st Symposium on Reliable Distributed Systems, p.111-120, October 08-11, 2012.