Cassandra -A Decentralized Structured Storage System
https://doi.org/10.1145/1773912.1773922Abstract
Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full rela-tional data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write through-put while not sacrificing read efficiency.
References (21)
- REFERENCES
- MySQL AB. Mysql.
- Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, and Roger P. Wattenhofer. Farsite: Federated, available, and reliable storage for an incompletely trusted environment. In In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI, pages 1-14, 2002.
- Mike Burrows. The chubby lock service for loosely-coupled distributed systems. In OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation, pages 335-350, Berkeley, CA, USA, 2006. USENIX Association.
- Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. In In Proceedings of the 7th Conference on USENIX Symposium on Operating Systems Design and Implementation -Volume 7, pages 205-218, 2006.
- Abhinandan Das, Indranil Gupta, and Ashish Motivala. Swim: Scalable weakly-consistent infection-style process group membership protocol. In DSN '02: Proceedings of the 2002 International Conference on Dependable Systems and Networks, pages 303-312, Washington, DC, USA, 2002. IEEE Computer Society.
- Giuseppe de Candia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: amazon Õs highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, pages 205-220. ACM, 2007.
- Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Commun. ACM, 51(1):107-113, 2008.
- Xavier Défago, Péter Urbán, Naohiro Hayashibara, and Takuya Katayama. The φ accrual failure detector. In RR IS-RR-2004-010, Japan Advanced Institute of Science and Technology, pages 66-78, 2004.
- Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The google file system. In SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles, pages 29-43, New York, NY, USA, 2003. ACM.
- Jim Gray and Pat Helland. The dangers of replication and a solution. In In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 173-182, 1996.
- David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin, and Rina Panigrahy. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web. In In ACM Symposium on Theory of Computing, pages 654-663, 1997.
- Matthew L. Massie, Brent N. Chun, and David E. Culler. The ganglia distributed monitoring system: Design, implementation, and experience. Parallel Computing, 30:2004, 2004.
- Benjamin Reed and Flavio Junquieira. Zookeeper.
- Peter Reiher, John Heidemann, David Ratner, Greg Skinner, and Gerald Popek. Resolving file conflicts in the ficus file system. In USTC'94: Proceedings of the USENIX Summer 1994 Technical Conference on USENIX Summer 1994 Technical Conference, pages 12-12, Berkeley, CA, USA, 1994. USENIX Association.
- Robbert Van Renesse, Yaron Minsky, and Mark Hayden. A gossip-style failure detection service. In Service, Ť Proc. Conf. Middleware, pages 55-70, 1996.
- Mahadev Satyanarayanan, James J. Kistler, Puneet Kumar, Maria E. Okasaki, Ellen H. Siegel, and David C. Steere. Coda: A highly available file system for a distributed workstation environment. IEEE Trans. Comput., 39(4):447-459, 1990.
- Ion Stoica, Robert Morris, David Liben-nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, and Hari Balakrishnan. Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Transactions on Networking, 11:17-32, 2003.
- D. B. Terry, M. M. Theimer, Karin Petersen, A. J. Demers, M. J. Spreitzer, and C. H. Hauser. Managing update conflicts in bayou, a weakly connected replicated storage system. In SOSP '95: Proceedings of the fifteenth ACM symposium on Operating systems principles, pages 172-182, New York, NY, USA, 1995. ACM.
- Robbert van Renesse, Dan Mihai Dumitriu, Valient Gough, and Chris Thomas. Efficient reconciliation and flow control for anti-entropy protocols. In Proceedings of the 2nd Large Scale Distributed Systems and Middleware Workshop (LADIS '08), New York, NY, USA, 2008. ACM.
- Matt Welsh, David Culler, and Eric Brewer. Seda: an architecture for well-conditioned, scalable internet services. In SOSP '01: Proceedings of the eighteenth ACM symposium on Operating systems principles, pages 230-243, New York, NY, USA, 2001. ACM.