A Fast, General System for Buffered Persistent Data Structures
2021, 50th International Conference on Parallel Processing
https://doi.org/10.1145/3472456.3472458
Abstract
The emergence of fast, dense, nonvolatile main memory suggests that certain long-lived data might remain in their natural pointer-rich format across program runs and hardware reboots. Operations on such data must currently be instrumented with explicit writeback and fence instructions to ensure consistency in the wake of a crash. Techniques to minimize the cost of this instrumentation are an active topic of research. We present what we believe to be the first general-purpose approach to building buffered persistent data structures, and a system, Montage, to support that approach. Montage is built on top of the Ralloc nonblocking persistent allocator. It employs a millisecond-granularity epoch clock, and ensures that no operation appears to span an epoch boundary. It also arranges to persist only the data minimally required to reconstruct the structure after a crash. If a crash occurs in epoch e, all work performed in epochs e and e − 1 is lost, but work from prior epochs is preserved, consistently. As in traditional file and database systems, a sync operation can be used to flush buffers on demand; the Montage sync is extremely fast. We describe the implementation of Montage, argue its correctness, and report unprecedented throughput for persistent queues, sets/mappings, and general graphs.
CCS CONCEPTS
• Theory of computation → Parallel computing models; • Computing methodologies → Concurrent algorithms; • Computer systems organization → Reliability.
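The recovery guarantee stated in the abstract (a crash in epoch e loses the work of epochs e and e − 1, while earlier epochs survive) can be illustrated with a toy model. The sketch below is not Montage's API or implementation; the names `Payload` and `recover` are hypothetical, and it assumes, purely for illustration, that each update is tagged with the epoch in which it was performed and that later epochs overwrite earlier ones for the same key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    """Hypothetical record of one update, tagged with its epoch."""
    key: str
    value: int
    epoch: int


def recover(payloads, crash_epoch):
    """Rebuild a key/value state after a crash in `crash_epoch`.

    Updates from epochs crash_epoch and crash_epoch - 1 are discarded;
    updates from earlier epochs are replayed in epoch order.
    """
    cutoff = crash_epoch - 2  # newest epoch whose work is preserved
    state = {}
    for p in sorted(payloads, key=lambda p: p.epoch):
        if p.epoch <= cutoff:
            state[p.key] = p.value
    return state


log = [
    Payload("a", 1, epoch=3),
    Payload("b", 2, epoch=4),
    Payload("a", 9, epoch=5),  # lost if the crash occurs in epoch 5 or 6
    Payload("c", 7, epoch=6),  # lost if the crash occurs in epoch 6 or 7
]

recovered = recover(log, crash_epoch=6)
print(recovered)  # {'a': 1, 'b': 2}
```

In this model a crash in epoch 6 rolls the structure back to its state at the end of epoch 4, which is the consistent prefix the abstract describes.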