Valmar: High-bandwidth real-time streaming data management

2012, 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)

Abstract

In applications ranging from radio telescopes to Internet traffic monitoring, our ability to generate data has outpaced our ability to effectively capture, mine, and manage it. These ultra-high-bandwidth data streams typically contain little useful information, and most of the data can be safely discarded. Periodically, however, an event of interest is observed and a large segment of the data must be preserved, including data preceding detection of the event. Doing so requires guaranteed data capture at source rates, line-speed filtering to detect events and data points of interest, and a TiVo-like ability to save past data once an event has been detected. We present Valmar, a system for guaranteed capture, indexing, and storage of ultra-high-bandwidth data streams. Our results show that Valmar performs at nearly full disk bandwidth, up to several orders of magnitude faster than flat file and database systems, works well with both small and large data elements, and allows concurrent read and search access without compromising data capture guarantees.
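
The "TiVo-like" requirement, keeping data that arrived before an event was detected, can be illustrated with a small in-memory ring buffer. The sketch below is not Valmar's implementation, which the abstract does not describe; the class `PreEventBuffer`, its parameters `capacity` and `pre_event`, and the `is_event` predicate are all hypothetical names chosen for illustration, and a real system would work against disk at source rates rather than a Python `deque`.

```python
from collections import deque


class PreEventBuffer:
    """Hypothetical fixed-capacity ring buffer that preserves pre-event data."""

    def __init__(self, capacity, pre_event):
        # deque(maxlen=...) silently discards the oldest element when full,
        # which models "most of the data can be safely discarded".
        self.buffer = deque(maxlen=capacity)
        self.pre_event = pre_event  # how many earlier elements to keep per event
        self.preserved = []         # windows saved after event detection

    def capture(self, element, is_event):
        """Record one element; if it is an event, save it with its history."""
        self.buffer.append(element)
        if is_event(element):
            # Keep the triggering element plus up to pre_event predecessors.
            window = list(self.buffer)[-(self.pre_event + 1):]
            self.preserved.append(window)


def run_demo():
    """Stream the integers 0..19 and treat the value 10 as the event."""
    buf = PreEventBuffer(capacity=8, pre_event=3)
    for value in range(20):
        buf.capture(value, is_event=lambda v: v == 10)
    return buf.preserved
```

In this toy run the buffer retains only the eight most recent elements, yet the window saved for the event still contains the three elements that preceded it. The amount of history that can be recovered is bounded by `capacity`, which is the design trade-off any such retention scheme has to make.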
