Shape sensitive geometric monitoring

2008

Abstract

An important problem in distributed, dynamic databases is to continuously monitor the value of a function defined on the nodes and to check that it satisfies a threshold constraint. We introduce a monitoring method, based on a geometric interpretation of the problem, that makes it possible to define local constraints at the nodes. As long as none of these constraints is violated, the value of the function is guaranteed not to have crossed the threshold. We generalize previous work on geometric monitoring and solve two problems that seriously hampered its performance. First, whereas the constraints used so far depend only on the current values of the local data, the new constraints also incorporate the data's temporal behavior. Second, the new constraints are tailored to the geometric properties of the specific monitored function. In addition, we extend the concept of safe zones for the monitoring problem and show that previous work on geometric monitoring is a special case of the proposed extension. Exp...
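To make the idea of a local constraint concrete, the following is a minimal sketch of the sphere-style check used in earlier geometric monitoring, which the abstract says this work generalizes. It does not implement the shape-sensitive constraints or safe zones proposed here. It assumes the monitored function is the Euclidean norm of the average of the nodes' local vectors, so that the function's range over a ball has a closed form; all function and parameter names are hypothetical.

```python
import math

def drift_vector(estimate, last_sent, current):
    """Drift vector: the shared estimate shifted by this node's local change."""
    return [e + (c - s) for e, s, c in zip(estimate, last_sent, current)]

def ball_from(estimate, drift):
    """Ball whose diameter is the segment from the estimate to the drift vector."""
    center = [(e + u) / 2 for e, u in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    return center, radius

def norm_range_on_ball(center, radius):
    """Exact minimum and maximum of f(x) = ||x|| over the ball (assumed f)."""
    c = math.hypot(*center)
    return max(0.0, c - radius), c + radius

def local_constraint_holds(estimate, last_sent, current, threshold):
    """True if f stays on the estimate's side of the threshold over the whole ball."""
    drift = drift_vector(estimate, last_sent, current)
    center, radius = ball_from(estimate, drift)
    lo, hi = norm_range_on_ball(center, radius)
    if math.hypot(*estimate) <= threshold:
        return hi <= threshold
    return lo > threshold
```

In this sketch, `estimate` is the average of the vectors the nodes last reported. The average of the drift vectors equals the current global average, and the balls together cover the convex hull of the estimate and the drift vectors. So if every node's check returns `True`, the global value is on the same side of the threshold as the estimate, and no communication is needed. A node whose check fails would trigger a synchronization.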
