Mining Partially Periodic Event Patterns with Unknown Periods

2001

https://doi.org/10.1109/ICDE.2001.914829

Abstract

Periodic behavior is common in real-world applications. However, in many cases, periodicities are partial in that they are present only intermittently. We study such intermittent patterns, which we refer to as p-patterns. The formulation of p-patterns takes into account imprecise time information (e.g., due to unsynchronized clocks in distributed environments), noisy data (e.g., due to extraneous events), and shifts in phase and/or periods. We structure mining for p-patterns as two sub-tasks: (1) finding the periods of p-patterns and (2) mining temporal associations. For (2), a level-wise algorithm is used. For (1), we develop a novel approach based on a chi-squared test and study its performance in the presence of noise. Further, we develop two algorithms for mining p-patterns that differ in the order in which these sub-tasks are performed: the period-first algorithm and the association-first algorithm. Our results show that the association-first algorithm has a higher tolerance to noise, whereas the period-first algorithm is more computationally efficient and provides flexibility in the specification of support levels. In addition, we apply the period-first algorithm to data collected from two production computer networks, a process that led to several actionable insights.

FAQs

What are the key challenges in mining partially periodic event patterns?

The paper identifies challenges including unknown periods, noise interference, and handling phase shifts, complicating p-pattern detection.

How do the proposed algorithms address noise in event patterns?

The association-first algorithm demonstrates higher tolerance to noise, outperforming the period-first algorithm as noise-to-signal ratios increase.

What is the significance of using the chi-squared test for detecting periods?

The chi-squared test enables dynamic threshold adjustments based on inter-arrival times, enhancing detection accuracy for various period lengths.
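
As a rough illustration of the idea, the sketch below tests each observed inter-arrival time against what random arrivals would produce. It is not the paper's implementation: it assumes a Poisson null model (exponentially distributed inter-arrival times), and the function name `candidate_periods`, the `tolerance` parameter, and the 3.84 threshold (the 95% critical value of a chi-squared distribution with one degree of freedom) are illustrative choices. Because the expected count is recomputed for every candidate value, the effective count threshold varies with the period being tested.

```python
import math
from bisect import bisect_left, bisect_right


def candidate_periods(timestamps, tolerance=0.5, threshold=3.84):
    """Return (tau, count, chi2) for inter-arrival values that occur
    significantly more often than a Poisson null model predicts.

    Assumptions (not from the paper): arrivals under the null hypothesis
    form a Poisson process, and only adjacent events are compared.
    """
    ts = sorted(timestamps)
    gaps = sorted(b - a for a, b in zip(ts, ts[1:]))
    n = len(gaps)
    span = ts[-1] - ts[0] if ts else 0
    if n == 0 or span <= 0:
        return []

    rate = n / span  # estimated arrival rate under the null model
    found = []
    for tau in sorted(set(gaps)):
        lo, hi = max(tau - tolerance, 0.0), tau + tolerance
        # Observed number of inter-arrivals within tau +/- tolerance.
        count = bisect_right(gaps, hi) - bisect_left(gaps, lo)
        # Probability that an exponential inter-arrival falls in [lo, hi].
        p = math.exp(-rate * lo) - math.exp(-rate * hi)
        expected = n * p
        variance = n * p * (1.0 - p)
        if variance <= 0:
            continue
        chi2 = (count - expected) ** 2 / variance
        # Keep only values that are over-represented, not under-represented.
        if count > expected and chi2 > threshold:
            found.append((tau, count, chi2))
    return found


# Hypothetical data: an event every 10 time units, plus four noise events.
events = list(range(0, 201, 10)) + [3, 47, 111, 158]
print([tau for tau, _, _ in candidate_periods(events)])  # [10]
```

Note that each noise event splits one true inter-arrival into two shorter ones, so a test on adjacent events alone loses evidence as noise grows.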

How do the findings relate to practical applications in network monitoring?

The algorithms yielded actionable insights in real network data, identifying potential security intrusions and configuration errors during evaluations.

What insights were gained regarding noise-to-signal ratios in pattern detection?

Successful detection rates significantly deteriorate when noise-to-signal ratios exceed 1, indicating challenges in discerning true periodicity amid noise.
