On-line new event detection and tracking

1998, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval

Abstract

We define and describe the related problems of new event detection and event tracking within a stream of broadcast news stories. We focus on a strict on-line setting, i.e., the system must make decisions about one story before looking at any subsequent stories. Our approach to detection uses a single-pass clustering algorithm and a novel thresholding model that incorporates the properties of events as a major component. Our approach to tracking is similar to typical information filtering methods. We discuss the value of "surprising" features that have unusual occurrence characteristics, and briefly explore on-line adaptive filtering to handle evolving events in the news. New event detection and event tracking are part of the Topic Detection and Tracking (TDT) initiative.
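
The on-line constraint and the single-pass clustering idea can be illustrated with a minimal sketch. It is not the paper's method: it assumes whitespace tokenization, raw term counts, cosine similarity, and a fixed threshold, whereas the abstract describes a thresholding model that incorporates properties of events. The names `cosine`, `detect_new_events`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_events(stories, threshold=0.2):
    """Single pass: decide on each story before looking at any later story.

    Assumption: a fixed similarity threshold stands in for the paper's
    event-aware thresholding model, which is not reproduced here.
    """
    clusters = []   # one accumulated term-count Counter per event so far
    decisions = []  # (is_new_event, cluster_index) in arrival order
    for text in stories:
        vector = Counter(text.lower().split())
        scores = [cosine(vector, cluster) for cluster in clusters]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] < threshold:
            # Nothing seen so far is similar enough: flag a new event.
            clusters.append(vector)
            decisions.append((True, len(clusters) - 1))
        else:
            # Similar to an existing event: fold the story into that cluster.
            clusters[best].update(vector)
            decisions.append((False, best))
    return decisions
```

Each decision depends only on stories that arrived earlier, which is the strict on-line property the abstract describes. The choice of threshold trades missed new events against false alarms.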
