Storybase: Towards Building a Knowledge Base for News Events
Abstract
To better organize and understand online news information, we propose Storybase, a knowledge base for news events that builds upon Wikipedia current events and daily Web news. It first constructs stories and their timelines based on Wikipedia current events and then detects and links daily news to enrich those Wikipedia stories with more comprehensive events. We encode events and develop efficient event clustering and chaining techniques in an event space. We demonstrate Storybase with a news events search engine that helps find historical and ongoing news stories and inspect their dynamic timelines.