
EDeR: A Dataset for Exploring Dependency Relations Between Events

2023, arXiv (Cornell University)

https://doi.org/10.48550/ARXIV.2304.01612

Abstract

Relation extraction is a central task in natural language processing (NLP) and information retrieval (IR) research. We argue that an important type of relation not explored in NLP or IR research to date is that of an event being an argument, required or optional, of another event. We introduce the human-annotated Event Dependency Relation dataset (EDeR), which provides this dependency relation. The annotation is done on a sample of documents from the OntoNotes dataset, which has the added benefit that it integrates with existing, orthogonal annotations of that dataset. We investigate baseline approaches for predicting the event dependency relation, the best of which achieves an accuracy of 82.61% for binary argument/non-argument classification. We show that recognizing this relation leads to more accurate event extraction (semantic role labelling) and can improve downstream tasks that depend on it, such as co-reference resolution. Furthermore, we demonstrate that the three-way classification into required argument, optional argument, or non-argument is a more challenging task.
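To make the task formulation concrete, the following is a minimal sketch, not taken from the paper's code or data: it encodes event pairs with the three-way label scheme (required argument, optional argument, non-argument), collapses it into the binary argument/non-argument setting evaluated above, and computes accuracy against a trivial majority-class baseline. The example sentences and the `EventPair` structure are invented for illustration.

```python
from dataclasses import dataclass

# The three-way label scheme described in the abstract.
REQUIRED, OPTIONAL, NON_ARGUMENT = "required", "optional", "non-argument"

@dataclass
class EventPair:
    sentence: str
    governing_event: str   # trigger word of the governing event
    dependent_event: str   # trigger word of the candidate argument event
    label: str             # one of the three classes above

# Invented illustrative examples, not drawn from the actual EDeR data.
pairs = [
    EventPair("She tried to leave early.", "tried", "leave", REQUIRED),
    EventPair("He sang while cooking dinner.", "sang", "cooking", OPTIONAL),
    EventPair("It rained, and the match ended.", "rained", "ended", NON_ARGUMENT),
]

def to_binary(label: str) -> str:
    """Collapse the three-way scheme into argument / non-argument."""
    return "argument" if label in (REQUIRED, OPTIONAL) else "non-argument"

def accuracy(gold, predicted):
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

gold = [to_binary(p.label) for p in pairs]
pred = ["argument"] * len(pairs)  # trivial majority-class baseline
print(accuracy(gold, pred))       # 2 of 3 pairs correct
```

A real baseline would replace the majority-class predictor with, e.g., a fine-tuned pretrained sentence-pair classifier over the sentence and the two event triggers; the evaluation loop stays the same for both the binary and three-way settings.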
