
UOW: Semantically Informed Text Similarity

Abstract

The UOW submissions to the Semantic Textual Similarity task at SemEval-2012 use a supervised machine learning algorithm with features based on lexical, syntactic and semantic similarity metrics to predict the semantic equivalence between a pair of sentences. The lexical metrics are based on word overlap. A shallow syntactic metric is based on the overlap of base-phrase labels. The semantically informed metrics are based on the preservation of named entities, on the alignment of verb predicates, and on the overlap of argument roles using inexact matching. Our submissions outperformed the official baseline, with our best system ranked above average, but the contribution of the semantic metrics was not conclusive.
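As a rough illustration of what a word-overlap feature can look like, the sketch below computes a Jaccard score over lowercased, whitespace-separated tokens. The abstract does not specify the exact overlap formula or tokenization, so the function name `word_overlap`, the Jaccard normalization, and the tokenization are all assumptions made for illustration, not the authors' implementation.

```python
def word_overlap(sentence_a: str, sentence_b: str) -> float:
    """Jaccard overlap between the token sets of two sentences.

    Assumption: tokens are lowercased and split on whitespace; the
    original system may tokenize and normalize differently.
    """
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    # Two empty sentences are treated as identical by convention.
    if not tokens_a and not tokens_b:
        return 1.0
    shared = tokens_a & tokens_b
    combined = tokens_a | tokens_b
    return len(shared) / len(combined)


if __name__ == "__main__":
    print(word_overlap("a man is playing a guitar", "a man plays the guitar"))
```

A score like this would be one input feature among several; the supervised learner would then combine it with the syntactic and semantic features to predict a similarity value.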
