

Fast LR parsing using rich (Tree Adjoining) Grammars

2002, Proceedings of the ACL-02 conference on Empirical methods in natural language processing - EMNLP '02

https://doi.org/10.3115/1118693.1118707

Abstract

We describe an LR parser of parts-of-speech (and punctuation labels) for Tree Adjoining Grammars (TAGs) that resolves table conflicts greedily, with a limited amount of backtracking. We evaluate the parser on the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing benefits from the use of rich grammars.

Footnote 1: Unlike the approach of Wright and Wrigley (1991), which tries to transpose PCFG probabilities to LR tables and faces difficulties that, to the best of our knowledge, have not yet been satisfactorily solved (cf. also Ng and Tomita, 1991; Wright et al., 1991; Abney et al., 1999).
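The idea of resolving parsing conflicts greedily while allowing only a limited amount of backtracking can be illustrated with a small sketch. This is not the parser described in the paper: it assumes a hypothetical toy context-free grammar over part-of-speech tags rather than a TAG, it uses a plain shift-reduce loop instead of a precompiled LR table, and the preference order (reductions before shift, in rule order) and the `max_backtracks` parameter are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# Hypothetical toy grammar over part-of-speech tags (a CFG, not a TAG).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("DT", "NN")),
    ("NP", ("NP", "PP")),
    ("VP", ("VB", "NP")),
    ("VP", ("VP", "PP")),
    ("PP", ("IN", "NP")),
]

Action = Tuple[str, str, int]  # (kind, symbol, number of stack items to pop)


def candidate_actions(stack: Tuple[str, ...], buffer: Tuple[str, ...]) -> List[Action]:
    """List applicable actions in an assumed greedy order: reductions, then shift."""
    actions: List[Action] = []
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs:
            actions.append(("reduce", lhs, len(rhs)))
    if buffer:
        actions.append(("shift", buffer[0], 0))
    return actions


def parse(tags: List[str], max_backtracks: int = 10) -> Optional[List[str]]:
    """Return the action trace of the first parse found, or None on failure.

    The first candidate action is always free; trying any later alternative
    consumes one unit of the backtracking budget.
    """
    budget = [max_backtracks]

    def step(stack, buffer, trace):
        if stack == ("S",) and not buffer:
            return trace
        for i, (kind, sym, n) in enumerate(candidate_actions(stack, buffer)):
            if i > 0:
                if budget[0] == 0:
                    return None  # budget exhausted: give up on this branch
                budget[0] -= 1
            if kind == "shift":
                result = step(stack + (sym,), buffer[1:], trace + [f"shift {sym}"])
            else:
                result = step(stack[:-n] + (sym,), buffer, trace + [f"reduce {sym}"])
            if result is not None:
                return result
        return None

    return step((), tuple(tags), [])


if __name__ == "__main__":
    print(parse("DT NN VB DT NN".split(), max_backtracks=0))
    print(parse("DT NN VB DT NN IN DT NN".split(), max_backtracks=1))
```

In this toy setting the first sentence is parsed without any backtracking, while the second (with a trailing prepositional phrase) needs the parser to undo one greedy choice, so it fails when the budget is zero and succeeds when it is one.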

References (32)

  1. Steven Abney, David McAllester, and Fernando Pereira. 1999. Relating probabilistic grammar and automata. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, College Park, MD, USA.
  2. Ezra Black, Steven Abney, C. Gdaniec, Ralph Grishman, P. Harrison, Don Hindle, R. Ingria, Fred Jelinek, Judith Klavans, Mark Liberman, Mitchell Marcus, Salim Roukos, Beatrice Santorini, and T. Strzalkowski. 1991. A procedure for quantitatively comparing the syntactic coverage of English grammars. In Proceedings of the DARPA Speech and Natural Language Workshop, San Mateo, CA, USA.
  3. Ted Briscoe and John Carroll. 1993. Generalized probabilistic LR parsing of natural language (corpora) with unification-based grammars. Computational Linguistics, 19(1):25-59.
  4. Ted Briscoe and John Carroll. 1995. Developing and evaluating a probabilistic LR parser of part-of-speech and punctuation labels. In Proceedings of the 4th International Workshop on Parsing Technologies (IWPT-95), pages 48-58, Prague/Karlovy Vary, Czech Republic.
  5. John Carroll and Ted Briscoe. 1992. Probabilistic normalisation and unpacking of packed parse forests for unification-based grammars. In Proceedings of the AAAI Fall Symposium on Probabilistic Approaches to Natural Language, Cambridge, MA, USA.
  6. John Carroll and Ted Briscoe. 1996. Apportioning development effort in a probabilistic LR parsing system through evaluation. In Proceedings of the Conference on Empirical Methods in NLP, pages 92-100, Philadelphia, PA, USA.
  7. David Chiang. 2000. Statistical parsing with an automatically-extracted Tree Adjoining Grammar. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, Hong Kong, China.
  8. Michael Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pages 16-23, Madrid, Spain.
  9. Kentaro Inui, Virach Sornlertlamvanich, Hozumi Tanaka, and Takenobu Tokunaga. 1997. A new formalization of probabilistic GLR parsing. In Proceedings of the 5th International Workshop on Parsing Technologies (IWPT-97), Cambridge, MA, USA.
  10. Aravind K. Joshi and Yves Schabes. 1997. Tree-Adjoining Grammars. In Handbook of Formal Languages, volume 3, pages 69-123. Springer-Verlag, Berlin.
  11. Aravind K. Joshi, L. Levy, and M. Takahashi. 1975. Tree Adjunct Grammars. Journal of Computer and System Sciences, 10(1).
  12. Alexandra Kinyon. 1997. Un algorithme d'analyse LR(0) pour les grammaires d'arbres adjoints lexicalisées. In D. Genthial, editor, Quatrième conférence annuelle sur Le Traitement Automatique du Langage Naturel, Actes, pages 93-102, Grenoble, France.
  13. Donald E. Knuth. 1965. On the translation of languages from left to right. Information and Control, 8(6):607-639.
  14. Bernard Lang. 1974. Deterministic techniques for efficient non-deterministic parsers. In Automata, Languages and Programming, 2nd Colloquium, volume 14 of Lecture Notes in Computer Science, pages 255-269, Saarbrücken. Springer-Verlag, Berlin.
  15. Mitchell Marcus, Grace Kim, Mary Ann Marcinkiewicz, Robert MacIntyre, Ann Bies, Mark Ferguson, Karen Katz, and Britta Schasberger. 1994. The Penn Treebank: Annotating predicate argument structure. In Proceedings of the 1994 Human Language Technology Workshop.
  16. Paola Merlo. 1996. Parsing with Principles and Classes of Information. Kluwer Academic Publishers, Boston, MA, USA.
  17. Mark-Jan Nederhof. 1998. An alternative LR algorithm for TAGs. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 16th International Conference on Computational Linguistics, Montreal, Canada.
  18. See-Kiong Ng and Masaru Tomita. 1991. Probabilistic LR parsing for Generalized context-free grammars. In Proceedings of the Second International Workshop on Parsing Technologies (IWPT-91), Cancun, Mexico.
  19. Fernando Pereira. 1985. A new characterization of attachment preferences. In David R. Dowty, Lauri Karttunen, and Arnold M. Zwicky, editors, Natural Language Parsing: Psychological, computational, and theoretical perspectives, pages 307-319. Cambridge University Press, New York, NY, USA.
  20. Carlos A. Prolo. 2000. An efficient LR parser generator for Tree Adjoining Grammars. In Proceedings of the 6th International Workshop on Parsing Technologies (IWPT-2000), Trento, Italy.
  21. Carlos A. Prolo. 2002. LR parsing for Tree Adjoining Grammars and its application to corpus-based natural language parsing. Ph.D. thesis proposal, University of Pennsylvania.
  22. Tobias Ruland. 2000. A context-sensitive model for probabilistic LR parsing of spoken language with transformation-based postprocessing. In Proceedings of the 18th International Conference on Computational Linguistics (COLING'2000), pages 677-683, Saarbrücken, Germany.
  23. Y. Schabes and K. Vijay-Shanker. 1990. Deterministic left to right parsing of tree adjoining languages. In Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, pages 276-283, Pittsburgh, Pennsylvania, USA.
  24. Yves Schabes and Richard C. Waters. 1995. Tree Insertion Grammar: a cubic-time, parsable formalism that lexicalizes Context-Free Grammar without changing the trees produced. Computational Linguistics, 21(4):479-513.
  25. Stuart Shieber and Mark Johnson. 1993. Variations on incremental interpretation. Journal of Psycholinguistic Research, 22(2):287-318.
  26. Stuart M. Shieber. 1983. Sentence disambiguation by a Shift-Reduce parsing technique. In Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, pages 119-122, Cambridge, MA, USA.
  27. Khalil Sima'an. 2000. Tree-gram parsing: Lexical dependencies and structural relations. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, Hong Kong, China.
  28. Michael K. Tanenhaus and John C. Trueswell. 1995. Sentence comprehension. In Joanne L. Miller and Peter D. Eimas, editors, Speech, Language, and Communication, pages 217-262. Academic Press, San Diego, CA, USA.
  29. Masaru Tomita. 1985. Efficient Parsing for Natural Language. Kluwer Academic Publishers, Boston, MA, USA.
  30. H. Wright and E. N. Wrigley. 1991. GLR parsing with probability. In Masaru Tomita, editor, Generalized LR Parsing, pages 113-128. Kluwer Academic Publishers, Boston, MA, USA.
  31. Jerry Wright, Ave Wrigley, and Richard Sharman. 1991. Adaptive probabilistic Generalized LR parsing. In Proceedings of the Second International Workshop on Parsing Technologies (IWPT-91), Cancun, Mexico.
  32. Fei Xia. 2001. Investigating the Relationship between Grammars and Treebanks for Natural Languages. Ph.D. thesis, Department of Computer and Information Science, University of Pennsylvania.