Benchmarking of Statistical Dependency Parsers for French

2010

Abstract

We compare the performance of three statistical parsing architectures on the problem of deriving typed dependency structures for French. The architectures are based on PCFGs with latent variables, graph-based dependency parsing and transition-based dependency parsing, respectively. We also study the influence of three types of lexical information: lemmas, morphological features, and word clusters. The results show that all three systems achieve competitive performance, with a best labeled attachment score over 88%. All three parsers benefit from the use of automatically derived lemmas, while morphological features seem to be less important. Word clusters have a positive effect primarily on the latent variable parser.
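The labeled attachment score (LAS) quoted above is the percentage of tokens that receive both the correct head and the correct dependency label. The sketch below is a minimal illustration of that metric, not the evaluation script used in the paper. It assumes every token is scored (punctuation handling differs between evaluation setups), and the function name `attachment_scores` and the labels in the usage note are hypothetical.

```python
from typing import List, Tuple

# One token's analysis: (head index, dependency label).
# By assumption, head index 0 denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[List[Arc]], pred: List[List[Arc]]) -> Tuple[float, float]:
    """Return (LAS, UAS) as percentages computed over all tokens.

    Assumption: every token counts, including punctuation.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora must have the same number of sentences")
    total = labeled = unlabeled = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("gold and predicted sentences must have the same length")
        for (g_head, g_label), (p_head, p_label) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                unlabeled += 1          # correct head: counts towards UAS
                if g_label == p_label:
                    labeled += 1        # correct head and label: counts towards LAS
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * labeled / total, 100.0 * unlabeled / total
```

For a three-token sentence in which the parser finds every head but mislabels one arc, for example gold `[(2, "det"), (3, "suj"), (0, "root")]` against predicted `[(2, "det"), (3, "obj"), (0, "root")]`, two of three tokens are fully correct, so LAS is two thirds while the unlabeled score (UAS) is 100%.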
