Predicting MT Quality as a Function of the Source Language

2006

Abstract

This paper describes one phase of a large-scale machine translation (MT) quality assurance project. We explore a novel approach to discriminating MT-unsuitable source sentences by predicting the expected quality of the output. The resources required include a set of source/MT sentence pairs, human judgments on the output, a source parser, and an MT system. We extract a number of syntactic, semantic, and lexical features from the source sentences only and train a classifier that we call the "
