Hybrid data-driven models of machine translation

Declan Groves

2005, Machine Translation

Abstract

This paper presents an extended, harmonised account of our previous work on combining subsentential alignments from phrase-based statistical machine translation (SMT) and example-based MT (EBMT) systems to create novel hybrid data-driven systems capable of outperforming the baseline SMT and EBMT systems from which they were derived. In previous work, we demonstrated that while an EBMT system is capable of outperforming a phrase-based SMT (PBSMT) system constructed from freely available resources, a hybrid ‘example-based’ SMT system incorporating marker chunks and SMT subsentential alignments is capable of outperforming both baseline translation models for French–English translation. In this paper, we show that similar gains are to be had from constructing a hybrid ‘statistical’ EBMT system. Unlike the previous research, here we use the Europarl training and test sets, which are fast becoming the standard data in the field. On these data sets, while all hybrid ‘statistical’ EBMT variants still fall short of the quality achieved by the baseline PBSMT system, we show that adding the marker chunks to create a hybrid ‘example-based’ SMT system outperforms the two baseline systems from which it is derived. We provide further evidence in favour of hybrid systems by adding an SMT target-language model to the EBMT system, and demonstrate that this too has a positive effect on translation quality. We also show that many of the subsentential alignments derived from the Europarl corpus are created by either the PBSMT or the EBMT system, but not by both. In sum, despite the obvious convergence of the two paradigms, the crucial differences between SMT and EBMT contribute positively to the overall translation quality.
The central thesis of this paper is that any researcher who continues to develop an MT system using either of these approaches will benefit further from integrating the advantages of the other model; dogged adherence to one approach will lead to inferior systems being developed.

