Quality Estimation Research Papers

Chapter 10: SCATE Taxonomy and Corpus of Machine Translation Errors

2025, Trends in E-Tools and Resources for Translators and Interpreters

Quality estimation (QE) and error analysis of machine translation (MT) output remain active areas in Natural Language Processing (NLP) research. Many recent efforts have focused on machine learning (ML) systems to estimate the MT quality,...

The Matecat Tool

by Mauro Cettolo

2025

We present a new web-based CAT tool providing translators with a professional work environment, integrating translation memories, terminology bases, concordancers, and machine translation. The tool is completely developed as open source...

The SDL Language Weaver Systems in the WMT12 Quality Estimation Shared Task

by Nguyễn Hữu Bách

2024

We present in this paper the system submissions of the SDL Language Weaver team in the WMT 2012 Quality Estimation shared-task. Our MT quality-prediction systems use machine learning techniques (M5P regression-tree and SVM-regression...

Table 1: Performance of the Baseline Features using M5P and SVR models on the test set

Table 2: Performance of the Moses-based Features with an M5P model on the test set.

... them are inspired from word-based confidence estimation, in which the alignment consensus between the source words and target-translation words are informative indicators for gauging the quality of a translation hypothesis. The one-to-one (o2o) word alignments are obtained from the decoding logs of Moses. We use the TreeTagger to obtain Spanish POS tags and a maximum-entropy POS tagger for English. Since Spanish and English POS tag sets are different, we normalize their fine-grained POS tag sets into a coarser tag set by mapping the original POS tags into more general linguistic concepts such as noun, verb, adjective, adverb, preposition, determiner, number, and punctuation.

Table 5: SVR-model performance for dev and test sets.

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

by Constantin Orasan

2024

Current Machine Translation (MT) systems achieve very good results on a growing variety of language pairs and datasets. However, they are known to produce fluent translation outputs that can contain important meaning errors, thus...

BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task

by Lisa Yankovskaya

2024, Empirical Methods in Natural Language Processing

This paper presents our submission to the WMT2020 Shared Task on Quality Estimation (QE). We participate in Task 1 and Task 2 focusing on sentence-level prediction. We explore (a) a black-box approach to QE based on pre-trained...

UPC-CORE: What Can Machine Translation Evaluation Metrics and Wikipedia Do for Estimating Semantic Textual Similarity?

by Jordi Turmo

2024

In this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation, originally intended to evaluate automatic against...

DiscoTK: Using Discourse Structure for Machine Translation Evaluation

by Preslav Nakov

2024, arXiv (Cornell University)

We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five...

Figure 2: Five different representations of the discourse tree (DT) for the sentence “The new organisational structure will also allow us to enter the market with a joint offer of advertising products, to better link the creation of content for all the titles published and, last but not least, to continue to streamline significantly the business management of the company,” added Cermak. Note that to avoid visual clutter, (b)-(e) show alternative representations only for the highlighted subtree in (a).

Table 1: Evaluation results on WMT12 and WMT13 datasets at segment and system level for the main combined DiscoTK measures proposed in this paper.

Spoken Language Translation Graphs Re-decoding using Automatic Quality Assessment

by Ngọc Thành Lê

2024, HAL (Le Centre pour la Communication Scientifique Directe)

This paper investigates how automatic quality assessment of spoken language translation (SLT) can help re-decoding SLT output graphs and improving the overall speech translation performance. Using robust word confidence measures (from...

Anchor points for bilingual extraction from small specialized comparable corpora

by Emmanuel Morin

2024

Research on bilingual lexicon extraction from comparable corpora has yielded promising results for very large corpora using a so-called direct alignment method. The change of scale induced by...

Modelling the Constraints of Spatial Environment in Fauna Movement Simulations: Comparison of a Boundaries Accurate Function and a Cost Function

by Anne Ruas

2024, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences

Landscape influences fauna movement at different levels, from habitat selection to choices of movements' direction. Our goal is to provide a development frame in order to test simulation functions for animal's movement. We describe our...

TransQuest at WMT2020: Sentence-Level Direct Assessment

by Ruslan Mitkov

2024

This paper presents the team TransQuest's participation in Sentence-Level Direct Assessment shared task in WMT 2020. We introduce a simple QE framework based on cross-lingual transformers, and we use it to implement and evaluate two...

An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers

by Ruslan Mitkov

2024, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to...

Findings of the WMT 2020 Shared Task on Quality Estimation

by Erick Fonseca

2023

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels. This edition included new...

MT-ComparEval: Graphical evaluation interface for Machine Translation development

by Eleftherios Avramidis

2023, The Prague Bulletin of Mathematical Linguistics

The tool described in this article has been designed to help MT developers by implementing a web-based graphical user interface that allows to systematically compare and evaluate various MT engines/experiments using comparative analysis...

QoS Management in Real-Time Spatial Big Data Using Feedback Control Scheduling

by Sana Hamdi

2023, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences

Geographic Information System (GIS) is a computer system designed to capture, store, manipulate, analyze, manage, and present all types of spatial data. Spatial data, whether captured through remote sensors or large scale simulations has...

Intelligent Hybrid Man-Machine Translation Quality Estimation

by Mona Habib

2023, arXiv (Cornell University)

Inferring evaluation scores based on human judgments is invaluable compared to using current evaluation metrics which are not suitable for real-time applications e.g. post-editing. However, these judgments are much more expensive to...

Evaluating MT systems with BEER

by Khalil Sima'an

2023, The Prague Bulletin of Mathematical Linguistics

We present BEER, an open source implementation of a machine translation evaluation metric. BEER is a metric trained for high correlation with human ranking by using learning-to-rank training methods. For evaluation of lexical accuracy it...

Distortion Estimation in Compressed Music Using Only Audio Fingerprints

by Jan Doets

2023, IEEE Transactions on Audio, Speech, and Language Processing

An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression...

Assessing shop completeness in OpenStreetMap for two federal states in Germany

by Sven Lautenbach

2023, AGILE: GIScience Series

The completeness of the number of OpenStreetMap (OSM) retail stores was estimated for two federal states of Germany at district level. An intrinsic measurement was applied that fits saturation models on the cumulative curve of the number...

A no-reference bitstream-based perceptual model for video quality estimation of videos affected by coding artifacts and packet losses

by Muhammad Taha Shahid

2023, Human Vision and Electronic Imaging XX

This is an author produced version of a conference paper. The paper has been peer-reviewed but may not include the final publisher proof-corrections or pagination of the proceedings.

Tencent submission for WMT20 Quality Estimation Shared Task

by Xinjie Wen

2023

This paper presents Tencent’s submission to the WMT20 Quality Estimation (QE) Shared Task: Sentence-Level Post-editing Effort for English-Chinese in Task 2. Our system ensembles two architectures, XLM-based and Transformer-based...

QUALES: Estimación Automática de Calidad de Traducción Mediante Aprendizaje Automático Supervisado y No-Supervisado

by Igor Ellakuria

2023, Proces. del Leng. Natural

Automatic quality estimation (EAC) of machine translation consists of measuring the quality of translations without access to human references, typically by means of machine learning methods. A good EAC system can...

Quality estimation for translation selection

by Kashif ullah Shah ktk

2023

We describe experiments on quality estimation to select the best translation among multiple options for a given source sentence. We consider a realistic and challenging setting where the translation systems used are unknown, and no...

An efficient and user-friendly tool for machine translation quality estimation

by Kashif ullah Shah ktk

2023

We present a new version of QUEST, an open source framework for machine translation quality estimation, which brings a number of improvements: (i) it provides a Web interface and functionalities such that non-expert users, e.g. translators...

An investigation on the effectiveness of features for translation quality estimation

by Kashif ullah Shah ktk

2023

We describe a systematic analysis on the effectiveness of features commonly exploited for the problem of predicting machine translation quality. Using a feature selection technique based on Gaussian Processes, we identify small subsets of...

An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers

by Constantin Orasan

2023, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to...

Semantic Textual Similarity in Quality Estimation

by Constantin Orasan

2023

Quality Estimation (QE) predicts the quality of machine translation output without the need for a reference translation. This quality can be defined differently based on the task at hand. In an attempt to focus further on the adequacy and...

New classification quality estimators for analysis of documentary information: Application to patent analysis and web mapping

by Jean-Charles Lamirel

2023, Scientometrics

The information analysis process includes a cluster analysis or classification step associated with an expert validation of the results. In this paper, we propose new measures of Recall/Precision for estimating the quality of cluster...

Machine Translation Evaluation Resources and Methods: A Survey

by Lifeng Han

2023, Cornell University - arXiv

We introduce a Machine Translation (MT) evaluation survey that contains both manual and automatic evaluation methodologies. The traditional human evaluation criteria mainly include intelligibility, fidelity, fluency, adequacy,...

Figure 1: Human Evaluation Methods

where c is the total length of candidate translation corpus, and r refers to the sum of effective reference sentence length in the corpus. If there are multi-references for each candidate sentence, then the nearest length as compared to the candidate sentence is selected as the effective one. In the BLEU metric, the n-gram precision weight λ_n is usually selected as uniform weight. However, the 4-gram precision value is usually very low or even zero when the test corpus is small. To weight more heavily those n-grams that are more informative, (Doddington, 2002) proposes the NIST metric with the information weight added.

Quality Estimation for Machine Translation Using the Joint Method of Evaluation Criteria and Statistical Modeling

by Lifeng Han

2023

This paper is to introduce our participation in the WMT13 shared tasks on Quality Estimation for machine translation without using reference translations. We submitted the results for Task 1.1 (sentence-level quality estimation), Task 1.2...

D6.3: Improved Corpus-based Approaches

by Lianet Sepúlveda Torres

2023

A no-reference bitstream-based perceptual model for video quality estimation of videos affected by coding artifacts and packet losses

by Muhammad Zohaib Shahid

2023, Human Vision and Electronic Imaging XX

This is an author produced version of a conference paper. The paper has been peer-reviewed but may not include the final publisher proof-corrections or pagination of the proceedings.

Recent research activities in videosurveillance at UNIFI::MICC

by F. Dini

2023, dmi.unisa.it

Recently, image quality validation has been largely investigated to increase recognition rates and to support decisions of authentication systems. This may be useful to alarm a video surveillance application for a particular intrusion...

Sentence-level quality estimation by predicting HTER as a multi-component metric

by Eleftherios Avramidis

2022, Proceedings of the Second Conference on Machine Translation

This submission investigates alternative machine learning models for predicting the HTER score on the sentence level. Instead of directly predicting the HTER score, we suggest a model that jointly predicts the amount of the 4 distinct...

Selecting Feature Sets for Comparative and Time-Oriented Quality Estimation of Machine Translation Output

by Eleftherios Avramidis

2022

This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with...

Comparative Quality Estimation for Machine Translation: Observations on Machine Learning and Features

by Eleftherios Avramidis

2022, The Prague Bulletin of Mathematical Linguistics

A deeper analysis on Comparative Quality Estimation is presented by extending the state-of-the-art methods with adequacy and grammatical features from other Quality Estimation tasks. The previously used linear method, unable to cope with...

Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output

by Eleftherios Avramidis

2022

This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with...

Fine-grained evaluation of Quality Estimation for Machine translation based on a linguistically motivated Test Suite

by Eleftherios Avramidis

2022

We present an alternative method of evaluating Quality Estimation systems, which is based on a linguistically-motivated Test Suite. We create a test-set consisting of 14 linguistic error categories and we gather for each of them a set of...

Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods

by Lifeng Han

2022, ArXiv

To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality. From the perspectives of accuracy, reliability, repeatability and cost, translation...

... metrics as “simple n-gram word surface matching”. Further developed metrics also take linguistic features into account such as syntax and semantics, including POS, sentence structure, textual entailment, paraphrase, synonyms, named entities, multi-word expressions (MWEs), semantic roles and language models. We classify these metrics that utilize the linguistic features as “Deeper Linguistic Features (aware)”. This classification is only for easier understanding and better organization of the content. It is not easy to separate these two categories clearly since sometimes they merge with each other. For instance, some metrics from the first category might also use certain linguistic features. Furthermore, we will introduce some recent models that apply deep learning into the TQA framework, as in Fig. 2. Due to space limitations, we present MT quality estimation (QE) task which does not rely on reference translations during the automated computing procedure in the appendices.

UPF-Cobalt Submission to WMT15 Metrics Task

by Anton Malinovskiy

2022, Proceedings of the Tenth Workshop on Statistical Machine Translation

An important limitation of automatic evaluation metrics is that, when comparing Machine Translation (MT) to a human reference, they are often unable to discriminate between acceptable variation and the differences that are indicative of...

The SDL Language Weaver Systems in the WMT12 Quality Estimation Shared Task

by Nguyen Dinh Nhat Bach

2022

We present in this paper the system submissions of the SDL Language Weaver team in the WMT 2012 Quality Estimation shared-task. Our MT quality-prediction systems use machine learning techniques (M5P regression-tree and SVM-regression...

WORD2VEC vs Dbnary Ou Comment Re Concilier Representations Distribuees et Reseaux Lexico Semantiques Le Cas De L Evaluation en Traduction Automatique

by Hervé Blanchon

2022, TALN 2016

This article presents an approach combining lexico-semantic networks and distributed word representations, applied to the evaluation of machine translation. The study is carried out by enriching a metric that is well...

QoS Management in Real-Time Spatial Big Data Using Feedback Control Scheduling

by Sami Faiz

2022, ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences

Geographic Information System (GIS) is a computer system designed to capture, store, manipulate, analyze, manage, and present all types of spatial data. Spatial data, whether captured through remote sensors or large scale simulations has...

Quality Estimation for English-Hungarian Machine Translation Systems with Optimized Semantic Features

by Zijian Győző Yang

2022, Computational Linguistics and Intelligent Text Processing

Quality estimation at run-time for machine translation systems is an important task. The standard automatic evaluation methods that use reference translations cannot evaluate MT results in real-time and the correlation between the results...

CobaltF: A Fluent Metric for MT Evaluation

by Anton Malinovskiy

2022, Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

The vast majority of Machine Translation (MT) evaluation approaches are based on the idea that the closer the MT output is to a human reference translation, the higher its quality. While translation quality has two important aspects,...

Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems

by Kentaro Inui

2022

In grammatical error correction (GEC), automatically evaluating system outputs requires gold-standard references, which must be created manually and thus tend to be both expensive and limited in coverage. To address this problem, a...