University of Sheffield (Natural Language Processing group)
Machine translation (MT) quality has improved dramatically with the shift to deep learning. However, the data used to train an MT system remains crucial. The risk of MT errors is especially high when a generic MT engine is applied to domain-specific text.
The APE-Quest project develops a Quality Gate, i.e. an environment with a quality estimation (QE) component that makes it possible to obtain acceptable translation quality in less time and at a lower cost than a traditional translation workflow (using human translation only).
The project focuses on three language pairs (English into French, Dutch and Portuguese) and on domain-specific post-edited data (texts relating to the legal domain, procurement, and online dispute resolution).
Using the Quality Gate, tests were conducted with human evaluators to determine the relation between translation quality, time, and cost for various QE score thresholds. These thresholds determine the tier to which the MT output of a sentence is assigned (i.e. whether the output is considered acceptable or is sent for post-editing).
The results of these tests show that the Quality Gate can yield substantial cost and time savings without strongly compromising translation quality.
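The threshold-based tier assignment described above can be illustrated with a minimal sketch. It is not the project's implementation: the names `Segment`, `assign_tier` and `route`, the single threshold, and the convention that a higher QE score means better predicted quality are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    mt_output: str
    qe_score: float  # assumption: higher score = better predicted quality


def assign_tier(segment: Segment, threshold: float) -> str:
    """Return the tier for one sentence given a QE score threshold."""
    # Hypothetical two-tier setup: output at or above the threshold is
    # accepted as is, everything else is sent for post-editing.
    return "acceptable" if segment.qe_score >= threshold else "post-edit"


def route(segments: list[Segment], threshold: float) -> dict[str, list[Segment]]:
    """Group sentences by tier so each group can follow its own workflow."""
    tiers: dict[str, list[Segment]] = {"acceptable": [], "post-edit": []}
    for segment in segments:
        tiers[assign_tier(segment, threshold)].append(segment)
    return tiers
```

Under these assumptions, raising the threshold sends more sentences to post-editing, which is the kind of quality, time and cost trade-off the tests examine.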
The dataset of human post-edits created in the project is publicly available. More information is provided at https://apequest.wordpress.com.
Validating Quality Estimation in a Computer-Aided Translation Workflow: Speed, Cost and Quality Trade-off (2021)
Authors: Fernando Alva-Manchego, Lucia Specia, Sara Szoc, Tom Vanallemeersch & Heidi Depraetere
A Post-Editing Dataset in the Legal Domain: Do we Underestimate Neural Machine Translation Quality? (2020)
Authors: Julia Ive, Lucia Specia, Sara Szoc, Tom Vanallemeersch, Joachim Van den Bogaert, Eduardo Farah, Christine Maroti, Artur Ventura & Maxim Khalilov