A Corpus of Machine Translation Errors Extracted from Translation Students Exercises

Bibliographic Details
Main Authors: Wisniewski, Guillaume; Kübler, Natalie; Yvon, François
Other Authors: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris-Sud - Paris 11 (UP11)-Sorbonne Université - UFR d'Ingénierie (UFR 919), Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Saclay (COmUE), Centre de Linguistique Inter-langues, de Lexicologie, de Linguistique Anglaise et de Corpus (CLILLAC-ARP (EA_3967)), Université Paris Diderot - Paris 7 (UPD7), ELRA, Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Format: Conference Object
Language: English
Published: HAL CCSD 2014
Subjects:
Online Access: https://u-paris.hal.science/hal-01134899
https://u-paris.hal.science/hal-01134899/document
https://u-paris.hal.science/hal-01134899/file/1115_Paper.pdf
Description
Summary: In this paper, we present a freely available corpus of automatic translations accompanied by post-edited versions, annotated with labels identifying the different kinds of errors made by the MT system. These data have been extracted from translation students' exercises that have been corrected by a senior professor. This corpus can be useful for training quality estimation tools and for analyzing the types of errors made by the MT system.