Europarl Direct Translationese Dataset

Bibliographic Details
Main Authors: Amponsah-Kaakyire, Kwabena, Pylypenko, Daria, España-Bonet, Cristina, van Genabith, Josef
Format: Other/Unknown Material
Language: unknown
Published: Zenodo 2021
Online Access: https://doi.org/10.5281/zenodo.5550431
Description
Summary: Monolingual and multilingual translationese corpora as described in the paper "Do not Rely on Relay Translations: Multilingual Parallel Direct Europarl" by Kwabena Amponsah-Kaakyire, Daria Pylypenko, Cristina España-Bonet, and Josef van Genabith. In: 23rd Nordic Conference on Computational Linguistics, Workshop on Modelling Translation: Translatology in the Digital Age (MoTra-2021), May 31-June 2, Virtual, Iceland, pages 1-7. Linköping Electronic Conference Proceedings, Association for Computational Linguistics, 5/2021.

Single-source monolingual datasets:
mono_de_en: text in DE with DE originals and translations from EN
mono_de_es: text in DE with DE originals and translations from ES
mono_en_de: text in EN with EN originals and translations from DE
mono_en_es: text in EN with EN originals and translations from ES
mono_es_de: text in ES with ES originals and translations from DE
mono_es_en: text in ES with ES originals and translations from EN

Multisource monolingual datasets:
mono_de_multisource: text in DE with DE originals and translations from EN, ES
mono_en_multisource: text in EN with EN originals and translations from DE, ES
mono_es_multisource: text in ES with ES originals and translations from DE, EN

Multilingual datasets:
multi3: texts in DE, EN, ES with originals and translations from DE, EN, ES
multi8: texts in DE, EL, EN, ES, FR, IT, NL, PT with originals and translations from DE, EL, EN, ES, FR, IT, NL, PT

This project is funded by the German Research Foundation (Deutsche Forschungsgemeinschaft) under grant SFB 1102: Information Density and Linguistic Encoding.
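
The dataset names listed above follow a regular pattern (mono_<text language>_<source language>, mono_<text language>_multisource, multi3, multi8), so the languages involved can be recovered from the name alone. The following Python sketch illustrates that reading of the names. It is not part of the dataset release: the names DatasetInfo and describe are hypothetical, and the sketch makes no assumption about the file layout inside the Zenodo archive.

```python
from typing import NamedTuple, Tuple

# Languages of the monolingual datasets, as listed in the record.
MONO_LANGS = ("DE", "EN", "ES")

# Languages of the two multilingual datasets, as listed in the record.
MULTI_LANGS = {
    "multi3": ("DE", "EN", "ES"),
    "multi8": ("DE", "EL", "EN", "ES", "FR", "IT", "NL", "PT"),
}


class DatasetInfo(NamedTuple):
    """Hypothetical container for what a dataset name encodes."""
    name: str
    text_langs: Tuple[str, ...]       # language(s) the texts are written in
    translated_from: Tuple[str, ...]  # source language(s) of the translations


def describe(name: str) -> DatasetInfo:
    """Decode a dataset name from the record into its languages."""
    if name in MULTI_LANGS:
        langs = MULTI_LANGS[name]
        return DatasetInfo(name, langs, langs)

    parts = name.split("_")
    if len(parts) != 3 or parts[0] != "mono":
        raise ValueError(f"unrecognised dataset name: {name!r}")

    text_lang = parts[1].upper()
    if text_lang not in MONO_LANGS:
        raise ValueError(f"unexpected text language in {name!r}")

    if parts[2] == "multisource":
        # Translations come from every other monolingual language.
        sources = tuple(l for l in MONO_LANGS if l != text_lang)
    else:
        source = parts[2].upper()
        if source not in MONO_LANGS or source == text_lang:
            raise ValueError(f"unexpected source language in {name!r}")
        sources = (source,)

    return DatasetInfo(name, (text_lang,), sources)
```

For example, describe("mono_es_multisource") yields Spanish text with translations from DE and EN, matching the entry in the list above.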