Summary: | In Statistical Machine Translation (SMT), the constraints on word reorderings have a great impact on the set of potential translations that are explored. Notwithstanding computational issues, the reordering space of an SMT system needs to be designed with great care: while a larger search space is likely to yield better translations, it may also lead to more decoding errors, because of the added ambiguity and the interaction with the pruning strategy. In this paper, we study this trade-off using a state-of-the-art translation system, where all reorderings are represented in a word lattice prior to decoding. This allows us to directly explore and compare different reordering spaces. We study in detail a rule-based preordering system, varying the length or number of rules and the tagset used, and contrasting it with oracle settings and purely combinatorial subsets of permutations. We focus on two language pairs: English-French, a close language pair, and English-German, known to be a more challenging pair for reordering.
|