Reassessing gApp: Does MWE Discontinuity Always Pose a Challenge to Neural Machine Translation?
Abstract
In this paper we present research results with gApp, a text-preprocessing system designed to automatically detect discontinuous multiword expressions (MWEs) and convert them into their continuous forms in order to improve the performance of current neural machine translation (NMT) systems (see Hidalgo-Ternero 2021; Hidalgo-Ternero and Corpas Pastor 2020, 2022a, 2022b and 2022c, among others). To test its effectiveness, we carried out an experiment with the NMT systems of Google Translate and DeepL in the ES>EN and ES>ZH directions for the translation of somatisms, i.e., MWEs containing lexemes that refer to human or animal body parts (Mellado Blanco 2004). More specifically, we analysed “Verb-Noun Idiomatic Constructions” (VNICs) such as tocar los cojones, tocar los huevos, tocar las narices, and tocar las pelotas. Some of the unexpected results yielded by the study of these multiword expressions call into question the widely accepted view that phraseological discontinuity is unequivocally synonymous with worse NMT performance.
Description
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Collections
- Book Chapters