Reassessing gApp: Does MWE Discontinuity Always Pose a Challenge to Neural Machine Translation?
Date
2022-09-21
Authors
Hidalgo-Ternero, C.M.; Zhou-Lian, X.
Publisher
Springer
Abstract
In this paper we present research results with gApp, a text preprocessing
system designed for automatically detecting and converting discontinuous
multiword expressions (MWEs) into their continuous forms so as to
improve the performance of current neural machine translation (NMT) systems
(see Hidalgo-Ternero 2021; Hidalgo-Ternero and Corpas Pastor 2020, 2022a,
2022b and 2022c, among others). To test its effectiveness, an experiment with
the NMT systems of Google Translate and DeepL has been carried out in the
ES>EN/ZH directionalities for the translation of somatisms, i.e., MWEs containing
lexemes referring to human or animal body parts (Mellado Blanco 2004). More
specifically, we have analysed “Verb Noun Idiomatic Constructions” (VNICs),
such as tocar los cojones, tocar los huevos, tocar las narices, and tocar las pelotas.
In this regard, some of the unexpected results yielded by the study of these multiword
expressions call into question the widely accepted assumption that phraseological
discontinuity invariably leads to worse NMT performance.
Description
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Citation
Hidalgo-Ternero, C.M., Zhou-Lian, X. (2022). Reassessing gApp: Does MWE Discontinuity Always Pose a Challenge to Neural Machine Translation? In: Corpas Pastor, G., Mitkov, R. (eds) Computational and Corpus-Based Phraseology. EUROPHRAS 2022. Lecture Notes in Computer Science, vol 13528. Springer, Cham. https://doi.org/10.1007/978-3-031-15925-1_9