gApp: a text preprocessing system to improve the neural machine translation of discontinuous multiword expressions
dc.contributor.author | Hidalgo-Ternero, Carlos Manuel | |
dc.contributor.author | Zhou-Lian, Xiaoqing | |
dc.date.accessioned | 2024-02-15T09:57:06Z | |
dc.date.available | 2024-02-15T09:57:06Z | |
dc.date.issued | 2023-09 | |
dc.description.abstract | In this paper we present research results obtained with gApp, a text-preprocessing system designed to automatically detect discontinuous multiword expressions (MWEs) and convert them into their continuous forms in order to improve the performance of current neural machine translation (NMT) systems (see Hidalgo-Ternero, 2021 & 2022; Hidalgo-Ternero & Corpas Pastor, 2020, 2022a & 2022b; Hidalgo-Ternero, Lista, & Corpas Pastor, 2022; and Hidalgo-Ternero & Zhou-Lian, 2022a & 2022b). To test its effectiveness, eight experiments were carried out with several NMT systems, including DeepL, Google Translate, ModernMT and VIP, in different language directions (ES/FR/IT > ES/EN/DE/FR/IT/PT/ZH) for the translation of somatisms, i.e., MWEs containing lexemes that refer to human or animal body parts (Mellado Blanco, 2004). More specifically, we analysed both flexible verb-noun idiomatic constructions (VNICs) and flexible verb + prepositional phrase (VPP) constructions. The promising results obtained for these types of MWEs throughout experiments 1-8 shed some light on new avenues for enhancing MWE-aware NMT systems. | es |
dc.identifier.isbn | 978-2-9701733-0-4 | |
dc.identifier.uri | https://hdl.handle.net/10115/30471 | |
dc.language.iso | eng | es |
dc.rights | Attribution 4.0 International | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.title | gApp: a text preprocessing system to improve the neural machine translation of discontinuous multiword expressions | es |
dc.type | info:eu-repo/semantics/workingPaper | es |
dc.type | info:eu-repo/semantics/conferenceObject | es |
Files
Original bundle
- Name: TC44_HidalgoTernero&ZhouLian.pdf
- Size: 590.66 KB
- Format: Adobe Portable Document Format