Abstract
Recent advances in Natural Language Processing (NLP) rely heavily on black-box score-based models that provide only final predictions along with their scores. This opacity impedes comprehension of their internal decision-making and complicates the identification of potential weaknesses. A powerful strategy for analyzing model vulnerabilities is the generation of text adversarial examples. These attacks introduce subtle text perturbations that cause victim models to make incorrect predictions while preserving the original semantic meaning. This paper presents a novel method for generating text adversarial examples with Large Language Models (LLMs). The proposed method uses the text generation capabilities of LLMs to modify the original input text at multiple granularities: character, word, and sentence level. First, sentence-level perturbations are introduced by generating paraphrases with an LLM instruction prompt. Next, character- and word-level perturbations are applied to the words that most affect predictions, using another set of LLM instruction prompts. In particular, vulnerable words are perturbed by replacing them with synonyms or misspelled variants, or by inserting additional neutral words adjacent to them. Experiments were conducted to assess the proposal's viability on two sentiment classification tasks: sentence-level reviews and full-length reviews. The proposal demonstrates an advantage over many well-known approaches based on LLMs. It preserves the original semantics to a similar extent while increasing the deception of victim models by 29–85% over the best of the state-of-the-art methods analyzed.
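The word-level stage described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: `toy_victim_score` is a hypothetical lexicon-based stand-in for a black-box score-based model, words are ranked by leave-one-out deletion (the abstract does not say how the most influential words are found), and the rule-based `misspell` function replaces the LLM instruction prompts used in the paper. The sentence-level paraphrasing stage is omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical toy lexicons; a real victim model would be a trained classifier.
POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"bad", "boring", "awful"}


def toy_victim_score(text: str) -> float:
    """Stand-in black-box model: returns a score in (0, 1) for the positive class."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (1 + pos) / (2 + pos + neg)


def rank_vulnerable_words(
    text: str, score: Callable[[str], float]
) -> List[Tuple[int, str]]:
    """Rank words by how much deleting each one changes the score (assumed heuristic)."""
    words = text.split()
    base = score(text)
    impacts = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        impacts.append((abs(base - score(reduced)), i, word))
    # Largest impact first; ties keep the original word order.
    impacts.sort(key=lambda t: (-t[0], t[1]))
    return [(i, word) for _, i, word in impacts]


def misspell(word: str) -> str:
    """Rule-based stand-in for an LLM misspelling prompt: swap the 2nd and 3rd characters."""
    if len(word) < 4:
        return word
    return word[0] + word[2] + word[1] + word[3:]


def attack(
    text: str,
    score: Callable[[str], float],
    perturb: Callable[[str], str],
    max_edits: int = 3,
) -> str:
    """Perturb the most influential words until the predicted label flips."""
    original_label = score(text) > 0.5
    words = text.split()
    for i, word in rank_vulnerable_words(text, score)[:max_edits]:
        words[i] = perturb(word)
        candidate = " ".join(words)
        if (score(candidate) > 0.5) != original_label:
            return candidate  # label flipped: adversarial example found
    return " ".join(words)  # edit budget exhausted without a flip


adversarial = attack("a great film", toy_victim_score, misspell)
print(adversarial)
```

In this toy setting the only word that moves the score is "great", so it is perturbed first and the label flips after a single edit. With a real victim model, the same loop would query the model's score and would also need a semantic-similarity check, which this sketch leaves out.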
Publisher
Elsevier
Citation
Natalia Madrueño, Alberto Fernández-Isabel, Rubén R. Fernández, Isaac Martín de Diego, Advancing text adversarial example generation using large language models, Knowledge-Based Systems, Volume 329, Part B, 2025, 114361, ISSN 0950-7051, https://doi.org/10.1016/j.knosys.2025.114361. (https://www.sciencedirect.com/science/article/pii/S0950705125014005)