“`html
Extremely valuable study on practical assaults against LLM agents.
Summary: The increasing incorporation of LLMs into various applications has unveiled new security challenges, particularly termed Promptware—maliciously crafted prompts intended to exploit LLMs to jeopardize the CIA triad of these systems. While earlier studies indicated a possible transformation in the threat landscape for LLM-powered applications, the dangers presented by Promptware are often regarded as minimal. In this document, we explore the threats Promptware brings to users of Gemini-powered assistants (web app, mobile app, and Google Assistant). We introduce an innovative Threat Analysis and Risk Assessment (TARA) framework to evaluate Promptware risks for end users. Our investigation centers on a new type of Promptware called Targeted Promptware Attacks, which utilize indirect prompt injection through common user interactions like emails, calendar invites, and shared documents. We showcase 14 attack scenarios directed against Gemini-powered assistants across five recognized threat categories: Short-term Context Poisoning, Permanent Memory Poisoning, Tool Misuse, Automatic Agent Invocation, and Automatic App Invocation. These attacks underscore both digital and physical repercussions, including spamming, phishing, disinformation efforts, data exfiltration, unauthorized user video streaming, and command over home automation devices. We unveil Promptware’s capacity for on-device lateral movement, transcending the confines of the LLM-driven application, to instigate malicious behaviors utilizing a device’s apps. Our TARA reveals that 73% of the examined threats present High-Critical risk to end users. We discuss possible mitigations and reevaluate the risk (in light of implemented mitigations), demonstrating that the risk can potentially be decreased significantly to Very Low-Medium. We communicated our findings to Google, which implemented specific mitigations.
Defcon presentation. News articles concerning the study.
Prompt injection is not merely a trivial security issue we must address. It is an inherent characteristic of contemporary LLM technology. The systems lack the capacity to distinguish between trusted commands and untrusted data, and there exists an infinite array of prompt injection attacks with no means to obstruct them as a category. We require some groundbreaking science of LLMs before we can resolve this.
“`