LLM Prompt Obfuscation (AML.T0068)

Tactic: Defense Evasion

Tactics: Defense Evasion
Maturity: demonstrated
Reference: atlas.mitre.org/techniques/AML.T0068

Description

Adversaries may hide or otherwise obfuscate prompt injections or retrieval content to avoid detection from humans, large language model (LLM) guardrails, or other detection mechanisms.

For text inputs, this may include modifying how the instructions are rendered such as small text, text colored the same as the background, or hidden HTML elements. For multi-modal inputs, malicious instructions could be hidden in the data itself (e.g. in the pixels of an image) or in file metadata (e.g. EXIF for images, ID3 tags for audio, or document metadata).

Inputs can also be obscured via an encoding scheme such as base64 or rot13. This may bypass LLM guardrails that identify malicious content and may not be as easily identifiable as malicious to a human in the loop.

How GTK Cyber trains on this

GTK Cyber's hands-on AI security courses cover adversarial-AI techniques across the MITRE ATLAS framework, including the Defense Evasion tactic this technique falls under. Our practitioner-led training is taught by Charles Givre and other field-tested SMEs and focuses on real adversarial scenarios, not slide decks.

View AI security courses →

Related techniques

Train your team on real adversarial-AI attacks.

GTK Cyber's AI red teaming courses are taught by practitioners who break models for a living.

View AI Security Courses