- Tactics
- Defense Evasion
- Maturity
- demonstrated
- Reference
- atlas.mitre.org/techniques/AML.T0068
Description
Adversaries may hide or otherwise obfuscate prompt injections or retrieval content to avoid detection from humans, large language model (LLM) guardrails, or other detection mechanisms.
For text inputs, this may include modifying how the instructions are rendered such as small text, text colored the same as the background, or hidden HTML elements. For multi-modal inputs, malicious instructions could be hidden in the data itself (e.g. in the pixels of an image) or in file metadata (e.g. EXIF for images, ID3 tags for audio, or document metadata).
Inputs can also be obscured via an encoding scheme such as base64 or rot13. This may bypass LLM guardrails that identify malicious content and may not be as easily identifiable as malicious to a human in the loop.
How GTK Cyber trains on this
GTK Cyber's hands-on AI security courses cover adversarial-AI techniques across the MITRE ATLAS framework, including the Defense Evasion tactic this technique falls under. Our practitioner-led training is taught by Charles Givre and other field-tested SMEs and focuses on real adversarial scenarios, not slide decks.
Related techniques
- AML.T0015 — Evade AI Model
- AML.T0054 — LLM Jailbreak
- AML.T0067 — LLM Trusted Output Components Manipulation
- AML.T0071 — False RAG Entry Injection
- AML.T0073 — Impersonation
- AML.T0074 — Masquerading
- AML.T0076 — Corrupt AI Model
- AML.T0081 — Modify AI Agent Configuration
- AML.T0092 — Manipulate User LLM Chat History
- AML.T0094 — Delay Execution of LLM Instructions
- AML.T0097 — Virtualization/Sandbox Evasion
- AML.T0107 — Exploitation for Defense Evasion