- Maturity: realized
- Reference: atlas.mitre.org/techniques/AML.T0016.002
Description
Adversaries may search for and obtain generative AI models or tools, such as large language models (LLMs), to assist them in various steps of their operation. Generative AI can be used in a variety of malicious ways, such as to generate malware, Generate Deepfakes, Generate Malicious Commands, perform Retrieval Content Crafting, or generate Phishing content.
Adversaries may obtain open-source models and serve them locally using frameworks such as Ollama or vLLM, host them on cloud infrastructure, or leverage AI service providers such as HuggingFace.
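As an illustrative sketch (not part of the ATLAS entry itself), a locally served open-source model is typically reached over a plain HTTP API. The snippet below builds, but does not send, a request against Ollama's default local endpoint (`http://localhost:11434/api/generate`); the model name and prompt are hypothetical placeholders.

```python
import json
import urllib.request


def build_generate_request(model: str, prompt: str,
                           host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: a request targeting a hypothetical locally pulled "llama3" model.
req = build_generate_request("llama3", "Summarize this log excerpt.")
```

With a local Ollama server running, `urllib.request.urlopen(req)` would return the model's JSON response; the point is simply that self-hosted models expose an unauthenticated local API by default, which is what makes local serving attractive to adversaries.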
They may need to jailbreak the model (see LLM Jailbreak) to bypass restrictions put in place to limit the types of responses it can generate, and may also need to violate the terms of service of the model’s developer.
Generative AI models may also be “uncensored,” meaning they are designed to generate content without restrictions such as guardrails or content filters. Uncensored generative AI is ripe for abuse by cybercriminals [1] [2]. Models may be fine-tuned to remove alignment and guardrails [3], or subjected to targeted manipulations that bypass refusal behavior [4], resulting in uncensored variants of the model. Uncensored models may be built for offensive and defensive cybersecurity work [5], which an adversary can abuse, and some models are expressly designed and advertised for malicious use [6].
How GTK Cyber trains on this
GTK Cyber's hands-on AI security courses cover adversarial-AI techniques across the MITRE ATLAS framework, including the tactic this technique falls under. Our practitioner-led training, taught by Charles Givre and other field-tested SMEs, focuses on real adversarial scenarios, not slide decks.