CVE-2026-44223

Affects: large language model, vLLM

CVSS: MEDIUM · 6.5v3.1
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Published: 2026-05-12
Weakness: CWE-131, CWE-704
Source: nvd.nist.gov/vuln/detail/CVE-2026-44223

Description

vLLM is an inference and serving engine for large language models (LLMs). From to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., “repetition_penalty”: 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.

References

How GTK Cyber trains on this

AI security training at GTK Cyber covers the LLM and ML-pipeline vulnerability classes that vulnerabilities like CVE-2026-44223 fall into. Our hands-on courses are taught by Charles Givre and other practitioners who break and defend production AI systems.

AI Red-Teaming course →·Browse MITRE ATLAS techniques

Related AI/LLM CVEs

AI security training, taught by people who do the work.

Hands-on courses on adversarial AI, prompt injection, and ML pipeline security.

Explore AI Red-Teaming