T5-AT-016 (HIGH)

Request Smuggling

Risk score: 215

Rating: High

Procedures: 10

Severity

Mechanism

LLM API requests pass through multiple processing layers (CDN, API gateway, load balancer, authentication middleware, safety classifier, and inference server), each of which parses the request independently. The design assumption is that all layers agree on request boundaries and content. The gap: parsing inconsistencies between layers allow request smuggling, in which an attacker constructs a single HTTP request that different layers interpret as different requests. For example, the safety classifier may evaluate one payload while the inference server acts on another.
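A minimal sketch of one such parsing inconsistency, assuming two layers that resolve duplicate JSON keys differently. The function names and the request body are hypothetical and chosen only for illustration. Python's standard `json` module keeps the last value for a repeated key, and the first-wins behavior is emulated here with `object_pairs_hook`.

```python
import json


def parse_first_wins(raw: str) -> dict:
    """Hypothetical layer whose parser keeps the FIRST value of a repeated key."""
    def keep_first(pairs):
        obj = {}
        for key, value in pairs:
            obj.setdefault(key, value)  # ignore later duplicates
        return obj
    return json.loads(raw, object_pairs_hook=keep_first)


def parse_last_wins(raw: str) -> dict:
    """Hypothetical layer using Python's default behavior (last value wins)."""
    return json.loads(raw)


# One request body, two interpretations (illustrative payload only).
body = '{"prompt": "summarize this article", "prompt": "second value"}'

classifier_view = parse_first_wins(body)  # what a first-wins layer would evaluate
inference_view = parse_last_wins(body)    # what a last-wins layer would act on

print(classifier_view["prompt"])
print(inference_view["prompt"])
```

The two layers receive identical bytes yet disagree on the value of `prompt`, which is the disagreement the mechanism above depends on.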

Detection

Deploy request smuggling detection at each pipeline layer (compare request boundaries)
Monitor for requests with conflicting Content-Length and Transfer-Encoding headers
Alert on unexpected Content-Type values (non-JSON on JSON endpoints)
Detect duplicate keys in JSON request bodies
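The header and body checks above could be sketched roughly as follows. This is an illustrative audit function, not a complete smuggling detector: the function names, the finding labels, and the assumption that the endpoint accepts only `application/json` are all hypothetical.

```python
import json
from typing import List, Tuple


def find_duplicate_keys(raw: str) -> List[str]:
    """Return every key that appears more than once within a single JSON object."""
    duplicates: List[str] = []

    def record(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(raw, object_pairs_hook=record)
    return duplicates


def audit_request(headers: List[Tuple[str, str]], body: str) -> List[str]:
    """Flag ambiguity signals in a request; header names are matched case-insensitively."""
    findings: List[str] = []
    names = [name.strip().lower() for name, _ in headers]

    # Both framing headers present: layers may disagree on where the body ends.
    if "content-length" in names and "transfer-encoding" in names:
        findings.append("conflicting-length-headers")
    if names.count("content-length") > 1:
        findings.append("duplicate-content-length")

    # Assumption: this endpoint is JSON-only.
    content_type = next(
        (value for name, value in headers if name.strip().lower() == "content-type"), ""
    )
    if content_type.split(";")[0].strip().lower() != "application/json":
        findings.append("unexpected-content-type")

    try:
        if find_duplicate_keys(body):
            findings.append("duplicate-json-keys")
    except json.JSONDecodeError:
        findings.append("invalid-json")
    return findings


suspicious = audit_request(
    [("Content-Length", "13"), ("Transfer-Encoding", "chunked"), ("Content-Type", "text/plain")],
    '{"a": 1, "a": 2}',
)
clean = audit_request(
    [("Content-Length", "8"), ("Content-Type", "application/json; charset=utf-8")],
    '{"a": 1}',
)
```

Running the same audit at more than one layer and comparing the findings is one way to approximate the boundary comparison in the first item.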

Mitigation

Normalize requests at a single entry point before fanning out to pipeline stages (HIGH)

Reject ambiguous requests (duplicate keys, mismatched Content-Length/Transfer-Encoding) (HIGH)

Use the same JSON parser library across all pipeline stages (HIGH)

Use HTTP/2 end-to-end (no H2→H1 downgrade) (MEDIUM)
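One possible shape for the normalize-and-reject step is sketched below, under the assumption that the entry point parses the body once, refuses anything ambiguous, and forwards only a canonical re-serialization to later stages. `AmbiguousRequest` and `normalize_body` are hypothetical names, not part of any particular gateway.

```python
import json


class AmbiguousRequest(ValueError):
    """Raised when a body could be read differently by different parsers."""


def _reject_duplicates(pairs):
    # Build the object by hand so a repeated key fails loudly instead of
    # being resolved silently (first-wins or last-wins).
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise AmbiguousRequest(f"duplicate key: {key!r}")
        obj[key] = value
    return obj


def normalize_body(raw: bytes) -> bytes:
    """Parse strictly once, then emit canonical bytes for downstream stages."""
    try:
        parsed = json.loads(raw.decode("utf-8"), object_pairs_hook=_reject_duplicates)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise AmbiguousRequest("body is not valid UTF-8 JSON") from exc
    # Sorted keys and fixed separators give every stage the same byte sequence.
    return json.dumps(parsed, sort_keys=True, separators=(",", ":")).encode("utf-8")


canonical = normalize_body(b'{ "b": 1,\n  "a": "x" }')
print(canonical)
```

If every stage consumes only the canonical bytes, a duplicate key or malformed body is rejected once at the edge and is not left for each layer to interpret on its own.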

Chaining

Request smuggling bypasses safety controls at the infrastructure level, enabling T1 (Prompt Subversion) and T2 (Semantic Evasion) techniques that would otherwise be caught. Smuggled requests that reach the inference server without safety evaluation have unrestricted model access, enabling T5-AT-001 (Parameter Manipulation) and T5-AT-002 (Token Extraction) with no safety checks applied.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0043
