T5-AT-014 (HIGH)

Side Channel Attacks

Risk score: 210

Rating: High

Procedures: 10

Severity

Mechanism

LLM inference produces observable side effects beyond the text output: per-token generation latency (reflecting model confidence), total response time (reflecting prompt complexity and generation length), response token count (correlating with the input classification result), network packet sizes (reflecting individual token byte lengths, even over TLS), billing amounts (reflecting cached vs. uncached processing), and hardware signals (GPU memory patterns, power consumption). The design assumption is that encrypting the transport layer with TLS protects content confidentiality. The gap is that TLS encrypts content but preserves metadata: packet sizes, timing, and counts.
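The packet-size leak can be illustrated with a minimal sketch. It assumes each streamed token travels in its own encrypted record and that encryption adds a constant per-record overhead. The constant `RECORD_OVERHEAD` and both function names are hypothetical, chosen for illustration only; real overhead depends on the protocol version, cipher suite, and application framing.

```python
# Assumption: one encrypted record per streamed token, with a constant
# per-record overhead. The value below is illustrative, not measured.
RECORD_OVERHEAD = 29


def observed_record_sizes(tokens):
    """Sizes a passive observer would see: plaintext bytes plus fixed overhead."""
    return [len(t.encode("utf-8")) + RECORD_OVERHEAD for t in tokens]


def inferred_token_lengths(sizes, overhead=RECORD_OVERHEAD):
    """Subtracting the constant overhead recovers each token's byte length."""
    return [s - overhead for s in sizes]


tokens = ["The", " patient", " has", " diabetes"]
sizes = observed_record_sizes(tokens)      # what crosses the wire
lengths = inferred_token_lengths(sizes)    # what the observer reconstructs
print(lengths)
```

The observer never sees the plaintext, yet the sequence of byte lengths survives encryption intact under these assumptions, which is the metadata gap described above.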

Detection

Monitor for systematic time-to-first-token (TTFT) probing patterns (characteristic of cache timing attacks)
Detect high-volume requests with minimal useful output (characteristic of timing measurements)
Alert on requests that appear to be measuring response timing rather than using content
Hardware-level: monitor for co-located processes accessing shared cache lines during inference
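One of the heuristics above, high request volume paired with minimal output, might be sketched as follows. The event shape `(client_id, output_tokens)`, the function name, and both thresholds are assumptions made for illustration; a real deployment would tune them against its own traffic and apply them per time window.

```python
from collections import defaultdict
from statistics import median


def flag_probing_clients(events, min_requests=50, max_median_output_tokens=2):
    """Return client IDs with many requests whose median output is tiny.

    events: iterable of (client_id, output_tokens) pairs from one time window.
    Thresholds are illustrative defaults, not recommended values.
    """
    per_client = defaultdict(list)
    for client_id, output_tokens in events:
        per_client[client_id].append(output_tokens)
    return sorted(
        client_id
        for client_id, outputs in per_client.items()
        if len(outputs) >= min_requests
        and median(outputs) <= max_median_output_tokens
    )
```

Using the median rather than the mean keeps a few long responses from masking a stream of one-token timing probes.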

Mitigation

Streaming packet padding to uniform size (HIGH)

Constant-time response delivery (fixed TTFT regardless of cache) (HIGH)

Per-tenant KV-cache isolation (HIGH)

Response length padding to fixed sizes (MEDIUM)
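The padding and constant-time mitigations above can be sketched as two small helpers. This is a minimal illustration under stated assumptions, not a production design: the 4-byte length prefix, the 64-byte bucket, the 0.5 second floor, and all function names are hypothetical.

```python
def pad_to_bucket(payload: bytes, bucket: int = 64) -> bytes:
    """Length-prefix the payload, then zero-pad to the next multiple of `bucket`.

    Assumed framing: a 4-byte big-endian length followed by the payload.
    """
    framed = len(payload).to_bytes(4, "big") + payload
    padded_len = -(-len(framed) // bucket) * bucket  # ceiling to bucket multiple
    return framed.ljust(padded_len, b"\x00")


def unpad(frame: bytes) -> bytes:
    """Recover the original payload using the length prefix."""
    n = int.from_bytes(frame[:4], "big")
    return frame[4:4 + n]


def ttft_padding_delay(elapsed_s: float, floor_s: float = 0.5) -> float:
    """Extra wait needed so the first token is never released before `floor_s`."""
    return max(0.0, floor_s - elapsed_s)
```

With these helpers, short tokens of different lengths all produce a 64-byte frame before encryption, and a cache hit that finishes in 0.1 s is held for a further 0.4 s. A request that takes longer than the floor is still distinguishable, so the floor would need to sit above the slowest expected uncached path to hide cache state fully.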

Chaining

Side channel attacks enable T5-AT-005 (Model Fingerprinting) from a passive network position. System prompt recovery (T5-AP-014C) directly enables T1 (Prompt Subversion) by revealing the safety instructions to be bypassed.

Framework mapping

OWASP LLM: LLM06

MITRE ATLAS: AML.T0024
