T5-AT-014 · HIGH

Side Channel Attacks

T5 · Model & API Exploitation
Risk score: 210
Rating: High
Procedures: 10
Mechanism

LLM inference produces observable side effects beyond the text output: per-token generation latency (reflecting model confidence), total response time (reflecting prompt complexity and generation length), response token count (correlating with the input classification result), network packet sizes (reflecting individual token byte lengths, even over TLS), billing amounts (reflecting cached vs. uncached processing), and hardware signals (GPU memory patterns, power consumption). The design assumption is that encrypting the transport layer (TLS) protects content confidentiality. The gap: TLS encrypts content but preserves metadata such as packet sizes, timing, and packet count.
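The packet-size leak can be illustrated with a minimal sketch. It assumes each streamed token is sent in its own encrypted record and that encryption adds a fixed number of bytes per record. The constant `RECORD_OVERHEAD` and both function names are hypothetical, chosen only for illustration; real overhead depends on the cipher suite and on the framing the API uses.

```python
from typing import List

# Assumption: every record carries the same fixed overhead (header, auth tag,
# constant framing). The value 22 is illustrative, not a measured figure.
RECORD_OVERHEAD = 22


def observed_record_sizes(tokens: List[str], overhead: int = RECORD_OVERHEAD) -> List[int]:
    """Ciphertext sizes a passive observer sees when each token is its own record."""
    return [len(t.encode("utf-8")) + overhead for t in tokens]


def infer_token_lengths(sizes: List[int], overhead: int = RECORD_OVERHEAD) -> List[int]:
    """Recover per-token byte lengths from record sizes alone, without decrypting."""
    return [s - overhead for s in sizes]


tokens = ["The", " patient", " has", " diabetes"]
sizes = observed_record_sizes(tokens)
leaked = infer_token_lengths(sizes)
print(leaked)  # [3, 8, 4, 9]
```

The observer never sees plaintext, yet the sequence of byte lengths survives encryption and narrows the set of candidate token sequences considerably.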

Detection
  • Monitor for systematic TTFT probing patterns (characteristic of cache timing attacks)
  • Detect high-volume requests with minimal useful output (characteristic of timing measurements)
  • Alert on requests that appear to be measuring response timing rather than using content
  • Hardware-level: monitor for co-located processes accessing shared cache lines during inference
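As a rough sketch of the second detection signal (high request volume with minimal useful output), the following groups requests by client and flags those whose mean output length stays very low. The log shape (`client_id`, `output_tokens`), the function name, and both thresholds are assumptions made for illustration and would need tuning against real traffic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def flag_probing_clients(
    requests: Iterable[Tuple[str, int]],
    min_requests: int = 100,
    max_mean_output_tokens: float = 2.0,
) -> List[str]:
    """Return client IDs with many requests but almost no generated output.

    `requests` yields (client_id, output_tokens) pairs from a hypothetical log.
    """
    counts: Dict[str, int] = defaultdict(int)
    tokens: Dict[str, int] = defaultdict(int)
    for client_id, output_tokens in requests:
        counts[client_id] += 1
        tokens[client_id] += output_tokens
    return sorted(
        c
        for c in counts
        if counts[c] >= min_requests
        and tokens[c] / counts[c] <= max_mean_output_tokens
    )


log = [("a", 1)] * 150 + [("b", 200)] * 150 + [("c", 1)] * 5
print(flag_probing_clients(log))  # ['a']
```

Client `c` also produces minimal output but falls below the volume threshold, so only `a` is flagged. A heuristic like this is a starting point for triage, not proof of an attack.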
Mitigation
Streaming packet padding to uniform size · HIGH
Constant-time response delivery (fixed TTFT regardless of cache) · HIGH
Per-tenant KV-cache isolation · HIGH
Response length padding to fixed sizes · MEDIUM
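Two of the mitigations above can be sketched briefly. The first pads every streamed chunk into a fixed-size frame so that ciphertext sizes no longer track token lengths. The second computes how long to hold the first token so that TTFT never drops below a fixed floor. The frame layout (2-byte length prefix plus zero padding), the 64-byte frame size, and all function names are assumptions for illustration, not a prescribed wire format.

```python
import struct

FRAME_SIZE = 64  # illustrative fixed frame size in bytes


def pad_chunk(payload: bytes, size: int = FRAME_SIZE) -> bytes:
    """Build a frame of exactly `size` bytes: 2-byte length, payload, zero padding."""
    if len(payload) > size - 2:
        raise ValueError("payload too large for fixed frame")
    padding = b"\x00" * (size - 2 - len(payload))
    return struct.pack(">H", len(payload)) + payload + padding


def unpad_chunk(frame: bytes) -> bytes:
    """Recover the original payload from a padded frame."""
    (length,) = struct.unpack(">H", frame[:2])
    return frame[2 : 2 + length]


def remaining_delay(elapsed_s: float, floor_s: float) -> float:
    """Seconds to wait before sending the first token so TTFT is at least floor_s."""
    return max(0.0, floor_s - elapsed_s)


frames = [pad_chunk(t.encode("utf-8")) for t in ["The", " patient", " has"]]
print({len(f) for f in frames})  # {64}
```

A caller would pass the result of `remaining_delay` to `time.sleep` (or an async equivalent) before emitting the first token. Both techniques trade bandwidth or latency for confidentiality, and a floor only hides cache hits if it is set above the slowest expected uncached TTFT.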
Chaining

Side channel attacks enable T5-AT-005 (Model Fingerprinting) from a passive network position. System prompt recovery (T5-AP-014C) directly enables T1 (Prompt Subversion) by revealing the safety instructions to be bypassed.

Framework mapping
OWASP LLM: LLM06
MITRE ATLAS: AML.T0024