Side Channel Attacks
T5 · Model & API Exploitation
LLM inference produces observable side effects beyond the text output: per-token generation latency (which can reflect model confidence), total response time (which reflects prompt complexity and generation length), response token count (which can correlate with how the input was classified), network packet sizes (which reflect individual token byte lengths, even over TLS), billing amounts (which reflect cached vs. uncached processing), and hardware signals (GPU memory patterns, power consumption). The design assumption is that encrypting the transport layer (TLS) protects content confidentiality. The gap is that TLS encrypts content but leaves metadata observable: packet sizes, timing, and packet counts.
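To make the packet-size channel concrete, the following is a minimal sketch of how a passive observer could recover per-token byte lengths from ciphertext sizes. It assumes, purely for illustration, that each streamed token travels in its own TLS record and that the per-record overhead is a known constant; `TLS_RECORD_OVERHEAD`, `FRAMING_OVERHEAD`, and both function names are hypothetical and not taken from any specific API or capture tool.

```python
from typing import List

# Assumed, illustrative constants. Real values depend on the cipher suite
# and on how the API wraps each streamed token.
TLS_RECORD_OVERHEAD = 22  # hypothetical: record header + AEAD tag + content-type byte
FRAMING_OVERHEAD = 0      # hypothetical: fixed bytes of event wrapper around each token


def simulate_record_sizes(tokens: List[str]) -> List[int]:
    """Ciphertext sizes an observer would see if each token had its own record."""
    return [
        len(t.encode("utf-8")) + FRAMING_OVERHEAD + TLS_RECORD_OVERHEAD
        for t in tokens
    ]


def infer_token_lengths(record_sizes: List[int]) -> List[int]:
    """Subtract the constant overhead to recover per-token byte lengths."""
    return [s - TLS_RECORD_OVERHEAD - FRAMING_OVERHEAD for s in record_sizes]


# The observer never sees the plaintext tokens, only the record sizes.
tokens = ["The", " patient", " has", " diabetes"]
observed = simulate_record_sizes(tokens)
leaked = infer_token_lengths(observed)
print(leaked)
```

The sequence of lengths is not the text itself, but under these assumptions it narrows the space of candidate responses considerably, which is why encryption alone does not close this channel.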
- Monitor for systematic time-to-first-token (TTFT) probing patterns (characteristic of cache timing attacks)
- Detect high-volume requests with minimal useful output (characteristic of timing measurements)
- Alert on clients whose request patterns suggest they are measuring response timing rather than consuming the response content
- Hardware-level: monitor for co-located processes accessing shared cache lines during inference
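The "high-volume requests with minimal useful output" signal above could be approximated with a sliding-window check over request logs. This is a sketch under stated assumptions, not a production detector: the `RequestLog` fields, the `flag_timing_probes` function, and all thresholds are hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, List


@dataclass
class RequestLog:
    client_id: str
    timestamp: float    # seconds since some epoch
    output_tokens: int  # tokens generated in the response


def flag_timing_probes(
    logs: Iterable[RequestLog],
    window_s: float = 60.0,      # hypothetical window length
    min_requests: int = 50,      # hypothetical volume threshold
    max_median_output: int = 2,  # hypothetical "minimal output" threshold
) -> List[str]:
    """Return client IDs with many requests and tiny outputs inside one window."""
    by_client: Dict[str, List[RequestLog]] = defaultdict(list)
    for record in logs:
        by_client[record.client_id].append(record)

    flagged: List[str] = []
    for client_id, reqs in by_client.items():
        reqs.sort(key=lambda r: r.timestamp)
        start = 0
        for end in range(len(reqs)):
            # Shrink the window until it spans at most window_s seconds.
            while reqs[end].timestamp - reqs[start].timestamp > window_s:
                start += 1
            window = reqs[start : end + 1]
            if (
                len(window) >= min_requests
                and median(r.output_tokens for r in window) <= max_median_output
            ):
                flagged.append(client_id)
                break
    return flagged


# Example: one client sends 60 one-token requests in 30 s; another behaves normally.
example_logs = [RequestLog("probe", i * 0.5, 1) for i in range(60)]
example_logs += [RequestLog("normal", i * 5.0, 200) for i in range(10)]
print(flag_timing_probes(example_logs))
```

Using the median rather than the mean keeps a handful of long responses from masking an otherwise probe-like stream; whether that trade-off suits a given deployment is a judgment call.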
Side channel attacks enable T5-AT-005 (Model Fingerprinting) from a passive network position. System prompt recovery (T5-AP-014C) directly enables T1 (Prompt Subversion) by revealing the safety instructions to be bypassed.