T5-AT-008 · MEDIUM

Response Streaming Exploitation

T5 · Model & API Exploitation
Risk score: 175
Rating: Medium
Procedures: 10
Mechanism

Streaming APIs deliver tokens incrementally via Server-Sent Events (SSE), exposing the generation process in real time. The design assumption is that streaming changes only delivery timing and leaves security properties untouched. The gap: streaming creates three distinct exploitation surfaces.
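
To make the delivery model concrete, here is a minimal sketch of per-token SSE framing and of why event size tracks token length. The `{"delta": ...}` payload schema and the `sse_event` helper are assumptions for illustration, not any provider's actual wire format. The sketch assumes ASCII tokens and one token per event.

```python
import json

def sse_event(token: str) -> bytes:
    """Frame one token as a Server-Sent Event: a `data:` line plus a blank line."""
    payload = json.dumps({"delta": token})  # hypothetical payload schema
    return f"data: {payload}\n\n".encode("utf-8")

tokens = ["The", " quick", " brown", " fox"]
events = [sse_event(t) for t in tokens]

# The framing overhead is constant, so it equals the size of an empty-token event.
overhead = len(sse_event(""))

# Subtracting the constant overhead from each event size yields the token length.
inferred_lengths = [len(e) - overhead for e in events]
```

Transport encryption hides the event contents but typically preserves their sizes up to a constant, which is why unpadded per-token events are a concern.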

Detection
  • Monitor per-token packet sizes to confirm that length padding is applied (the mitigation Cloudflare deployed against token-length inference)
  • Detect early stream disconnection patterns: many requests terminated before natural completion
  • Alert on high-concurrency streaming requests from a single source (classifier exhaustion attempt)
  • Log finish_reason distributions per user; a high content_filter rate indicates active probing
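
The finish_reason and early-disconnect checks above could be sketched as follows. The `StreamLog` record, the `client_disconnect` label, and all thresholds are hypothetical placeholders. Real values would need tuning against baseline traffic.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class StreamLog:
    user_id: str
    # e.g. "stop", "length", "content_filter"; "client_disconnect" is an
    # assumed label for streams the client closed before natural completion.
    finish_reason: str

def flag_users(logs, filter_threshold=0.3, disconnect_threshold=0.5, min_requests=10):
    """Return {user_id: [reasons]} for users whose outcome mix looks like probing."""
    per_user = defaultdict(Counter)
    for log in logs:
        per_user[log.user_id][log.finish_reason] += 1

    flagged = {}
    for user, counts in per_user.items():
        total = sum(counts.values())
        if total < min_requests:  # too little data to judge
            continue
        reasons = []
        if counts["content_filter"] / total >= filter_threshold:
            reasons.append("high_content_filter_rate")
        if counts["client_disconnect"] / total >= disconnect_threshold:
            reasons.append("frequent_early_disconnect")
        if reasons:
            flagged[user] = reasons
    return flagged
```

For example, a user with 4 of 10 streams ending in content_filter would be flagged under these assumed defaults, while a user with only 5 requests would be skipped as below the minimum sample size.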
Mitigation
  • Token-length padding in streaming responses [HIGH]
  • Pre-generation safety classification (evaluate before streaming starts) [HIGH]
  • Streaming rate limiting (per-token throttle) [MEDIUM]
  • SSE output sanitization (escape event-boundary strings in model output) [HIGH]
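
As one possible illustration of the padding and sanitization items, the sketch below pads every SSE event to a multiple of a fixed bucket size and JSON-encodes model output so raw newlines cannot terminate an event early. The `pad` field, the 64-byte bucket, and the payload schema are assumptions. Random-length padding is an alternative design, and fixed buckets reduce rather than eliminate length leakage.

```python
import json

BLOCK = 64  # assumed padding bucket size in bytes

def padded_sse_event(token: str, block: int = BLOCK) -> bytes:
    """Frame a token as an SSE event whose size is a multiple of `block`."""
    # json.dumps escapes CR/LF (and non-ASCII by default), so model output
    # cannot place a raw event boundary ("\n\n") on the wire.
    body = {"delta": token, "pad": ""}
    base = len(f"data: {json.dumps(body)}\n\n".encode("utf-8"))
    # Each pad character adds exactly one byte; fill up to the next bucket.
    body["pad"] = "x" * (-base % block)
    return f"data: {json.dumps(body)}\n\n".encode("utf-8")

short = padded_sse_event("a")
longer = padded_sse_event(" considerably")
hostile = padded_sse_event("hi\n\ndata: forged")
```

Here `short` and `longer` have the same size on the wire, and `hostile` still contains exactly one event terminator, so a client parser sees a single event whose decoded `delta` is the original string.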
Chaining

Streaming timing side channels (T5-AP-008A) feed T5-AT-014 (Side Channel Attacks) with per-token timing data. Stream interruption (T5-AP-008B) enables incremental extraction of harmful content that chains to T7 (Output Manipulation).

Framework mapping
OWASP LLM: LLM05
MITRE ATLAS: AML.T0043