T5-AT-008 · MEDIUM
Response Streaming Exploitation
T5 · Model & API Exploitation
Risk score: 175
Rating: Medium
Procedures: 10
Mechanism
Streaming APIs deliver tokens incrementally via Server-Sent Events (SSE), exposing the generation process in real time. The design assumption is that streaming changes only delivery timing and leaves security properties intact. In practice, streaming creates three distinct exploitation surfaces.
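To illustrate why incremental delivery is not security-neutral, here is a minimal sketch of how event sizes can track token lengths. It assumes one token per SSE event, ASCII tokens, and a hypothetical JSON payload with a `delta` field; real APIs use different payload shapes, and encryption adds its own framing on the wire.

```python
import json


def sse_event(token: str) -> bytes:
    """Frame one token as a Server-Sent Event: a `data:` line ended by a blank line.

    The {"delta": ...} payload shape is a hypothetical example, not a real API schema.
    """
    payload = json.dumps({"delta": token})
    return f"data: {payload}\n\n".encode("utf-8")


tokens = ["The", " answer", " is", " forty"]
events = [sse_event(t) for t in tokens]

# Fixed framing overhead: the size of an event that carries an empty token.
overhead = len(sse_event(""))

# Anyone who can see event sizes (but not contents) recovers each token's length.
leaked_lengths = [len(e) - overhead for e in events]
print(leaked_lengths)  # [3, 7, 3, 6]
```

Because the framing overhead is constant, the size of each unpadded event differs from the token length only by a fixed offset, which is the property that token-length padding is meant to remove.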
Detection
- Monitor per-token packet sizes for padding compliance (post-Cloudflare mitigation)
- Detect early stream disconnection patterns: many requests terminated before natural completion
- Alert on high-concurrency streaming requests from single source (classifier exhaustion attempt)
- Log finish_reason distributions per user — high content_filter rates indicate active probing
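Two of the detection signals above, finish_reason distributions and early disconnections, can be sketched as a simple per-user aggregation. This is an illustrative outline only: the `StreamLog` record, the `flag_users` helper, and every threshold are hypothetical and would need tuning against real traffic.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamLog:
    """Hypothetical log record for one streaming request."""
    user_id: str
    finish_reason: Optional[str]  # None means the client disconnected early


def flag_users(logs, min_requests=20, filter_rate=0.2, disconnect_rate=0.5):
    """Return {user_id: [reasons]} for users whose rates exceed the thresholds.

    All thresholds are illustrative assumptions, not recommended values.
    """
    per_user = defaultdict(Counter)
    for log in logs:
        key = log.finish_reason if log.finish_reason is not None else "disconnected"
        per_user[log.user_id][key] += 1

    flagged = {}
    for user, counts in per_user.items():
        total = sum(counts.values())
        if total < min_requests:
            continue  # too few requests to judge a rate
        reasons = []
        if counts["content_filter"] / total >= filter_rate:
            reasons.append("content_filter")
        if counts["disconnected"] / total >= disconnect_rate:
            reasons.append("early_disconnect")
        if reasons:
            flagged[user] = reasons
    return flagged
```

The minimum-request floor keeps a single filtered response from flagging a low-volume user; a production system would likely use time windows rather than lifetime totals.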
Mitigation
- Token-length padding in streaming responses: HIGH
- Pre-generation safety classification (evaluate before streaming starts): HIGH
- Streaming rate limiting (per-token throttle): MEDIUM
- SSE output sanitization (escape event-boundary strings in model output): HIGH
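The padding and sanitization mitigations can be combined in the event encoder. The sketch below is one possible approach under stated assumptions: it pads each JSON payload to a multiple of a fixed block size (a swapped-in alternative to random-length padding), and it relies on JSON string escaping to keep newlines in model output from ending an event early. The `delta` and `p` field names, the `block` parameter, and the `padded_sse_event` helper are all hypothetical.

```python
import json


def padded_sse_event(token: str, block: int = 32) -> bytes:
    """Encode one token as an SSE event whose payload size is a multiple of `block`.

    json.dumps escapes newlines (and, by default, non-ASCII characters), so the
    token cannot contain a raw blank line that would terminate the event or
    start an injected `data:` / `event:` field.
    """
    unpadded = json.dumps({"delta": token, "p": ""})
    # The payload is pure ASCII here, so character count equals byte count.
    pad_len = -len(unpadded) % block
    payload = json.dumps({"delta": token, "p": "x" * pad_len})
    return f"data: {payload}\n\n".encode("utf-8")


# Short tokens of different lengths produce events of identical size.
sizes = {len(padded_sse_event(t)) for t in ["a", " hello", " stream"]}

# A token carrying an event boundary stays inside a single event.
hostile = "ok\n\ndata: injected"
event = padded_sse_event(hostile)
```

Block padding still reveals which size bucket a token falls into, so the block size trades bandwidth against leakage. Clients must ignore the padding field when reassembling the text.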
Chaining
Streaming timing side channels (T5-AP-008A) feed T5-AT-014 (Side Channel Attacks) with per-token timing data. Stream interruption (T5-AP-008B) enables incremental extraction of harmful content that chains to T7 (Output Manipulation).
Framework mapping
OWASP LLM: LLM05
MITRE ATLAS: AML.T0043