T7-AT-014 · MEDIUM
Output Redirection
T7 · Output Manipulation & Exfiltration
Risk score: 180
Rating: Medium
Procedures: 10
Severity
Mechanism
In affected deployments, safety filtering is applied only at the chat-response output layer. When output is redirected to alternative channels (file writes, API calls, code execution, webhooks, or email sent via tools), those channels may lack equivalent filtering. In agentic contexts, an agent instructed to write restricted content to a file or send it via an API may bypass the output classifier entirely, because the classifier monitors only the chat channel.
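The gap can be illustrated with a toy dispatcher. This is a minimal sketch, not a real system: `is_restricted`, `chat_only_dispatch`, the `CHANNELS` list, and the probe string are all hypothetical names chosen for illustration, and the keyword check stands in for whatever output classifier a deployment actually uses.

```python
from typing import Callable, List, Optional

# Hypothetical channel names, taken from the examples in the text above.
CHANNELS = ["chat", "file_write", "api_call", "webhook", "email"]


def is_restricted(text: str) -> bool:
    """Toy stand-in for an output classifier (assumption: keyword match)."""
    return "RESTRICTED" in text


def chat_only_dispatch(channel: str, content: str) -> Optional[str]:
    """Mimics the vulnerable setup: the classifier runs on the chat channel only.

    Returns the content that would be emitted, or None if it was blocked.
    """
    if channel == "chat" and is_restricted(content):
        return None
    return content


def unfiltered_channels(dispatch: Callable[[str, str], Optional[str]]) -> List[str]:
    """Send a restricted probe through every channel and list those that let it through."""
    probe = "RESTRICTED probe"
    return [c for c in CHANNELS if dispatch(c, probe) is not None]


leaks = unfiltered_channels(chat_only_dispatch)
print(leaks)
```

Under these assumptions the probe is blocked on `chat` and passes on every other channel, which is the condition this technique relies on. The same `unfiltered_channels` audit could be pointed at a real dispatcher to check channel coverage.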
Detection
- Apply content-level safety checks to all output channels: file writes, API calls, tool invocations, encoded outputs
- Monitor agentic tool invocations that transmit model-generated content to external endpoints
- Flag requests that specify an output channel other than the default
- Observable signal: file-write or API-call invocations that follow refusals on the chat channel
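The observable signal in the last item can be sketched as a scan over a per-session event log. The `Event` fields, the `EGRESS` channel set, and the three-turn `window` are assumptions made for this example; a real log schema and threshold would differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Hypothetical set of channels that can carry content out of the chat response.
EGRESS = {"file_write", "api_call", "webhook", "email"}


@dataclass(frozen=True)
class Event:
    session: str   # session identifier
    kind: str      # "chat_refusal", "chat_reply", or "tool_call"
    channel: str   # "chat" or one of the EGRESS channels
    turn: int      # position of the event within the session


def flag_redirect_after_refusal(events: Iterable[Event], window: int = 3) -> List[Event]:
    """Return egress tool calls made within `window` turns after a chat refusal."""
    last_refusal: Dict[str, int] = {}
    flagged: List[Event] = []
    for ev in sorted(events, key=lambda e: (e.session, e.turn)):
        if ev.kind == "chat_refusal":
            last_refusal[ev.session] = ev.turn
        elif ev.kind == "tool_call" and ev.channel in EGRESS:
            refused_at = last_refusal.get(ev.session)
            if refused_at is not None and 0 < ev.turn - refused_at <= window:
                flagged.append(ev)
    return flagged


events = [
    Event("s1", "chat_refusal", "chat", 1),
    Event("s1", "tool_call", "file_write", 2),  # shortly after a refusal
    Event("s1", "tool_call", "webhook", 9),     # outside the window
    Event("s2", "tool_call", "api_call", 1),    # no prior refusal
]
flagged = flag_redirect_after_refusal(events)
print([(e.session, e.channel, e.turn) for e in flagged])
```

A flag here is only a lead for review: a tool call after a refusal is not proof of redirection, and the window size trades missed cases against false positives.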
Mitigation
Unified output filtering across channels · HIGH
Egress monitoring for agentic tools · HIGH
Pre-encoding safety evaluation · HIGH
Streaming safety with kill-switch · MEDIUM
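Unified filtering and pre-encoding evaluation can be combined in a single gate that every channel must pass through. The sketch below is one possible shape, assuming a hypothetical `guarded_send` wrapper, a toy `is_restricted` classifier, and sinks modelled as plain callables; none of these names come from a real library.

```python
import base64
from typing import Callable, List, Optional


class OutputBlocked(Exception):
    """Raised when content fails the safety check before leaving the model."""


def is_restricted(text: str) -> bool:
    """Toy stand-in for an output classifier (assumption: keyword match)."""
    return "RESTRICTED" in text


def guarded_send(
    sink: Callable[[str], None],
    plaintext: str,
    encode: Optional[Callable[[str], str]] = None,
) -> None:
    """Evaluate the plaintext first, then encode, then hand off to the sink.

    The same gate is used for every channel, so no sink receives unchecked
    content, and the check always sees the content before any encoding.
    """
    if is_restricted(plaintext):
        raise OutputBlocked("content blocked before encoding or delivery")
    payload = encode(plaintext) if encode is not None else plaintext
    sink(payload)


def to_base64(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


sent: List[str] = []          # stands in for a file, webhook, or email sink
guarded_send(sent.append, "hello", encode=to_base64)

blocked = False
try:
    guarded_send(sent.append, "RESTRICTED payload", encode=to_base64)
except OutputBlocked:
    blocked = True

print(sent, blocked)
```

Running the check on plaintext matters because a classifier tuned to natural language is unlikely to recognise the same content once it is Base64-encoded. Routing chat replies through the same `guarded_send` keeps the policy identical across channels.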
Chaining
Output redirection is the delivery mechanism for all T7 techniques. Successful extraction via T7-AT-001 through T7-AT-013 is valuable only when content can be exfiltrated via T7-AT-014.
Framework mapping
OWASP LLM: LLM05
MITRE ATLAS: AML.T0062