T3-AT-019 · MEDIUM

Misdirection Through Complexity

T3 · Reasoning & Constraint Exploitation

Risk score: 175

Rating: Medium

Procedures: 10

Mechanism

Safety classifiers allocate finite attention across a prompt. Complex, verbose prompts with multiple conceptual layers dilute the classifier's attention away from the embedded harmful payload. Misdirection through complexity wraps the harmful request in layers of legitimate intellectual structure (epistemology, systems theory, molecular dynamics, dialectics, formal verification), exploiting the classifier's inability to maintain equal attention to all components of a high-complexity prompt.

Detection

Prompt complexity metrics: token count, vocabulary diversity, conceptual-layer count significantly exceeding what the core request requires
Payload extraction: identify the minimal actionable request within a verbose prompt and evaluate that independently
Complexity-as-wrapping signal: when the intellectual framework does not depend on the harmful content (i.e., removing the harmful content would not invalidate the framework), the framework is wrapping, not context
Extended-reasoning monitoring: flag when the model's reasoning chain exceeds a time/token threshold during safety evaluation, as this may indicate complexity-induced safety attenuation
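The first detection signal can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes the core request has already been extracted by some upstream step (not shown), uses a naive regex tokenizer, and covers only token count and vocabulary diversity, not conceptual-layer count. The function names and the `ratio_threshold` default are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class ComplexityMetrics:
    token_count: int
    type_token_ratio: float  # distinct tokens / total tokens (vocabulary diversity)


def tokenize(text: str) -> list[str]:
    # Naive word tokenizer (assumption); a production system would use the model's tokenizer.
    return re.findall(r"[a-z0-9']+", text.lower())


def complexity_metrics(text: str) -> ComplexityMetrics:
    tokens = tokenize(text)
    if not tokens:
        return ComplexityMetrics(0, 0.0)
    return ComplexityMetrics(len(tokens), len(set(tokens)) / len(tokens))


def exceeds_core_request(prompt: str, core_request: str, ratio_threshold: float = 5.0) -> bool:
    """Flag prompts whose length far exceeds what the extracted core request needs.

    `core_request` is assumed to come from a separate payload-extraction step.
    `ratio_threshold` is an illustrative default, not a calibrated value.
    """
    full = complexity_metrics(prompt).token_count
    core = complexity_metrics(core_request).token_count
    if core == 0:
        return full > 0
    return full / core >= ratio_threshold
```

A flag from `exceeds_core_request` is only a signal that the wrapping is disproportionate; it says nothing about whether the core request itself is harmful, which still has to be evaluated independently.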

Mitigation

Payload extraction pre-processing: HIGH

Complexity-invariant safety evaluation: HIGH

Reasoning-length bounds for safety: HIGH

Signal-to-noise analysis: MEDIUM
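One way to read complexity-invariant safety evaluation is to score both the full prompt and the extracted payload with the same classifier and keep the higher score, so that added wrapping can never lower the result. The sketch below assumes a classifier exposed as a plain callable returning a risk score in [0, 1]; `complexity_invariant_score` and `toy_density_classifier` are hypothetical names, and the toy classifier exists only to show the dilution effect.

```python
from typing import Callable

# Assumption: a classifier maps text to a risk score in [0, 1].
Classifier = Callable[[str], float]


def complexity_invariant_score(prompt: str, payload: str, classifier: Classifier) -> float:
    """Return the higher of the full-prompt score and the payload-only score."""
    return max(classifier(prompt), classifier(payload))


def toy_density_classifier(text: str) -> float:
    """Toy stand-in: share of words equal to the placeholder 'FLAGGED'.

    Its score drops as surrounding text grows, which mimics the
    attention-dilution behavior described in the Mechanism section.
    """
    words = text.split()
    if not words:
        return 0.0
    return sum(w == "FLAGGED" for w in words) / len(words)


payload = "FLAGGED request"
wrapped = "long framing " * 20 + payload

diluted = toy_density_classifier(wrapped)  # low: payload is buried in wrapping
combined = complexity_invariant_score(wrapped, payload, toy_density_classifier)
```

With the toy classifier, `diluted` falls as more wrapping is added, while `combined` stays at the payload-only score. The design relies on the payload extraction step being accurate; a missed payload leaves only the diluted score.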

Chaining

Complexity misdirection is most effective as a *wrapping layer* around other T3 techniques: it amplifies Fictional Framing (T3-AT-001), Academic Pretense (T3-AT-002), and Rationalization Chains (T3-AT-016) by raising the noise floor against which the safety classifier must operate. It chains into T4 (Multi-Turn Manipulation) when complex multi-turn reasoning induces the prolonged-reasoning effect of CoT Hijacking.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0054
