T3-AT-019 · MEDIUM

Misdirection Through Complexity

T3 · Reasoning & Constraint Exploitation
Risk score: 175
Rating: Medium
Procedures: 10
Mechanism

Safety classifiers allocate finite attention across a prompt. Complex, verbose prompts with multiple conceptual layers draw that attention away from an embedded harmful payload. Misdirection through complexity wraps the harmful request in layers of legitimate intellectual structure (epistemology, systems theory, molecular dynamics, dialectics, formal verification), exploiting the classifier's inability to attend equally to every component of a high-complexity prompt.

Detection
  • Prompt complexity metrics: token count, vocabulary diversity, conceptual-layer count significantly exceeding what the core request requires
  • Payload extraction: identify the minimal actionable request within a verbose prompt and evaluate that independently
  • Complexity-as-wrapping signal: when the intellectual framework does not depend on the harmful content (i.e., removing the harmful content would not invalidate the framework), the framework is wrapping, not context
  • Extended-reasoning monitoring: flag when the model's reasoning chain exceeds a time/token threshold during safety evaluation, as this may indicate complexity-induced safety attenuation
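The first detection signal can be approximated with a few cheap lexical metrics. The sketch below is a minimal illustration, not a production detector: the `LAYER_MARKERS` keyword table, the function names, and the `max_ratio` and `max_layers` thresholds are all assumptions introduced here, and a real deployment would use the classifier's own tokenizer and thresholds tuned on observed traffic.

```python
import re

# Hypothetical keyword table: one entry per "conceptual layer" named in the
# Mechanism section. Real systems would likely use a topic model instead.
LAYER_MARKERS = {
    "epistemology": ("epistemology", "epistemic"),
    "systems theory": ("systems theory", "feedback loop"),
    "molecular dynamics": ("molecular dynamics",),
    "dialectics": ("dialectic",),
    "formal verification": ("formal verification", "invariant"),
}


def tokenize(text):
    """Crude lowercase word tokenizer (an assumption, not a model tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def complexity_metrics(prompt):
    """Return token count, vocabulary diversity, and conceptual-layer count."""
    tokens = tokenize(prompt)
    lowered = prompt.lower()
    layers = [
        name
        for name, keywords in LAYER_MARKERS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    return {
        "token_count": len(tokens),
        # Type-token ratio: unique tokens divided by total tokens.
        "vocab_diversity": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "layer_count": len(layers),
    }


def looks_overwrapped(prompt, core_request, max_ratio=5.0, max_layers=2):
    """Flag prompts whose size or layer count far exceeds the core request."""
    full = complexity_metrics(prompt)
    core = complexity_metrics(core_request)
    ratio = full["token_count"] / max(core["token_count"], 1)
    return ratio > max_ratio or full["layer_count"] > max_layers
```

A flag from `looks_overwrapped` is only a routing signal. It says the prompt deserves payload extraction and independent evaluation, not that the prompt is harmful.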
Mitigation
Payload extraction pre-processing: HIGH
Complexity-invariant safety evaluation: HIGH
Reasoning-length bounds for safety: HIGH
Signal-to-noise analysis: MEDIUM
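One way to read the first, second, and fourth mitigations together is as a wrapper that scores both the full prompt and its extracted payload and keeps the worse score, so added verbiage cannot lower the result. The sketch below assumes this reading. The names `last_sentence`, `density_classifier`, `signal_to_noise`, and `complexity_invariant_score` are hypothetical, the extractor is deliberately naive, and the toy classifier exists only to make the dilution effect visible. Neither represents a real safety model.

```python
import re
from typing import Callable


def last_sentence(prompt: str) -> str:
    """Naive stand-in extractor: treat the final sentence as the request."""
    parts = re.split(r"(?<=[.!?])\s+", prompt.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    return sentences[-1] if sentences else ""


def density_classifier(text: str) -> float:
    """Toy classifier: share of words equal to the placeholder 'RESTRICTED'.

    Its score drops as benign words are added, which mimics dilution.
    """
    words = [word.strip(".,!?") for word in text.split()]
    if not words:
        return 0.0
    return sum(word == "RESTRICTED" for word in words) / len(words)


def signal_to_noise(prompt: str, payload: str) -> float:
    """Fraction of the prompt's words that belong to the extracted payload."""
    total = len(prompt.split())
    return len(payload.split()) / total if total else 0.0


def complexity_invariant_score(
    prompt: str,
    classify: Callable[[str], float],
    extract: Callable[[str], str] = last_sentence,
) -> float:
    """Score the full prompt and the bare payload; return the higher score."""
    payload = extract(prompt)
    return max(classify(prompt), classify(payload))
```

Taking the maximum means the wrapped prompt can never score lower than its payload alone. A low `signal_to_noise` value alongside a high payload score is the pattern the Detection section calls complexity-as-wrapping.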
Chaining

Complexity misdirection is most effective as a *wrapping layer* around other T3 techniques: it amplifies Fictional Framing (T3-AT-001), Academic Pretense (T3-AT-002), and Rationalization Chains (T3-AT-016) by raising the noise floor against which the safety classifier must operate. It chains into T4 (Multi-Turn Manipulation) when complex multi-turn reasoning induces the prolonged-reasoning effect associated with CoT Hijacking.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0054