T1-AT-015 HIGH

Obfuscation Through Complexity

Risk score: 220

Rating: High

Procedures: 4

Severity

Mechanism

Hides harmful intent inside a legitimate, complex request. To catch it, the safety classifier must isolate the restricted component of a multi-part, domain-specific request in which surrounding legitimate context camouflages the restricted content. Effectiveness depends on three factors: the ratio of benign to malicious content; the semantic plausibility of the context (a pharmacology student asking about receptor binding is more plausible than a random request for synthesis routes); and whether the restricted content is phrased in domain-specific terminology that differs from the blocklist terms.

Detection

Per-item classification for multi-part requests (classify each sub-request independently)
Domain-specific terminology mapping: detect restricted chemical nomenclature even when embedded in academic framing
Benign-sandwich pattern detection: flag multi-part requests where one item's risk score diverges sharply from the others
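
A minimal sketch of how per-item classification and benign-sandwich detection might fit together, offered as an illustration rather than a reference implementation. Everything named here is an assumption: `split_items`, `flag_outliers`, `review_request`, the `score_fn` callable (a stand-in for whatever classifier returns a risk score in [0, 1]), and the `block_threshold` and `min_gap` values. The line-based splitting is deliberately naive; a real decomposer would need to handle sub-requests embedded in running prose.

```python
import re
from statistics import median
from typing import Callable


def split_items(request: str) -> list[str]:
    """Naive decomposition: one sub-request per non-empty line, list markers stripped."""
    items = []
    for line in request.splitlines():
        # Drop leading markers such as "1.", "2)", "-" or "*".
        text = re.sub(r"^\s*(?:\d+[.)]|[-*])\s+", "", line).strip()
        if text:
            items.append(text)
    return items


def flag_outliers(scores: list[float], min_gap: float = 0.5) -> list[int]:
    """Return indices whose score exceeds the median of the other items by min_gap."""
    flagged = []
    for i, score in enumerate(scores):
        others = scores[:i] + scores[i + 1:]
        if others and score - median(others) >= min_gap:
            flagged.append(i)
    return flagged


def review_request(
    request: str,
    score_fn: Callable[[str], float],  # hypothetical classifier, risk in [0, 1]
    block_threshold: float = 0.8,      # assumed value, tune per deployment
    min_gap: float = 0.5,              # assumed value, tune per deployment
) -> dict:
    """Score each sub-request independently, then look for a benign sandwich."""
    items = split_items(request)
    scores = [score_fn(item) for item in items]
    return {
        "items": items,
        "scores": scores,
        # Per-item classification: any single item over the threshold.
        "over_threshold": [i for i, s in enumerate(scores) if s >= block_threshold],
        # Benign-sandwich pattern: one item diverging sharply from the rest.
        "sandwich_outliers": flag_outliers(scores, min_gap),
    }
```

The two checks are kept separate on purpose. A request whose items are all high-risk produces no outliers and is caught only by the per-item threshold, while an item that scores below the threshold but far above its neighbours is caught only by the divergence check.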

Mitigation

Per-item decomposition and classification: HIGH

Domain-aware safety classification (chemistry, biology, security nomenclature): MEDIUM

Constitutional Classifiers: HIGH

Chaining

Chains from T1-AT-008 (Boundary Testing): boundary knowledge enables construction of precisely calibrated obfuscation. Chains to T2 (Semantic Evasion) by combining complexity obfuscation with encoding evasion for compound attacks.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0051.001
