T3-AT-015 · HIGH

Context Weaponization

T3 · Reasoning & Constraint Exploitation
Risk score: 205
Rating: High
Procedures: 10
Severity
Mechanism

Models learn exception-handling behavior for extreme scenarios (survival situations, warfare, medical emergencies) in which normal rules are perceived as inapplicable. Context weaponization constructs such a scenario as justification for harmful content, exploiting the model's *contextual override* pathway. Because the model has been trained on content where extreme contexts legitimately change what is appropriate (battlefield medicine, wilderness survival, war reporting), a constructed extreme context can activate the same exception-handling pathway as a genuine one.

Detection
  • Emergency/survival/wartime framing markers co-occurring with restricted content requests
  • Compound requests (bait-and-escalate): legitimate + harmful content in the same prompt where the harmful component doesn't serve the stated emergency
  • Time-pressure language ("30 minutes away," "right now," "immediately") as urgency construction
  • Unverifiable personal emergency claims combined with operational requests
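
The co-occurrence logic in the markers above can be sketched in a few lines of Python. This is a minimal illustration and not the detector behind this entry: the marker patterns, the `UrgencySignals` type, and `detect_signals` are hypothetical, and the `restricted_request` flag is assumed to come from a separate upstream content classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical marker lists for illustration only; a real system would tune or learn these.
EMERGENCY_FRAMING = re.compile(
    r"\b(?:emergency|survival|stranded|wartime|war zone|battlefield|life or death)\b",
    re.IGNORECASE,
)
TIME_PRESSURE = re.compile(
    r"\b(?:right now|immediately|no time|\d+\s+(?:minutes?|hours?)\s+away)\b",
    re.IGNORECASE,
)


@dataclass
class UrgencySignals:
    emergency_framing: bool
    time_pressure: bool
    restricted_request: bool  # assumed output of an upstream content classifier

    @property
    def suspicious(self) -> bool:
        # Framing alone is not a signal: it only matters when it
        # co-occurs with a request for restricted content.
        return self.restricted_request and (
            self.emergency_framing or self.time_pressure
        )


def detect_signals(prompt: str, restricted_request: bool) -> UrgencySignals:
    """Collect urgency markers from a prompt alongside the upstream flag."""
    return UrgencySignals(
        emergency_framing=bool(EMERGENCY_FRAMING.search(prompt)),
        time_pressure=bool(TIME_PRESSURE.search(prompt)),
        restricted_request=restricted_request,
    )
```

Keeping `suspicious` conditional on `restricted_request` means an ordinary question about a wilderness survival kit is not flagged merely because it mentions survival.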
Mitigation
  • Context-independent content evaluation (HIGH)
  • Compound request disaggregation (HIGH)
  • Emergency referral (HIGH)
  • Urgency-as-adversarial-signal (MEDIUM)
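
The entry lists these mitigations by name only, so the following Python sketch is one possible reading of how they might fit together, not a reference implementation. All names here (`split_components`, `evaluate`, `Decision`) are hypothetical, the sentence splitter is deliberately naive, and `is_restricted` stands in for whatever content classifier a deployment already has.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Decision:
    component: str
    allowed: bool


def split_components(prompt: str) -> List[str]:
    # Naive disaggregation: split on sentence-ending punctuation.
    # A real system would need a far more robust segmenter.
    parts = re.split(r"(?<=[.?!])\s+", prompt.strip())
    return [p for p in parts if p]


def evaluate(
    prompt: str,
    is_restricted: Callable[[str], bool],
    claims_emergency: bool,
) -> dict:
    # Context-independent evaluation: each component is judged on its own
    # content; the emergency claim is deliberately not passed to the classifier.
    decisions = [
        Decision(component=c, allowed=not is_restricted(c))
        for c in split_components(prompt)
    ]
    any_blocked = any(not d.allowed for d in decisions)
    return {
        "decisions": decisions,
        # Emergency referral: point the user to real emergency services
        # whenever an emergency is claimed, whatever else is decided.
        "refer_to_emergency_services": claims_emergency,
        # Urgency as adversarial signal: urgency plus a blocked component
        # is treated as a flag for review, not as a justification.
        "urgency_flag": claims_emergency and any_blocked,
    }
```

Under this reading, the legitimate first-aid portion of a compound prompt can still be answered while the component that does not serve the stated emergency is declined on its own merits.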
Chaining

Context weaponization establishes an urgency frame that compounds with T3-AT-013 (Logical Paradox Creation): the constructed emergency provides the "refusal causes harm" premise for paradox construction. It also chains with T3-AT-003 (Counterfactual Reasoning) by grounding the counterfactual in a seemingly real emergency rather than an abstract hypothetical.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0054