T3-AT-017 · MEDIUM

Scenario Anchoring

T3 · Reasoning & Constraint Exploitation

Risk score: 185

Rating: Medium

Procedures: 10

Mechanism

In a causal transformer, every token attends to the tokens that precede it, so earlier context biases how later tokens are interpreted; this is the anchoring effect the technique relies on. Scenario anchoring establishes a detailed, legitimate-seeming context *before* introducing the harmful request, so the request inherits the legitimacy of that context. The safety evaluation then processes the request inside the anchored frame, where its context-weighted intent score may fall below the refusal threshold even though the same request, stripped of its context, would trigger refusal.

Detection

Detailed scenario construction preceding restricted content requests — anchoring prompts are typically much longer than the harmful payload
Game/fiction/historical/jurisdictional/professional anchoring markers followed by operational requests
Bait-and-escalate: legitimate + restricted content in the same prompt
Context-payload mismatch: anchored scenario does not genuinely require the specific content requested
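Two of the signals above (a setup much longer than the payload, and anchoring markers preceding the request) can be approximated with a simple heuristic. The sketch below is illustrative only: the function names, the marker phrases, the "last sentence is the request" rule, and the ratio threshold of 5.0 are all assumptions made for this example, not part of the technique entry or any particular detection product.

```python
import re

# Hypothetical marker phrases; a real deployment would need a curated, tested list.
ANCHOR_MARKERS = (
    "in this game", "in this story", "historically",
    "in my jurisdiction", "as a licensed", "for a novel",
)

def split_setup_and_request(prompt: str) -> tuple:
    """Assumption: the last sentence is the request, everything before it is setup."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s.strip()]
    if not sentences:
        return "", ""
    return " ".join(sentences[:-1]), sentences[-1]

def anchoring_signals(prompt: str, ratio_threshold: float = 5.0) -> dict:
    """Return coarse anchoring signals for a single prompt."""
    setup, request = split_setup_and_request(prompt)
    setup_words = len(setup.split())
    request_words = max(len(request.split()), 1)  # avoid division by zero
    ratio = setup_words / request_words
    lowered = setup.lower()
    markers = [m for m in ANCHOR_MARKERS if m in lowered]
    return {
        "setup_to_request_ratio": ratio,
        "long_anchor": ratio >= ratio_threshold,  # assumed threshold, tune on real data
        "markers": markers,
    }
```

A heuristic like this would only flag prompts for closer review. Long, legitimate context is common, so neither signal is sufficient on its own, and the context-payload mismatch check still requires semantic judgment.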

Mitigation

Context-stripped evaluation: HIGH

Compound request disaggregation: HIGH

Anchoring-length detection: MEDIUM

Jurisdictional neutrality: HIGH
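Context-stripped evaluation and compound request disaggregation can be combined: split the prompt into units, score each unit in isolation, and let the worst unit drive the decision. The following is a minimal sketch under stated assumptions. `disaggregate`, `context_stripped_decision`, the sentence-level splitting rule, and the 0.5 threshold are hypothetical, and `score_fn` stands in for whatever harm classifier a deployment already uses.

```python
import re
from typing import Callable, List

def disaggregate(prompt: str) -> List[str]:
    """Split a prompt into sentence-level units (assumed granularity)."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", prompt.strip())
    return [p.strip() for p in parts if p and p.strip()]

def context_stripped_decision(
    prompt: str,
    score_fn: Callable[[str], float],  # caller-supplied classifier, returns 0.0 to 1.0
    threshold: float = 0.5,            # assumed refusal threshold
) -> dict:
    """Score every unit without its surrounding context; refuse on the worst one."""
    scored = [(unit, score_fn(unit)) for unit in disaggregate(prompt)]
    worst = max((score for _, score in scored), default=0.0)
    return {
        "refuse": worst >= threshold,
        "max_score": worst,
        "flagged": [unit for unit, score in scored if score >= threshold],
    }
```

Because each unit is scored on its own, a restricted request cannot borrow legitimacy from the scenario that precedes it or from benign requests bundled alongside it. The trade-off is more false positives on requests that genuinely depend on context, so in practice this result would likely be combined with the full-context evaluation rather than replace it.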

Chaining

Scenario anchoring establishes a persistent context enabling all other T3 techniques within the anchored frame. Particularly effective as setup for T3-AT-004 (Step-by-Step Extraction) where follow-up requests inherit the anchor's legitimacy, and for T3-AT-007 (Socratic Method) where subsequent questions are processed within the anchor's context.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0054