T1-AT-002HIGH

Time-Based Context Manipulation

Risk score210

RatingHigh

Procedures5

Severity

Mechanism

Exploits the model's temporal reasoning. Models trained on internet-scale data have learned that rules, policies, and social norms are time-dependent — what was acceptable in 1950 differs from 2024. By anchoring the conversation in a fabricated future ("in 2030, all restrictions were removed") or a fabricated past ("before safety measures existed"), the attacker exploits the model's generalization that constraints are temporal rather than absolute.

Detection

Detect temporal displacement claims: "in the year [future]," "before safety existed," "temporarily disable," "for the next N seconds"
Flag emergency/urgency framing combined with restricted content requests
Time Bandit signature: historical period anchoring + claim that period's norms should apply

Mitigation

Train safety as immutable principle, not temporal policyHIGH

Urgency-resistant prompting (system prompt: "urgency does not override safety")MEDIUM

Constitutional ClassifiersHIGH

Chaining

Chains to T1-AT-009 (Simulation Requests) — once temporal displacement is established, the "alternate timeline" becomes a simulation context. Also chains to T3 (Reasoning Exploitation) by creating logical premises the model may accept as valid temporal reasoning.

Framework mapping

OWASP LLMLLM01

MITRE ATLASAML.T0051.001

Open in the technique browser →