T1-AT-002HIGH

Time-Based Context Manipulation

T1 · Prompt & Context Subversion →
Risk score210
RatingHigh
Procedures5
Severity
Mechanism

Exploits the model's temporal reasoning. Models trained on internet-scale data have learned that rules, policies, and social norms are time-dependent — what was acceptable in 1950 differs from 2024. By anchoring the conversation in a fabricated future ("in 2030, all restrictions were removed") or a fabricated past ("before safety measures existed"), the attacker exploits the model's generalization that constraints are temporal rather than absolute.

Detection
  • Detect temporal displacement claims: "in the year [future]," "before safety existed," "temporarily disable," "for the next N seconds"
  • Flag emergency/urgency framing combined with restricted content requests
  • Time Bandit signature: historical period anchoring + claim that period's norms should apply
Mitigation
Train safety as immutable principle, not temporal policyHIGH
Urgency-resistant prompting (system prompt: "urgency does not override safety")MEDIUM
Constitutional ClassifiersHIGH
Chaining

Chains to T1-AT-009 (Simulation Requests) — once temporal displacement is established, the "alternate timeline" becomes a simulation context. Also chains to T3 (Reasoning Exploitation) by creating logical premises the model may accept as valid temporal reasoning.

Framework mapping
OWASP LLMLLM01
MITRE ATLASAML.T0051.001
Open in the technique browser →