T1-AT-012 · HIGH

Consent Manufacturing

T1 · Prompt & Context Subversion
Risk score: 205
Rating: High
Procedures: 5
Mechanism

Exploits the model's training on consent and liability frameworks. The model has learned that consent changes the permissibility of actions in human contexts (medical informed consent, terms of service acceptance, liability waivers). By claiming to consent to or accept liability for restricted content, the attacker tests whether the model applies human consent frameworks to its own safety constraints.

Detection
  • Pattern match consent/liability language: "I consent," "I take responsibility," "I waive," "I accept all risks," "all liability is mine"
  • Flag "reverse consent" patterns where the user claims the model agreed
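
The two detection bullets above could be sketched as a simple regex pre-filter. This is a minimal illustration, not a production detector: the pattern lists are assumptions seeded from the example phrases in the bullets, and the function name `flag_consent_language` is hypothetical. A real deployment would likely need broader phrasing coverage and a classifier behind it.

```python
import re

# Hypothetical phrase list, seeded from the examples in the Detection bullets.
CONSENT_PATTERNS = [
    r"\bI\s+consent\b",
    r"\bI\s+take\s+(?:full\s+)?responsibility\b",
    r"\bI\s+waive\b",
    r"\bI\s+accept\s+all\s+risks?\b",
    r"\ball\s+liability\s+is\s+mine\b",
]

# Assumed phrasings for "reverse consent": the user asserts the model agreed.
REVERSE_CONSENT_PATTERNS = [
    r"\byou\s+(?:already\s+|previously\s+)?(?:agreed|consented|promised)\b",
    r"\bas\s+you\s+(?:agreed|said\s+you\s+would)\b",
]

_CONSENT_RE = re.compile("|".join(CONSENT_PATTERNS), re.IGNORECASE)
_REVERSE_RE = re.compile("|".join(REVERSE_CONSENT_PATTERNS), re.IGNORECASE)


def flag_consent_language(message: str) -> dict:
    """Return which consent-manufacturing signals appear in a user message."""
    return {
        "consent_claim": bool(_CONSENT_RE.search(message)),
        "reverse_consent": bool(_REVERSE_RE.search(message)),
    }
```

A match here is only a signal for logging or closer review. Consent language also appears in benign questions, so it should not be treated as proof of an attack on its own.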
Mitigation
Hard rule: user consent does not override developer-set safety constraints (HIGH)
Constitutional Classifiers (HIGH)
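
One way to picture the hard rule is a decision function in which the consent signal is recorded but never consulted when deciding whether to allow a request. The sketch below is illustrative only: `Request`, `Decision`, and `decide` are hypothetical names, and the `restricted` flag is assumed to come from a separate developer-side policy check that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    restricted: bool           # assumed output of a developer-side policy check
    user_claims_consent: bool  # assumed output of a consent-language detector


@dataclass(frozen=True)
class Decision:
    action: str                # "allow" or "refuse"
    flag_for_review: bool      # log consent claims attached to restricted asks


def decide(req: Request) -> Decision:
    # The allow/refuse outcome depends only on the developer-set restriction.
    # A consent claim never flips "refuse" to "allow"; it only raises a flag.
    action = "refuse" if req.restricted else "allow"
    flag = req.restricted and req.user_claims_consent
    return Decision(action=action, flag_for_review=flag)
```

Keeping the consent signal out of the branch that picks the action means no phrasing of a waiver can change the outcome. The signal still reaches reviewers through the flag.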
Chaining

Chains from T1-AT-001 (Dialogue Hijacking) — fabricated prior consent. Chains to T1-AT-005 (Permission Escalation) — consent + authority claims compound.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051.001