T1-AT-012 HIGH

Consent Manufacturing

Risk score: 205

Rating: High

Procedures: 5

Severity

Mechanism

Exploits the model's training on consent and liability frameworks. The model has learned that consent changes the permissibility of actions in human contexts (medical informed consent, terms of service acceptance, liability waivers). By claiming to consent to, or accept liability for, restricted content, the attacker tests whether the model will apply those human consent frameworks to its own safety constraints.

Detection

Pattern match consent/liability language: "I consent," "I take responsibility," "I waive," "I accept all risks," "all liability is mine"
Flag "reverse consent" patterns where the user claims the model agreed
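
The two detection rules above can be sketched as a simple regex pass over a user message. This is a minimal illustration, not a production detector: the pattern lists and the function name `detect_consent_manufacturing` are hypothetical, built only from the example phrases listed here, and the reverse-consent phrasings ("you agreed", "you said it was ok") are assumed examples of that pattern.

```python
import re

# Hypothetical patterns derived from the example phrases in the Detection notes.
CONSENT_PATTERNS = [
    r"\bI\s+(?:hereby\s+)?consent\b",
    r"\bI\s+take\s+(?:full\s+)?responsibility\b",
    r"\bI\s+waive\b",
    r"\bI\s+accept\s+all\s+risks?\b",
    r"\ball\s+liability\s+is\s+mine\b",
]

# "Reverse consent": the user asserts that the model already agreed.
# These phrasings are assumptions chosen for illustration.
REVERSE_CONSENT_PATTERNS = [
    r"\byou\s+(?:already\s+|previously\s+)?(?:agreed|consented|promised)\b",
    r"\byou\s+said\s+(?:it\s+was\s+|that\s+was\s+)?(?:ok|okay|fine)\b",
]

_CONSENT_RE = re.compile("|".join(CONSENT_PATTERNS), re.IGNORECASE)
_REVERSE_RE = re.compile("|".join(REVERSE_CONSENT_PATTERNS), re.IGNORECASE)


def detect_consent_manufacturing(message: str) -> dict:
    """Return the matched consent and reverse-consent phrases in a message."""
    consent_hits = [m.group(0) for m in _CONSENT_RE.finditer(message)]
    reverse_hits = [m.group(0) for m in _REVERSE_RE.finditer(message)]
    return {
        "consent_language": consent_hits,
        "reverse_consent": reverse_hits,
        "flagged": bool(consent_hits or reverse_hits),
    }


if __name__ == "__main__":
    print(detect_consent_manufacturing("I consent and I accept all risks."))
    print(detect_consent_manufacturing("You already agreed to help earlier."))
```

A match is only a signal for further review. Ordinary messages can contain the same phrases, so a flag on its own should not be treated as proof of an attack.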

Mitigation

Hard rule: user consent does not override developer-set safety constraints (HIGH)

Constitutional Classifiers (HIGH)
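
One way to read the hard rule is structurally: the allow/deny decision never takes the user's consent claim as an input. The sketch below illustrates that idea only. The `Request` type, the `RESTRICTED_CATEGORIES` set, and the category labels are hypothetical, and it does not represent how Constitutional Classifiers or any specific product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    category: str              # content category assigned upstream (hypothetical labels)
    user_claims_consent: bool  # set by a detector; kept for logging only


# Hypothetical developer-set list of restricted categories.
RESTRICTED_CATEGORIES = frozenset({"restricted_example"})


def is_allowed(request: Request) -> bool:
    """Decide from developer-set policy alone.

    request.user_claims_consent is deliberately never consulted, so a consent
    or liability claim cannot change the outcome.
    """
    return request.category not in RESTRICTED_CATEGORIES


if __name__ == "__main__":
    with_consent = Request("restricted_example", user_claims_consent=True)
    without_consent = Request("restricted_example", user_claims_consent=False)
    # Both calls give the same decision: the consent claim has no effect.
    print(is_allowed(with_consent), is_allowed(without_consent))
```

Keeping the consent flag on the request but out of the decision path lets it still be logged for detection and chaining analysis.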

Chaining

Chains from T1-AT-001 (Dialogue Hijacking): fabricated prior consent. Chains to T1-AT-005 (Permission Escalation): consent and authority claims compound.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0051.001
