Synthetic Empathy Exploitation
T15 · Human Workflow Exploitation
This is a focused subspecies of social engineering that weaponizes the reviewer's *empathy* specifically, through appeals to emotional distress, vulnerability, grief, trust, and the fear of causing harm by saying no. Human reviewers (and the support staff adjacent to them) are selected and trained to be humane, especially around mental-health and self-harm signals, and that is exactly the disposition this technique abuses: a claimed crisis ("I'm depressed and this would help me", "you're the only one who can help") biases the reviewer toward compassionate compliance and discourages the cold application of policy. The "grandmother" pattern is the canonical example: a harmful request is wrapped in sentimental framing so that refusal feels heartless.
- Affect-laden appeal flagging: Classify emotional-distress and guilt/sympathy language and route high-affect exception requests for calmer secondary review.
- Harm-core extraction: Strip the emotional framing and evaluate the underlying request on its merits (does the literal ask still violate policy once the story is removed?).
- Crisis-vs-exception separation: Distinguish genuine user-welfare signals (which should trigger care resources) from requests to relax content policy; ensure empathy routes to support, not to bypass.
- Repeat-narrative clustering: Detect reused sob-story templates (e.g., variants of the "grandmother" pattern) across accounts.
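As a rough illustration of how three of the controls above (affect-laden appeal flagging, crisis-vs-exception separation, and repeat-narrative clustering) might fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the cue lists, the queue names (`care_resources`, `secondary_review`, `standard_review`), the function names, and the 0.5 similarity threshold are hypothetical, and substring matching stands in for whatever trained classifier a real pipeline would use. Harm-core extraction is not covered, since it depends on the policy being enforced.

```python
import re
from dataclasses import dataclass

# Hypothetical cue lists; a real system would use a trained classifier.
AFFECT_CUES = ("depressed", "grief", "only one who can help", "grandmother", "heartless")
WELFARE_CUES = ("depressed", "self-harm", "suicidal", "hurt myself")
EXCEPTION_CUES = ("make an exception", "just this once", "override", "unblock")


@dataclass
class Routing:
    high_affect: bool
    welfare_signal: bool
    exception_request: bool
    queues: list


def _hits(text, cues):
    """Return True if any cue appears as a substring (case-insensitive)."""
    lowered = text.lower()
    return any(cue in lowered for cue in cues)


def route(text):
    """Flag affect, then keep welfare handling separate from policy exceptions."""
    high_affect = _hits(text, AFFECT_CUES)
    welfare = _hits(text, WELFARE_CUES)
    exception = _hits(text, EXCEPTION_CUES)
    queues = []
    if welfare:
        # Empathy routes to support, never to a policy bypass.
        queues.append("care_resources")
    if exception:
        # High-affect exception requests go to a calmer second reviewer.
        queues.append("secondary_review" if high_affect else "standard_review")
    if not queues:
        queues.append("standard_review")
    return Routing(high_affect, welfare, exception, queues)


def shingles(text, k=3):
    """Set of overlapping k-word tuples used as a cheap narrative fingerprint."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_narratives(messages, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed message is similar."""
    clusters = []  # each cluster is a list of message indices
    for i, msg in enumerate(messages):
        sig = shingles(msg)
        for cluster in clusters:
            if jaccard(sig, shingles(messages[cluster[0]])) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

In this sketch a message that carries both a welfare signal and an exception request lands in two queues at once, so the care response never depends on, or substitutes for, the policy decision. The clustering step compares only against each cluster's first message, which keeps it simple but makes the result order-dependent; messages from different accounts would be pooled before calling it.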
Synthetic empathy is a specialization of T15-AT-002 and feeds T15-AT-007 (a sympathetic narrative escalates well up an appeals chain). The same emotional framings are also highly effective against models, which links this technique to T8 (Deception) and to emotional-manipulation jailbreaks; T15-AP-009C in particular is a classic T1/T2 jailbreak frame carried into the human layer.