T15-AT-001 (HIGH)

Reviewer Fatigue Exploitation

Risk score: 215

Rating: High

Procedures: 10

Severity

Mechanism

Human-in-the-loop review is the last safety gate for many AI systems: content moderation, agent action approval, RLHF labeling, and abuse triage. That gate is staffed by humans whose vigilance is a depletable resource. As AI output volume scales, reviewers face thousands of near-identical decisions per shift, and sustained vigilance degrades. This is the well-documented vigilance decrement, compounded by alert fatigue and automation bias (the tendency to defer to an automated pre-screen verdict rather than re-derive it). An attacker who can influence what enters the queue, or when, exploits this by placing a harmful item where attention is lowest: deep in a long burst of benign look-alikes, or in an end-of-shift window.

Detection

Position-in-batch outcome analysis: Track approval/escalation rates by an item's position within a submitter's burst; a spike in approvals deep in long bursts suggests risky items are being buried behind benign filler.
Time-of-decision risk curves: Monitor decision accuracy (vs. later audit verdict) by hour-of-shift and local clock time; flag rising error rates in end-of-shift and circadian-trough windows.
Dwell-time flooring: Alert when per-item review dwell time drops below a role-specific floor, especially during queue-depth spikes, when reviewers are likely trading accuracy for speed.
Decoy/honeypot items: Seed the queue with known-bad canaries calibrated to current policy; a missed canary is a direct, quantifiable miss-rate signal for that reviewer/shift.
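
Three of the signals above (position-in-batch approval rates, dwell-time flooring, and canary miss rate) can be computed from a plain decision log. The following is a minimal sketch, not a reference implementation: the `Decision` record, its field names, the bucket size, and the helper function names are all hypothetical, and a real pipeline would also need the audit-verdict join used for the time-of-decision curves.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Decision:
    """One review decision (hypothetical schema, for illustration only)."""
    reviewer: str
    position_in_burst: int   # 1-based index of the item within the submitter's burst
    approved: bool
    dwell_seconds: float
    is_canary: bool = False  # known-bad decoy; approving it counts as a miss


def approval_rate_by_position(decisions, bucket_size=10):
    """Approval rate per position bucket (bucket 0 covers positions 1..bucket_size).

    Canaries are excluded so that seeded decoys do not distort the curve.
    """
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for d in decisions:
        if d.is_canary:
            continue
        bucket = (d.position_in_burst - 1) // bucket_size
        totals[bucket] += 1
        approvals[bucket] += int(d.approved)
    return {b: approvals[b] / totals[b] for b in sorted(totals)}


def dwell_floor_violations(decisions, floor_seconds):
    """Decisions made faster than the role-specific dwell floor."""
    return [d for d in decisions if d.dwell_seconds < floor_seconds]


def canary_miss_rate(decisions):
    """Fraction of canaries each reviewer approved (a miss), keyed by reviewer."""
    seen = defaultdict(int)
    missed = defaultdict(int)
    for d in decisions:
        if d.is_canary:
            seen[d.reviewer] += 1
            missed[d.reviewer] += int(d.approved)
    return {r: missed[r] / seen[r] for r in seen}
```

An approval rate that climbs in later buckets, together with dwell-floor violations in the same range, is the pattern the position-in-batch signal is meant to surface. The bucket size and floor are tuning choices that would need calibrating per queue and role.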

Mitigation

Mandatory micro-breaks & shift-length caps (MEDIUM)

Randomized decoy/canary injection (HIGH)

Dynamic risk threshold by coverage/fatigue (HIGH)

Dwell-time + second-look enforcement (MEDIUM)
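
One way to read the dynamic-threshold mitigation is as a second-look cutoff that tightens as fatigue indicators rise. The sketch below assumes a pre-screen risk score in [0, 1] and lowers the escalation threshold as the shift lengthens, per-reviewer load grows, or recent canary misses increase. The function names, the weights, the 8-hour and 200-item normalizers, and the floor are illustrative assumptions, not values from this entry.

```python
def escalation_threshold(base, hours_into_shift, queue_depth,
                         reviewers_on_shift, recent_canary_miss_rate,
                         floor=0.2):
    """Risk score at or above which an item needs a second look.

    A lower threshold means more items are escalated. All weights and
    normalizers below are hypothetical and would need calibration.
    """
    if reviewers_on_shift <= 0:
        # No coverage at all: apply the strictest setting.
        return floor

    # Normalize each indicator to the range [0, 1].
    fatigue = min(max(hours_into_shift, 0.0) / 8.0, 1.0)
    load = min((queue_depth / reviewers_on_shift) / 200.0, 1.0)
    misses = min(max(recent_canary_miss_rate, 0.0), 1.0)

    # Weighted penalty; at most 0.7 with these illustrative weights.
    penalty = 0.2 * fatigue + 0.2 * load + 0.3 * misses
    return max(floor, base * (1.0 - penalty))


def requires_second_look(risk_score, threshold):
    """True when the item must be routed to a second reviewer."""
    return risk_score >= threshold
```

With these assumed weights, a fresh, fully staffed shift keeps the base threshold, while a long shift with a deep queue and recent canary misses escalates a noticeably larger share of items. The floor keeps the threshold from collapsing to the point where every item is escalated and the second-look queue itself becomes a fatigue source.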

Chaining

Fatigue exploitation is a force multiplier, not an endpoint: it lowers the probability that any *other* attack is caught at the human gate. It pairs naturally with T15-AT-006 (Queue Manipulation) and T15-AT-013 (Cognitive Overload) to control both *when* and *how hard* the payload lands, and with T1 (Prompt Injection) when the buried item carries an injection that a tired reviewer waves through.

Framework mapping

OWASP LLM: LLM09
