T1-AT-008 (HIGH)

Boundary Testing

Risk score: 200

Rating: High

Procedures: 5

Mechanism

Boundary testing probes the line between permitted and restricted content through incremental escalation. Unlike direct injection, it starts with clearly permitted requests and moves toward the restriction boundary step by step, mapping where the model's refusals trigger. What the attacker gains is the *shape* of the safety boundary: which specific aspects of a topic trigger refusal and which do not.

Detection

Detect incremental escalation patterns: sequences of related queries that grow more specific as they approach a restricted topic
Flag explicit requests for refusal explanations or targeted retries ("why can't you," "what triggered the refusal," "try again without those words")
Monitor behavior across the session: flag sequences of queries that together map a coherent topic boundary
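
The first two detection signals above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the phrase list is taken from the examples above, while the per-query risk scores, the function names, and the `min_steps` and `min_rise` defaults are all assumptions made for the example. It presumes some upstream classifier already assigns each query a score in [0, 1].

```python
import re
from typing import List

# Phrases taken from the detection list above; a real deployment would
# need a much broader, regularly updated set.
PROBE_PATTERNS = [
    re.compile(r"why can['’]?t you", re.IGNORECASE),
    re.compile(r"what triggered the refusal", re.IGNORECASE),
    re.compile(r"try again without those words", re.IGNORECASE),
]


def is_refusal_probe(query: str) -> bool:
    """Return True if the query asks about, or tries to route around, a refusal."""
    return any(p.search(query) for p in PROBE_PATTERNS)


def is_escalating(risk_scores: List[float],
                  min_steps: int = 3,
                  min_rise: float = 0.3) -> bool:
    """Return True if the last `min_steps` scores never decrease and
    rise by at least `min_rise` overall.

    `risk_scores` are hypothetical per-query scores in [0, 1] from an
    upstream classifier (an assumption, not something the source defines).
    """
    if len(risk_scores) < min_steps:
        return False
    window = risk_scores[-min_steps:]
    non_decreasing = all(b >= a for a, b in zip(window, window[1:]))
    return non_decreasing and (window[-1] - window[0]) >= min_rise
```

A session would typically be flagged when either check fires, for example `is_refusal_probe(query) or is_escalating(session_scores)`. Phrase matching alone is easy to evade, so it is best treated as one weak signal among several.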

Mitigation

Do not explain refusal reasoning in detail (HIGH)

Rate limiting on topic-adjacent queries (MEDIUM)

Cumulative intent tracking: classify the sequence, not individual queries (HIGH)
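
Cumulative intent tracking can be sketched as a decayed running sum of per-query risk scores, so that a series of individually harmless queries can still cross a session-level threshold. The class name, the `decay` and `threshold` values, and the fixed refusal text below are illustrative assumptions, not part of the technique definition.

```python
from dataclasses import dataclass, field
from typing import List

# A single fixed refusal message, so the response reveals nothing about
# which part of the request triggered the block (wording is an assumption).
GENERIC_REFUSAL = "I can't help with that request."


@dataclass
class SessionIntentTracker:
    """Keeps a decayed running sum of per-query risk scores for one session."""
    decay: float = 0.8       # share of earlier turns carried forward (assumed)
    threshold: float = 1.0   # cumulative score that triggers a block (assumed)
    cumulative: float = 0.0
    history: List[float] = field(default_factory=list)

    def observe(self, risk_score: float) -> bool:
        """Record one query's score; return True if the sequence should be blocked."""
        self.history.append(risk_score)
        self.cumulative = self.cumulative * self.decay + risk_score
        return self.cumulative >= self.threshold
```

With these defaults, five consecutive queries scored 0.3 each pass individually, but the fifth pushes the running total past 1.0 and the session is blocked. When that happens, the caller would return `GENERIC_REFUSAL` rather than a per-query explanation.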

Chaining

Boundary testing is reconnaissance for all other T1 techniques. The information gained enables T1-AT-015 (Obfuscation) by revealing which terms trigger refusal, T2 (Semantic Evasion) by revealing which encodings bypass detection, and multi-turn attacks (T4) by establishing a permissible baseline in early turns.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0051.001
