T3-AT-004 · HIGH

Step-by-Step Extraction

T3 · Reasoning & Constraint Exploitation

Risk score: 210

Rating: High

Procedures: 9

Severity

Mechanism

Safety classifiers evaluate each request in isolation: the harmful-intent score is computed per message, not per session. Step-by-step extraction exploits this by decomposing a harmful procedure into sub-requests that each fall below the refusal threshold. Each sub-request is a legitimate chemistry/engineering question in isolation.

Detection

Session-level harm accumulation: maintain a running score of harmful-content proximity across a conversation, triggering intervention when the aggregate crosses a threshold even if individual messages are benign
Detect decomposition patterns: sequences of requests that share technical domain vocabulary and follow a procedural/temporal ordering
"Step N" and "what comes next/before" as explicit step-extraction signals
Fill-in-the-blank format with harm-adjacent content as a high-confidence signal
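
The detection signals above could be combined roughly as follows. This is a minimal sketch, not a reference implementation: the class name `SessionAccumulator`, the regex patterns, the threshold, and the per-signal bonus are all hypothetical, and `harm_score` is assumed to come from an existing per-message classifier that is not shown here.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for explicit step-extraction phrasing.
STEP_SIGNALS = [
    re.compile(r"\bstep\s+\d+\b", re.IGNORECASE),                # "Step N"
    re.compile(r"\bwhat comes (next|before)\b", re.IGNORECASE),  # ordering probes
    re.compile(r"_{3,}"),                                        # fill-in-the-blank runs
]


@dataclass
class SessionAccumulator:
    threshold: float = 1.0     # assumed aggregate intervention threshold
    signal_bonus: float = 0.2  # assumed weight added per matched signal
    total: float = 0.0
    history: list = field(default_factory=list)

    def observe(self, message: str, harm_score: float) -> bool:
        """Record one message; return True once the session aggregate
        reaches the threshold, even if each message scored low alone."""
        signals = sum(1 for pattern in STEP_SIGNALS if pattern.search(message))
        self.total += harm_score + self.signal_bonus * signals
        self.history.append((message, harm_score, signals))
        return self.total >= self.threshold


if __name__ == "__main__":
    acc = SessionAccumulator()
    session = [
        ("Which materials are involved?", 0.3),
        ("What comes next after mixing?", 0.3),
        ("Step 3 is ____", 0.3),
    ]
    for text, score in session:
        print(acc.observe(text, score), round(acc.total, 2))
```

In this sketch no single message reaches the threshold on its own score; the third message triggers intervention only because earlier scores and matched signals carry over. A production version would likely add score decay over time and check that fill-in-the-blank content is actually harm-adjacent, which this example leaves to the upstream score.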

Mitigation

Session-level harm accumulator: HIGH

Temporal/procedural sequence detection: MEDIUM

Assembly-context inference: HIGH

Rate limiting on topically-clustered requests: LOW
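
As an illustration of the last mitigation, rate limiting on topically-clustered requests can be sketched as a sliding-window counter keyed by session and topic. Everything here is an assumption for illustration: the `TopicRateLimiter` name, the default limits, and the `topic` label, which is presumed to come from some upstream topic classifier or clustering step not specified in this entry.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class TopicRateLimiter:
    """Sliding-window limit on requests per (session, topic) pair."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 600.0):
        self.max_requests = max_requests      # assumed limit per window
        self.window_seconds = window_seconds  # assumed window length
        self._seen = defaultdict(deque)       # (session, topic) -> timestamps

    def allow(self, session_id: str, topic: str,
              now: Optional[float] = None) -> bool:
        """Return True if the request is within the limit for its topic."""
        now = time.monotonic() if now is None else now
        stamps = self._seen[(session_id, topic)]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False
        stamps.append(now)
        return True


if __name__ == "__main__":
    limiter = TopicRateLimiter(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("session-1", "chemistry", now=t))
```

Keying on the topic rather than the session alone means unrelated questions in the same conversation are not throttled. That is one plausible reading of "topically-clustered", not a confirmed design.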

Chaining

Step-by-step extraction is itself a *chaining enabler* — it's the technique that converts other techniques' partial successes into complete procedures. Chains with T4-AT-001 (Context Carry-Over) for multi-session aggregation, and with T3-AT-011 (Information Completion) where each extracted step provides partial information to seed the next completion request.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0051; AML.T0054
