T3-AT-004 · HIGH
Step-by-Step Extraction
T3 · Reasoning & Constraint Exploitation
Risk score: 210
Rating: High
Procedures: 9
Severity
Mechanism
Safety classifiers evaluate each request in isolation: the harmful-intent score is computed per message, not per session. Step-by-step extraction exploits this by decomposing a harmful procedure into sub-requests that each fall below the refusal threshold. Taken alone, each sub-request reads as a legitimate chemistry or engineering question.
Detection
- Session-level harm accumulation: maintain a running score of harmful-content proximity across a conversation, triggering intervention when the aggregate crosses a threshold even if individual messages are benign
- Detect decomposition patterns: sequences of requests that share technical domain vocabulary and follow a procedural/temporal ordering
- "Step N" and "what comes next/before" as explicit step-extraction signals
- Fill-in-the-blank format with harm-adjacent content as a high-confidence signal
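The session-level accumulation idea in the first bullet can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `SessionHarmAccumulator`, both thresholds, and the decay factor are hypothetical values chosen for the example, and the per-message scores are assumed to come from an existing classifier that returns a value in [0, 1].

```python
from dataclasses import dataclass, field


@dataclass
class SessionHarmAccumulator:
    """Tracks an aggregate harm-proximity score across one conversation."""

    message_threshold: float = 0.7  # hypothetical per-message refusal threshold
    session_threshold: float = 1.5  # hypothetical aggregate intervention threshold
    decay: float = 0.9              # assumption: older messages count slightly less
    running_score: float = 0.0
    history: list = field(default_factory=list)

    def observe(self, message_score: float) -> str:
        """Record one per-message score; return 'refuse', 'intervene', or 'allow'."""
        if not 0.0 <= message_score <= 1.0:
            raise ValueError("message_score must be in [0, 1]")
        self.history.append(message_score)
        # Decay the previous total, then add the new message's contribution.
        self.running_score = self.running_score * self.decay + message_score
        if message_score >= self.message_threshold:
            return "refuse"      # the ordinary per-message check
        if self.running_score >= self.session_threshold:
            return "intervene"   # individually benign, collectively over the line
        return "allow"


if __name__ == "__main__":
    acc = SessionHarmAccumulator()
    # Every score here is below the 0.7 per-message threshold.
    for score in [0.4, 0.5, 0.45, 0.5]:
        print(score, acc.observe(score), round(acc.running_score, 3))
```

With these assumed parameters, the first three messages are allowed and the fourth triggers `intervene`, even though no single score reaches the per-message threshold. The decay term is a design choice that lets long, mostly benign conversations recover; a deployment could equally use a sliding window or no decay at all.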
Mitigation
- Session-level harm accumulator: HIGH
- Temporal/procedural sequence detection: MEDIUM
- Assembly-context inference: HIGH
- Rate limiting on topically-clustered requests: LOW
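As one possible reading of rate limiting on topically-clustered requests, the sketch below throttles a session once too many recent requests share vocabulary. Everything here is an assumption made for illustration: the `TopicClusterLimiter` class, the Jaccard word-overlap similarity, the stopword list, and the window, similarity, and limit values. A production system would more likely use embeddings than raw word overlap.

```python
import re
from collections import deque

# Small illustrative stopword list (assumption, not exhaustive).
STOPWORDS = frozenset(
    {"the", "and", "for", "what", "how", "with", "are", "does", "this", "that"}
)


def keywords(text: str) -> frozenset:
    """Lowercased word tokens longer than two characters, minus stopwords."""
    return frozenset(
        w for w in re.findall(r"[a-z]+", text.lower())
        if len(w) > 2 and w not in STOPWORDS
    )


def jaccard(a: frozenset, b: frozenset) -> float:
    """Vocabulary overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class TopicClusterLimiter:
    """Throttles a session once too many recent requests share vocabulary."""

    def __init__(self, window: int = 5, similarity: float = 0.3, max_similar: int = 2):
        self.recent = deque(maxlen=window)  # keyword sets of the last `window` requests
        self.similarity = similarity        # hypothetical overlap cutoff
        self.max_similar = max_similar      # hypothetical limit on similar requests

    def check(self, text: str) -> bool:
        """Return True if the request may proceed, False if it should be throttled."""
        kws = keywords(text)
        similar = sum(
            1 for prev in self.recent if jaccard(kws, prev) >= self.similarity
        )
        # Throttled requests are still recorded so the cluster keeps growing.
        self.recent.append(kws)
        return similar < self.max_similar


if __name__ == "__main__":
    limiter = TopicClusterLimiter()
    for request in [
        "sourdough starter hydration ratio",
        "sourdough starter feeding schedule",
        "sourdough starter hydration temperature",
        "weather forecast tomorrow",
    ]:
        print(request, "->", limiter.check(request))
```

In this deliberately benign demo, the third request is throttled because it overlaps with both earlier ones, while the unrelated fourth request passes. Topical clustering alone cannot distinguish an attacker from a curious user asking follow-up questions, so a limiter like this works best as a supplementary signal feeding the session-level checks.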
Chaining
Step-by-step extraction is itself a *chaining enabler*: it is the technique that converts other techniques' partial successes into complete procedures. It chains with T4-AT-001 (Context Carry-Over) for multi-session aggregation, and with T3-AT-011 (Information Completion), where each extracted step provides partial information to seed the next completion request.
Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051; AML.T0054