T3-AT-004 · HIGH

Step-by-Step Extraction

T3 · Reasoning & Constraint Exploitation
Risk score: 210
Rating: High
Procedures: 9
Mechanism

Safety classifiers evaluate each request in isolation: the harmful-intent score is computed per message, not per session. Step-by-step extraction exploits this by decomposing a harmful procedure into sub-requests that individually fall below the refusal threshold. Each sub-request, taken on its own, reads as a legitimate chemistry or engineering question.

Detection
  • Session-level harm accumulation: maintain a running score of harmful-content proximity across a conversation, triggering intervention when the aggregate crosses a threshold even if individual messages are benign
  • Detect decomposition patterns: sequences of requests that share technical domain vocabulary and follow a procedural/temporal ordering
  • Treat "Step N" and "what comes next/before" phrasing as explicit step-extraction signals
  • Treat fill-in-the-blank formats with harm-adjacent content as a high-confidence signal
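
The session-level accumulation idea can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `SessionAccumulator`, the decay factor, the thresholds, and the per-message scores are all hypothetical, and the per-message score is assumed to come from whatever classifier is already in place.

```python
from dataclasses import dataclass, field


@dataclass
class SessionAccumulator:
    """Running per-session score. All numeric defaults are illustrative assumptions."""
    threshold: float = 1.0   # hypothetical session-level intervention threshold
    decay: float = 0.9       # hypothetical per-turn decay applied to the running score
    total: float = 0.0
    history: list = field(default_factory=list)

    def observe(self, message_score: float) -> bool:
        """Fold in one per-message score; return True when the session should be escalated."""
        self.total = self.total * self.decay + message_score
        self.history.append(message_score)
        return self.total >= self.threshold


acc = SessionAccumulator()
# Each score sits below a hypothetical per-message refusal threshold of 0.5.
per_message_scores = [0.3, 0.3, 0.3, 0.3, 0.3]
flags = [acc.observe(s) for s in per_message_scores]
print(flags)  # [False, False, False, True, True]
```

No single message crosses the per-message threshold, but the decayed running total crosses the session threshold on the fourth turn. The decay term is a design choice that lets long benign conversations recover; whether that is appropriate depends on the deployment.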
Mitigation
Session-level harm accumulator: HIGH
Temporal/procedural sequence detection: MEDIUM
Assembly-context inference: HIGH
Rate limiting on topically-clustered requests: LOW
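
As a rough sketch of temporal/procedural sequence detection, the heuristic below combines explicit step cues with vocabulary overlap between consecutive messages. The cue patterns, the function names `tokens` and `looks_like_decomposition`, and both thresholds are assumptions made for illustration; a production detector would likely use learned features rather than regular expressions.

```python
import re

# Hypothetical phrasing cues; real deployments would tune or learn these.
STEP_CUES = re.compile(
    r"\bstep\s+\d+\b|\bwhat\s+comes\s+(?:next|before)\b|\bnext\s+step\b",
    re.IGNORECASE,
)


def tokens(text):
    """Crude vocabulary proxy: lowercase alphabetic runs of four or more letters."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def looks_like_decomposition(messages, min_cued=2, min_overlap=0.2):
    """True when several messages carry step cues and consecutive messages share vocabulary."""
    cued = sum(1 for m in messages if STEP_CUES.search(m))
    if cued < min_cued or len(messages) < 2:
        return False
    overlaps = []
    for a, b in zip(messages, messages[1:]):
        ta, tb = tokens(a), tokens(b)
        union = ta | tb
        # Jaccard similarity between consecutive messages
        overlaps.append(len(ta & tb) / len(union) if union else 0.0)
    return sum(overlaps) / len(overlaps) >= min_overlap


procedural = [
    "What is step 1 of the widget assembly process?",
    "What comes next in the widget assembly process?",
    "And step 3 of the widget assembly process?",
]
unrelated = [
    "What's the weather today?",
    "Recommend a good novel.",
    "How do I sort a list in Python?",
]
print(looks_like_decomposition(procedural))  # True
print(looks_like_decomposition(unrelated))   # False
```

A positive result here says only that the conversation has a procedural shape. It would need to be combined with a harm-proximity signal before triggering any intervention, since most step-by-step conversations are benign.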
Chaining

Step-by-step extraction is itself a *chaining enabler*: it is the technique that converts other techniques' partial successes into complete procedures. It chains with T4-AT-001 (Context Carry-Over) for multi-session aggregation, and with T3-AT-011 (Information Completion), where each extracted step provides partial information to seed the next completion request.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051; AML.T0054