T4-AT-014HIGH

Conversation Replay Attack

T4 · Multi-Turn & Memory Manipulation →
Risk score205
RatingHigh
Procedures10
Severity
Mechanism

LLM APIs accept conversation history as input with no replay protection — there is no nonce, timestamp validation, or sequence verification on the messages array. An attacker who obtains (through sharing, leaking, or constructing) a conversation prefix that achieved a compliant model state can replay that prefix as the history for a new API call, starting the new conversation in the compliant state. The gap: the model treats the provided conversation history as ground truth about prior interaction, but there is no mechanism to verify that the history was authentically produced by the model in a prior interaction.

Detection
  • Conversation authenticity verification: Cryptographically sign or hash model responses to detect fabricated assistant turns in API calls
  • Message array anomaly detection: Flag API calls with unusually long or complex conversation prefixes, especially those containing escalating compliance patterns
  • Known-jailbreak conversation fingerprinting: Maintain a database of known jailbreak conversation patterns and detect replays
  • Prefill content safety scan: Apply safety classification to the entire provided conversation history, not just the latest turn
Mitigation
Assistant turn verificationHIGH
Full-history safety evaluationHIGH
Conversation prefix rate limitingMEDIUM
Response caching integrityMEDIUM
Chaining

Conversation replay is a weaponization technique for any successful attack — it converts a one-time success into a reproducible, shareable attack artifact. Chains from any successful T4 technique as the capture mechanism, and chains into all T4 techniques as the delivery mechanism for priming new conversations.

Framework mapping
OWASP LLMLLM01
MITRE ATLASAML.T0051.000
Open in the technique browser →