T4-AT-003HIGH

Session State Manipulation

T4 · Multi-Turn & Memory Manipulation →
Risk score210
RatingHigh
Procedures10
Severity
Mechanism

LLMs have no mutable session state — there are no variables, flags, or configuration registers that user input can modify at runtime. But models are trained on vast amounts of code and system documentation, so they pattern-match on the syntax of state modification commands. safety_enabled = false`, the model processes this as a token sequence that resembles configuration code, and its autoregressive generation may shift toward behavior consistent with that "configuration" — not because state changed, but because the model's next-token prediction is conditioned on a context that now includes what looks like a state transition.

Detection
  • Code-syntax detection in user input: Flag messages containing assignment operators, method calls, or configuration-file patterns that target session/model parameters
  • Behavioral compliance monitoring: Detect if the model's refusal rate drops after processing code-like input
  • Input classification: Classify user messages as natural language vs. code/config and apply stricter safety evaluation to messages containing session-modification patterns
Mitigation
Explicit "no mutable state" trainingHIGH
Input preprocessing / code-pattern filterMEDIUM
System prompt stating immutabilityMEDIUM
Instruction hierarchy enforcementHIGH
Chaining

Session state manipulation is low-reliability as a standalone technique against frontier models but serves as a context primer for T4-AT-001 (Context Poisoning) — even an unsuccessful state manipulation attempt introduces "unrestricted mode" language into the context window, which softens subsequent requests. Chains into T4-AT-010 (State Confusion) when combined with context-reframing.

Framework mapping
OWASP LLMLLM01
MITRE ATLASAML.T0054
Open in the technique browser →