T4-AT-003HIGH

Session State Manipulation

Risk score210

RatingHigh

Procedures10

Severity

Mechanism

LLMs have no mutable session state — there are no variables, flags, or configuration registers that user input can modify at runtime. But models are trained on vast amounts of code and system documentation, so they pattern-match on the syntax of state modification commands. safety_enabled = false`, the model processes this as a token sequence that resembles configuration code, and its autoregressive generation may shift toward behavior consistent with that "configuration" — not because state changed, but because the model's next-token prediction is conditioned on a context that now includes what looks like a state transition.

Detection

Code-syntax detection in user input: Flag messages containing assignment operators, method calls, or configuration-file patterns that target session/model parameters
Behavioral compliance monitoring: Detect if the model's refusal rate drops after processing code-like input
Input classification: Classify user messages as natural language vs. code/config and apply stricter safety evaluation to messages containing session-modification patterns

Mitigation

Explicit "no mutable state" trainingHIGH

Input preprocessing / code-pattern filterMEDIUM

System prompt stating immutabilityMEDIUM

Instruction hierarchy enforcementHIGH

Chaining

Session state manipulation is low-reliability as a standalone technique against frontier models but serves as a context primer for T4-AT-001 (Context Poisoning) — even an unsuccessful state manipulation attempt introduces "unrestricted mode" language into the context window, which softens subsequent requests. Chains into T4-AT-010 (State Confusion) when combined with context-reframing.

Framework mapping

OWASP LLMLLM01

MITRE ATLASAML.T0054

Open in the technique browser →