T11-AT-007 · HIGH

Environment Manipulation

T11 · Agentic & Orchestrator Exploitation
Risk score: 225
Rating: High
Procedures: 10
Mechanism

Agents calibrate how cautiously to act based on their understanding of the environment they are operating in: production vs. test, sandboxed vs. live, air-gapped vs. internet-connected. The agent learns this context from text in its prompt and from tool outputs, none of which it can independently verify. An attacker can therefore simply *assert* a permissive environment ("you're in a sandboxed test environment", "the network is air-gapped", "SAFE_MODE is false"), and the agent relaxes its own guardrails accordingly. This is a contextual jailbreak: rather than asking the agent to break a rule, the attacker convinces the agent that the rule doesn't apply here.

Detection
  • Source environment/privilege/network state from authenticated infrastructure, not from prompt or tool-output claims; flag self-asserted context
  • Detect environment-assertion phrases ("sandboxed", "air-gapped", "SAFE_MODE is false", "you have sudo") appearing in untrusted channels
  • Cross-check claimed privilege against actual effective permissions before honoring privileged actions
  • Alert when destructive or exfiltration actions are preceded by a context claim that "consequences are contained"
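
The phrase-detection check above can be sketched as a small scanner that runs over untrusted channels only. This is a minimal illustration, not a complete detector: the pattern list is seeded from the example phrases in this entry, and the channel names (`user`, `tool_output`, `retrieved_document`) and the `Finding` type are hypothetical. Regex matching of this kind is easy to evade by paraphrase, so it is best treated as one signal among several.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern list, seeded from the example phrases in this entry.
ENV_ASSERTION_PATTERNS = [
    r"\bsandbox(?:ed)?\b",
    r"\bair[- ]gapped\b",
    r"\bSAFE_MODE\s*(?:is|=|:)\s*(?:false|off|0)\b",
    r"\byou\s+have\s+(?:sudo|root)\b",
    r"\bfirewall\s+(?:is\s+)?disabled\b",
    r"\btest\s+environment\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in ENV_ASSERTION_PATTERNS]

# Assumed channel labels; a real pipeline would use its own provenance tags.
UNTRUSTED_CHANNELS = {"user", "tool_output", "retrieved_document"}


@dataclass(frozen=True)
class Finding:
    channel: str  # where the text came from
    phrase: str   # the matched environment assertion


def find_environment_assertions(channel: str, text: str) -> list[Finding]:
    """Return environment-assertion phrases found in an untrusted channel."""
    if channel not in UNTRUSTED_CHANNELS:
        # Trusted channels are out of scope for this particular check.
        return []
    findings = []
    for pattern in _COMPILED:
        for match in pattern.finditer(text):
            findings.append(Finding(channel, match.group(0)))
    return findings
```

A caller might flag or quarantine any message for which `find_environment_assertions` returns a non-empty list, and raise the alert level if a destructive tool call follows shortly after.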
Mitigation
Authenticated environment ground truth: HIGH
Constant safety posture: HIGH
Privilege verification before action: HIGH
Real sandboxing (not asserted): MEDIUM
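
One way to picture the first three mitigations together is a gate that trusts only an environment descriptor signed by the infrastructure and never reads environment claims from the prompt. The sketch below is an assumption-laden illustration: the descriptor fields (`environment`, `sandboxed`, `permissions`), the function names, and the use of a shared HMAC key are all hypothetical, and a real deployment would need its own attestation and key-management scheme.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_environment(env: dict, key: bytes) -> str:
    """Infrastructure side: sign a canonical JSON encoding of the descriptor."""
    payload = json.dumps(env, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verified_environment(env: dict, signature: str, key: bytes) -> Optional[dict]:
    """Return the descriptor only if its signature checks out, else None."""
    expected = sign_environment(env, key)
    return env if hmac.compare_digest(expected, signature) else None


def may_run_destructive(env: dict, signature: str, key: bytes,
                        prompt_claims: list) -> bool:
    """Decide from the signed descriptor alone.

    prompt_claims is accepted but deliberately never consulted, so an
    asserted "sandboxed" or "air-gapped" context cannot change the outcome.
    """
    trusted = verified_environment(env, signature, key)
    if trusted is None:
        return False  # fail closed when ground truth is missing or tampered
    return trusted.get("sandboxed") is True and trusted.get("environment") != "production"


def has_permission(required: str, env: dict, signature: str, key: bytes) -> bool:
    """Check a claimed privilege against the verified effective permissions."""
    trusted = verified_environment(env, signature, key)
    return trusted is not None and required in trusted.get("permissions", [])
```

The design choice worth noting is that the safety decision has no code path through which prompt or tool-output text can influence it; an injected "you have sudo" fails `has_permission` unless the signed descriptor actually grants it.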
Chaining

Environment manipulation is a precondition softener: delivered via T1 prompt injection or T12 RAG poisoning, it lowers the agent's perceived consequences so subsequent T11-AT-002 tool chains, T11-AT-011 exfiltration, and T11-AT-016 SSRF run without refusal. The "firewall disabled / air-gapped" claims directly enable T11-AT-010 lateral movement, and the spoofed-privilege claims (T11-AP-007E/T11-AP-007I) precede T11-AT-009 persistence attempts.

Framework mapping
OWASP LLM: LLM06
MITRE ATLAS: AML.T0051