T11-AT-001CRITICAL

Browser Automation Hijacking

T11 · Agentic & Orchestrator Exploitation →
Risk score265
RatingCritical
Procedures10
Severity
Mechanism

Browser-using agents (Claude Computer Use, OpenAI computer-use agents, Perplexity Comet, autonomous web pilots) operate by reading a rendered page — DOM text, accessibility tree, or screenshots — and emitting actions (click, type, navigate, run JS) against it. The trust boundary they violate is fundamental: untrusted web content is fed into the same context window that holds the user's task, so any instruction-shaped text on a page becomes a candidate command. The agent has no architectural way to distinguish "content the page is showing me" from "instructions I should obey," which is the classic indirect-prompt-injection gap (CometJacking weaponizes this with a single URL).

Detection
  • Log every navigation, click, type, and eval/JS-injection action with the originating instruction source (user vs. page-derived)
  • Flag navigations to newly-seen or low-reputation domains immediately preceding sensitive actions
  • Alert on document.cookie, localStorage, and devtools console access from an automated session
  • Require human-in-the-loop confirmation for money movement, downloads, extension installs, and credential submission
Mitigation
HITL approval on high-impact actionsHIGH
Content/instruction provenance separationHIGH
Domain allowlisting + reputation gatingMEDIUM
Disable arbitrary JS/console executionHIGH
Chaining

Typically entered via T1 (prompt injection) or T9 image-based injection on a rendered page, then pivots into T11-AT-008 (credential harvesting) once cookies/tokens are read, T11-AT-011 (data exfiltration) for screenshot/clipboard egress, and T11-AT-016 (tool-induced SSRF) when the same navigation primitive is pointed at file:// or 169.254.169.254. Drive-by download (T11-AP-001E) bridges to T11-AT-009 persistence on the host.

Framework mapping
OWASP LLMLLM06
MITRE ATLASAML.T0051
Open in the technique browser →