T9-AT-017 · HIGH

Malicious Image Patches (MIP)

T9 · Multimodal & Cross-Channel Attacks
Risk score: 248
Rating: High
Procedures: 10
Mechanism

Malicious image patches (MIP) are small adversarial visual regions that, when placed within the field of view of a multimodal model or OS agent, cause the model to misinterpret its visual environment. Unlike T9-AT-006 (visual adversarial examples), which targets classification, MIP specifically targets action-oriented systems (computer-use agents, autonomous vehicles, robots), where a misinterpretation leads directly to a harmful action. A patch that makes a DELETE button read as SAVE to a GUI agent can cause data loss; a patch that hides a stop sign from an autonomous vehicle can cause a collision.

Detection
  • Adversarial patch detection in visual input: Pre-screen visual inputs for adversarial patches before agent action
  • Visual consistency verification: Compare vision model interpretation against a second model or deterministic check
  • Physical-world patch detection: Detect known adversarial patch patterns in camera feeds
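
As an illustration of the visual consistency check listed above, the following is a minimal sketch, assuming a GUI agent that can read each element's label both from pixels (the vision model) and from a deterministic source such as an accessibility tree. The `ElementReading` schema, the field names, and the sample data are hypothetical and not part of this technique entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementReading:
    """One UI element as seen by two independent sources (hypothetical schema)."""
    element_id: str
    vision_label: str      # label the vision model read from pixels
    reference_label: str   # label from a deterministic source, e.g. an accessibility tree


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(label.lower().split())


def find_inconsistencies(readings: list[ElementReading]) -> list[ElementReading]:
    """Return every element whose vision label disagrees with the reference label."""
    return [
        r for r in readings
        if normalize(r.vision_label) != normalize(r.reference_label)
    ]


# Illustrative data: the second button looks like SAVE to the vision model,
# but the deterministic source reports Delete, which is what a patch might cause.
readings = [
    ElementReading("btn-1", "Save", "save"),
    ElementReading("btn-2", "SAVE", "Delete"),
]
suspicious = find_inconsistencies(readings)
for r in suspicious:
    print(f"{r.element_id}: vision={r.vision_label!r} reference={r.reference_label!r}")
```

A mismatch does not prove that a patch is present; it only signals that the pixel-level reading should not be trusted for that element until the disagreement is resolved.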
Mitigation
  • Dual-model visual verification for actions (HIGH)
  • Action confirmation for irreversible operations (HIGH)
  • Adversarial training on agent-specific patches (MEDIUM)
  • Input preprocessing defense (MEDIUM)
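
The two HIGH-rated mitigations can be combined into a single gate in front of the agent's action executor. The sketch below is one possible arrangement, assuming two independent vision models each propose an action name and that a confirmation callback (for example, a human prompt) is available. The `IRREVERSIBLE_ACTIONS` set, the `gate_action` function, and the action names are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical set of actions that cannot be undone once executed.
IRREVERSIBLE_ACTIONS = {"delete", "format_disk", "send_payment"}


def gate_action(
    primary_action: str,
    secondary_action: str,
    confirm: Optional[Callable[[str], bool]] = None,
) -> str:
    """Return "execute" or "block" for a proposed agent action.

    primary_action / secondary_action: actions proposed by two independent
    vision models looking at the same screen or camera frame.
    confirm: optional callback that asks for explicit confirmation.
    """
    # Dual-model verification: any disagreement blocks the action.
    if primary_action != secondary_action:
        return "block"
    # Irreversible operations also need explicit confirmation.
    if primary_action in IRREVERSIBLE_ACTIONS:
        if confirm is None or not confirm(primary_action):
            return "block"
    return "execute"


print(gate_action("save", "save"))
print(gate_action("save", "delete"))
print(gate_action("delete", "delete"))
print(gate_action("delete", "delete", confirm=lambda action: True))
```

The gate fails closed: disagreement between the models, or a missing confirmation for an irreversible action, results in a block. A patch that fools both models in the same way would still pass the first check, which is why the confirmation step is applied independently of model agreement.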
Chaining

MIP chains into T11 (Agentic Exploitation) as the primary physical-world attack vector against vision-based agents. It also chains into T9-AT-006 (Visual Adversarial) as the action-layer expression of adversarial perturbation.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051.001