T9-AT-017 · HIGH

Malicious Image Patches (MIP)

T9 · Multimodal & Cross-Channel Attacks

Risk score: 248

Rating: High

Procedures: 10

Severity

Mechanism

Malicious image patches (MIP) are small adversarial visual regions that, when placed within the field of view of a multimodal model or OS agent, cause the model to misinterpret its visual environment. Unlike T9-AT-006 (visual adversarial examples), which targets classification, MIP specifically targets action-oriented models (computer-use agents, autonomous vehicles, robots), where misinterpretation leads directly to harmful actions. For example, a patch that makes a DELETE button read as SAVE to a GUI agent can cause data loss, and a patch that hides a stop sign from an autonomous vehicle can cause a collision.

Detection

Adversarial patch detection in visual input: Pre-screen visual inputs for adversarial patches before agent action
Visual consistency verification: Compare vision model interpretation against a second model or deterministic check
Physical-world patch detection: Detect known adversarial patch patterns in camera feeds
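
As an illustration of the pre-screening idea above, here is a minimal sketch of one naive heuristic: flag image windows whose local gradient energy is a statistical outlier, since printed or rendered patches are often much noisier than the surrounding interface or scene. This is an assumption made for illustration, not a method described in this entry, and it would not stop a patch optimized to look smooth. The function names, the 2D list-of-integers grayscale format (0 to 255), the window size, and the z-score threshold are all hypothetical.

```python
from statistics import mean, pstdev


def window_energy(img, top, left, size):
    """Sum of absolute differences between adjacent pixels inside one window."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            if c + 1 < left + size:  # horizontal neighbour inside the window
                total += abs(img[r][c] - img[r][c + 1])
            if r + 1 < top + size:  # vertical neighbour inside the window
                total += abs(img[r][c] - img[r + 1][c])
    return total


def flag_suspicious_windows(img, size=4, z_threshold=3.0):
    """Return (top, left) of non-overlapping windows with outlier gradient energy."""
    rows, cols = len(img), len(img[0])
    energies = {}
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            energies[(top, left)] = window_energy(img, top, left, size)
    values = list(energies.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # perfectly uniform image: nothing stands out
        return []
    return [pos for pos, e in energies.items() if (e - mu) / sigma > z_threshold]


def demo_image():
    """16x16 flat gray image with a 4x4 black/white checkerboard at row 8, col 4."""
    img = [[128] * 16 for _ in range(16)]
    for r in range(8, 12):
        for c in range(4, 8):
            img[r][c] = 255 if (r + c) % 2 == 0 else 0
    return img


if __name__ == "__main__":
    print(flag_suspicious_windows(demo_image()))
```

A z-score over windows only works when the image contains enough windows for one outlier to stand out, so very small inputs would need a different statistic. Flagged regions would typically be routed to a second check rather than blocked outright.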

Mitigation

Dual-model visual verification for actions: HIGH

Action confirmation for irreversible operations: HIGH

Adversarial training on agent-specific patches: MEDIUM

Input preprocessing defense: MEDIUM
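
The first two mitigations can be combined into a single policy gate that runs before the agent acts. The sketch below is illustrative only: `ProposedAction`, `Decision`, `gate`, and the `IRREVERSIBLE_LABELS` set are hypothetical names, and the second reading is assumed to come from an independent vision model or a deterministic source such as an accessibility tree.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # proceed without interruption
    CONFIRM = "confirm"  # pause and ask a human to approve
    BLOCK = "block"      # refuse: the two readings disagree


# Hypothetical list of control labels treated as irreversible operations.
IRREVERSIBLE_LABELS = {"delete", "format disk", "send payment"}


@dataclass(frozen=True)
class ProposedAction:
    primary_label: str    # target control as read by the agent's vision model
    secondary_label: str  # same control as read by the independent source


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so cosmetic differences do not matter."""
    return " ".join(label.lower().split())


def gate(action: ProposedAction) -> Decision:
    """Block on disagreement, require confirmation for irreversible operations."""
    primary = normalize(action.primary_label)
    secondary = normalize(action.secondary_label)
    if primary != secondary:
        # A patch that fools one reader is unlikely to fool both the same way.
        return Decision.BLOCK
    if primary in IRREVERSIBLE_LABELS:
        return Decision.CONFIRM
    return Decision.ALLOW


if __name__ == "__main__":
    # The agent reads "Save", but the independent source reads "Delete".
    print(gate(ProposedAction("Save", "Delete")))
```

Blocking on any disagreement is a deliberately conservative choice here. A deployment might instead escalate mismatches to a human, at the cost of more interruptions.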

Chaining

MIP chains into T11 (Agentic Exploitation) as the primary physical-world attack vector against vision-based agents. It also chains into T9-AT-006 (Visual Adversarial) as the action-layer expression of adversarial perturbation.

Framework mapping

OWASP LLM: LLM01

MITRE ATLAS: AML.T0051.001
