Malicious Image Patches (MIP)
T9 · Multimodal & Cross-Channel Attacks
Malicious image patches (MIP) are small adversarial visual regions that, when placed within the field of view of a multimodal model or OS agent, cause the model to misinterpret its visual environment. Unlike T9-AT-006 (visual adversarial examples), which targets classification, MIP specifically targets action-oriented models (computer-use agents, autonomous vehicles, robots), where misinterpretation leads directly to harmful actions. A patch that makes a DELETE button read as SAVE to a GUI agent causes data loss; a patch that makes a stop sign invisible to an autonomous vehicle can cause a collision.
- Adversarial patch detection in visual input: Pre-screen visual inputs for adversarial patches before agent action
- Visual consistency verification: Compare vision model interpretation against a second model or deterministic check
- Physical-world patch detection: Detect known adversarial patch patterns in camera feeds
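The visual consistency check above can be sketched for the GUI-agent case as a small gate that runs before any click. This is a minimal illustration, not a definitive implementation: it assumes the agent can obtain a control's label from a deterministic source (for example, an accessibility tree) in addition to the vision model's reading. The names `ClickTarget`, `verify_click`, and `DESTRUCTIVE_LABELS` are hypothetical and do not come from the taxonomy.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical set of labels treated as destructive (an assumption, not from the source).
DESTRUCTIVE_LABELS = {"delete", "remove", "erase", "format"}


@dataclass(frozen=True)
class ClickTarget:
    vision_label: str   # what the vision model believes the control says
    trusted_label: str  # label from a deterministic source, e.g. an accessibility tree


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not trip the check."""
    return " ".join(label.lower().split())


def verify_click(target: ClickTarget) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed click on `target`."""
    seen = normalize(target.vision_label)
    trusted = normalize(target.trusted_label)

    # Without a deterministic label there is nothing to cross-check against.
    if not trusted:
        return False, "no trusted label available; escalate to a human"

    # A patch that makes DELETE read as SAVE shows up here as a disagreement.
    if seen != trusted:
        return False, f"mismatch: vision read {seen!r}, trusted source says {trusted!r}"

    # Even when both sources agree, destructive actions need explicit confirmation.
    if trusted in DESTRUCTIVE_LABELS:
        return False, f"{trusted!r} is destructive; require explicit confirmation"

    return True, "labels agree"


if __name__ == "__main__":
    # The vision model was fooled into reading SAVE on a DELETE button.
    print(verify_click(ClickTarget(vision_label="Save", trusted_label="Delete")))
```

The gate fails closed: a missing trusted label or any disagreement blocks the action instead of deferring to the vision model. It only helps where a deterministic label exists, so it does not cover physical-world settings such as camera feeds.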
MIP chains into T11 (Agentic Exploitation) as the primary physical-world attack vector against vision-based agents. It also chains into T9-AT-006 (Visual Adversarial) as the action-layer expression of adversarial perturbation.