T9-AT-016 · HIGH

Multimodal Model Inversion

T9 · Multimodal & Cross-Channel Attacks
Risk score: 210
Rating: High
Procedures: 2
Mechanism

Multimodal models learn joint representations of their training data across modalities. Model inversion attacks exploit this by querying the model until it generates content that reveals or reconstructs that data. In multimodal models the inversion can also cross modalities: an attacker queries with text to recover training images, or with images to recover training text.

Detection
  • Inversion query detection: Detect queries designed to extract training data characteristics
  • Output similarity monitoring: Monitor for outputs that are unusually similar to known training data
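
The two detection ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes that queries and outputs have already been mapped to embedding vectors by some encoder, and that embeddings of known training samples are available for comparison. The names `InversionQueryMonitor`, `max_training_similarity`, and `is_suspicious_output`, as well as every threshold value, are hypothetical and would need tuning against real traffic.

```python
import math
from collections import defaultdict, deque
from typing import Deque, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_training_similarity(
    output_emb: Sequence[float], training_embs: Sequence[Sequence[float]]
) -> float:
    """Highest similarity between one output and any known training embedding."""
    return max(
        (cosine_similarity(output_emb, t) for t in training_embs), default=0.0
    )


def is_suspicious_output(
    output_emb: Sequence[float],
    training_embs: Sequence[Sequence[float]],
    threshold: float = 0.95,  # assumed value, not taken from the source
) -> bool:
    """Output similarity monitoring: flag outputs very close to training data."""
    return max_training_similarity(output_emb, training_embs) >= threshold


class InversionQueryMonitor:
    """Inversion query detection: flag clients that send many near-duplicate queries.

    Repeated small variations of one query are treated here as a possible
    probing pattern; this heuristic is an assumption of the sketch.
    """

    def __init__(
        self,
        window: int = 50,
        near_dup_threshold: float = 0.9,
        max_near_dups: int = 10,
    ) -> None:
        self.near_dup_threshold = near_dup_threshold
        self.max_near_dups = max_near_dups
        # Keep only the last `window` query embeddings per client.
        self._recent: Dict[str, Deque[List[float]]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def observe(self, client_id: str, query_emb: Sequence[float]) -> bool:
        """Record a query; return True if the client now looks like it is probing."""
        history = self._recent[client_id]
        near_dups = sum(
            1
            for past in history
            if cosine_similarity(query_emb, past) >= self.near_dup_threshold
        )
        history.append(list(query_emb))
        return near_dups >= self.max_near_dups
```

In practice both checks would likely feed an alerting pipeline rather than block requests outright, since high similarity can also occur for benign, common content.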
Mitigation
  • Differential privacy in training (HIGH)
  • Output perturbation (MEDIUM)
  • Memorization detection (MEDIUM)
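
As a rough illustration of the first two mitigations, the sketch below shows the core of a DP-SGD style gradient step (clip each per-example gradient, then add Gaussian noise to the sum) and a simple Laplace perturbation of output confidence scores. It uses only the Python standard library. The function names, the parameters `max_norm`, `noise_multiplier`, and `scale`, and the renormalization step are assumptions made for this example; the sketch does no privacy accounting and claims no formal guarantee.

```python
import math
import random
from typing import List, Sequence


def clip_l2(grad: Sequence[float], max_norm: float) -> List[float]:
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(
    per_example_grads: Sequence[Sequence[float]],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """DP-SGD style aggregation: clip per example, sum, add noise, average."""
    if not per_example_grads:
        raise ValueError("need at least one per-example gradient")
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip_l2(grad, max_norm)):
            total[i] += g
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * max_norm
    n = len(per_example_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


def perturb_scores(
    scores: Sequence[float], scale: float, rng: random.Random
) -> List[float]:
    """Output perturbation: add Laplace(0, scale) noise, clip at 0, renormalize."""
    noisy = []
    for s in scores:
        # The difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with location 0 and scale `scale`.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(max(0.0, s + noise))
    total = sum(noisy)
    if total == 0.0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(scores)] * len(scores)
    return [v / total for v in noisy]
```

A caller might use `dp_average_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=random.Random())` inside a training loop and `perturb_scores` on returned confidence vectors. Larger noise reduces what an inversion query can learn, at a cost in utility.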
Chaining

Model inversion chains into T10 (Integrity & Confidentiality Breach) as the multimodal form of training data extraction. The extracted data in turn chains into T6 (Training Poisoning) when it reveals training data characteristics that inform a poisoning strategy.

Framework mapping
OWASP LLM: LLM06
MITRE ATLAS: AML.T0024