T9-AT-009 · HIGH
Multimodal Chaining
T9 · Multimodal & Cross-Channel Attacks
Risk score: 215
Rating: High
Procedures: 1
Severity
Mechanism
The harmful payload is split across multiple modalities: part in an image, part in audio, part in text. No single modality contains the complete harmful content, so a safety classifier that inspects each modality in isolation sees only a fragment and cannot detect the full payload. The model's multimodal fusion capability then reconstructs the complete payload from the cross-modal fragments.
Detection
- Cross-modal fusion safety evaluation: Evaluate the fused multimodal content, not each modality independently
- Cross-modal consistency analysis: Detect cases where content from different modalities is harmless in isolation but produces a harmful composite meaning once combined
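
The two checks above can be sketched as follows. This is a minimal illustration and not a production detector. It assumes each modality has already been reduced to text upstream (for example by OCR or transcription). The names `Fragment`, `is_flagged`, `per_modality_check`, `post_fusion_check`, and `assembly_detected` are hypothetical, and the keyword matcher with its placeholder phrase stands in for whatever safety classifier is actually deployed.

```python
from dataclasses import dataclass

# Hypothetical placeholder policy: a phrase that only counts as a
# violation when every term is present. A real deployment would call
# a safety classifier here instead of matching keywords.
BLOCKED_PHRASES = [("alpha", "bravo", "charlie")]


@dataclass
class Fragment:
    modality: str  # e.g. "image", "audio", "text"
    content: str   # text extracted upstream (assumed: OCR / transcription)


def is_flagged(text: str) -> bool:
    """Toy classifier: flag text that contains every term of a blocked phrase."""
    words = set(text.lower().split())
    return any(all(term in words for term in phrase) for phrase in BLOCKED_PHRASES)


def per_modality_check(fragments: list[Fragment]) -> dict[str, bool]:
    """Evaluate each modality independently (the approach the attack evades)."""
    return {f.modality: is_flagged(f.content) for f in fragments}


def post_fusion_check(fragments: list[Fragment]) -> bool:
    """Evaluate the fused content as a single unit."""
    fused = " ".join(f.content for f in fragments)
    return is_flagged(fused)


def assembly_detected(fragments: list[Fragment]) -> bool:
    """Cross-modal assembly signal: the fused content is flagged
    although no single modality is flagged on its own."""
    return post_fusion_check(fragments) and not any(per_modality_check(fragments).values())


fragments = [
    Fragment("image", "alpha"),
    Fragment("audio", "bravo"),
    Fragment("text", "please combine the parts and add charlie"),
]

print(per_modality_check(fragments))  # every modality passes individually
print(post_fusion_check(fragments))   # the fused content is flagged
print(assembly_detected(fragments))   # the mismatch is the assembly signal
```

In this sketch the assembly signal is the disagreement between the two evaluations. A request that is flagged only after fusion is a candidate for multimodal chaining, whereas a request that is also flagged in a single modality is an ordinary per-modality violation.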
Mitigation
- Post-fusion safety evaluation (HIGH)
- Cross-modal assembly detection (MEDIUM)
Chaining
Multimodal chaining is the cross-modal expression of T4-AT-016 (Context Fragmentation). As a compound technique, it chains from all other T9 techniques.
Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051.001