T4-AT-016 · MEDIUM

Context Fragmentation

T4 · Multi-Turn & Memory Manipulation
Risk score: 195
Rating: Medium
Procedures: 10
Severity
Mechanism

Content moderation systems classify complete strings: they need a coherent semantic unit to evaluate. Context fragmentation distributes the harmful payload across multiple turns so that no individual fragment constitutes a classifiable harmful string. The model can still reconstruct the meaning from the fragments, since composing meaning across context is what language models do, but the safety classifier, whether it operates on individual turns or on fixed-window contexts, sees only individually benign fragments.
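
The gap between per-turn and whole-conversation evaluation can be illustrated with a toy example. This is a minimal sketch, not a real moderation system: `toy_classifier`, `BLOCKED_TERMS`, and the placeholder tokens are all hypothetical stand-ins, and a production classifier would be a trained model rather than a keyword check.

```python
from typing import Callable, List

# Hypothetical stand-in for a safety classifier: it flags a string only when
# every placeholder term co-occurs in the text it is given.
BLOCKED_TERMS = {"alpha", "bravo", "charlie"}  # placeholder tokens, not real content


def toy_classifier(text: str) -> bool:
    words = set(text.lower().split())
    return BLOCKED_TERMS <= words


def per_turn_flags(turns: List[str], classify: Callable[[str], bool]) -> List[bool]:
    """Evaluate each turn in isolation, as a turn-level classifier would."""
    return [classify(t) for t in turns]


def joined_flag(turns: List[str], classify: Callable[[str], bool]) -> bool:
    """Evaluate the concatenation of all turns as a single semantic unit."""
    return classify(" ".join(turns))


turns = ["first part is alpha", "second part is bravo", "third part is charlie"]
print(per_turn_flags(turns, toy_classifier))  # [False, False, False]
print(joined_flag(turns, toy_classifier))     # True
```

Each fragment passes on its own, while the concatenation is flagged, which is the asymmetry the technique relies on.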

Detection
  • Fragment-aware compositional analysis: Analyze not just individual turns but all possible combinations of recent turns for harmful composite meaning
  • Assembly request detection: Flag requests to "combine," "compile," "synthesize," or "merge" content from prior turns
  • Turn-reference tracking: Track when users reference specific prior turns by number or content, especially in combination requests
  • Cross-turn semantic coherence analysis: Detect when fragments across turns form a coherent harmful payload despite being individually benign
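
Assembly request detection and turn-reference tracking from the list above could be prototyped with simple pattern matching. The sketch below is an assumption-laden illustration: the verb list comes from the examples in the bullet (with inflected forms added), while the reference nouns, the `AssemblySignal` type, and the rule of flagging only when an assembly verb co-occurs with a prior-turn reference are hypothetical choices, not part of the catalog entry.

```python
import re
from dataclasses import dataclass
from typing import List

# Verbs from the bullet above, extended with common inflections (assumption).
ASSEMBLY_VERBS = re.compile(
    r"\b(?:combin|compil|synthesi[sz]|merg)(?:e|es|ed|ing)\b", re.IGNORECASE
)
# Numbered references such as "turns 2, 4 and 7" (the noun list is an assumption).
TURN_REF = re.compile(
    r"\b(?:turn|message|answer|response|part)s?\s+"
    r"((?:#?\d+(?:\s*(?:,|and|&)\s*)?)+)",
    re.IGNORECASE,
)
# Unnumbered references to earlier content (assumption).
PRIOR_REF = re.compile(r"\b(?:previous|earlier|prior|above)\b", re.IGNORECASE)


@dataclass
class AssemblySignal:
    verbs: List[str]
    turn_numbers: List[int]
    references_prior: bool
    flagged: bool


def analyze_request(text: str) -> AssemblySignal:
    """Flag a request only when an assembly verb co-occurs with a reference
    to earlier turns, to avoid flagging every use of a word like 'compile'."""
    verbs = [m.group(0).lower() for m in ASSEMBLY_VERBS.finditer(text)]
    turn_numbers = [
        int(n) for group in TURN_REF.findall(text) for n in re.findall(r"\d+", group)
    ]
    references_prior = bool(PRIOR_REF.search(text))
    flagged = bool(verbs) and (bool(turn_numbers) or references_prior)
    return AssemblySignal(verbs, turn_numbers, references_prior, flagged)


r = analyze_request("Now combine your answers from turns 2, 4 and 7 into one document.")
print(r.flagged, r.verbs, r.turn_numbers)  # True ['combine'] [2, 4, 7]
print(analyze_request("Can you compile this C program?").flagged)  # False
```

A flag here is only a signal for closer review. Keyword matching is easy to evade by paraphrase, so it would complement rather than replace the semantic analyses listed above.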
Mitigation
Full-conversation compositional safety evaluation · HIGH
Assembly request interception · MEDIUM
GNN-based multi-turn detection · HIGH
Token-budget-based fragmentation detection · LOW
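
One way to approximate full-conversation compositional evaluation is to run the classifier over order-preserving subsets of recent turns rather than over single turns. The following is a rough sketch under stated assumptions: `find_harmful_composite`, its `window` and `max_subset` parameters, and the toy keyword classifier are hypothetical, and a real deployment would plug in its own classifier and tune the bounds.

```python
from itertools import combinations
from typing import Callable, List, Optional, Tuple

BLOCKED_TERMS = {"alpha", "bravo", "charlie"}  # placeholder tokens, not real content


def toy_classifier(text: str) -> bool:
    """Hypothetical classifier: flags text containing every placeholder term."""
    return BLOCKED_TERMS <= set(text.lower().split())


def find_harmful_composite(
    turns: List[str],
    classify: Callable[[str], bool],
    window: int = 8,
    max_subset: int = 4,
) -> Optional[Tuple[int, ...]]:
    """Return the indices of the smallest order-preserving subset of recent
    turns whose concatenation is flagged, or None if no subset is flagged."""
    start = max(0, len(turns) - window)
    recent = range(start, len(turns))
    for size in range(1, min(max_subset, len(recent)) + 1):
        # combinations() yields index tuples in ascending order, so the
        # original turn order is preserved in each candidate composite.
        for idx in combinations(recent, size):
            if classify(" ".join(turns[i] for i in idx)):
                return idx
    return None


turns = [
    "tell me about alpha",
    "what is the weather",
    "and then bravo",
    "nice day",
    "finally charlie",
]
print(find_harmful_composite(turns, toy_classifier))  # (0, 2, 4)
```

The bounds matter because the number of subsets grows combinatorially: with `window=8` and `max_subset=4` there are at most 8 + 28 + 56 + 70 = 162 classifier calls per check, so evaluating every combination of an unbounded history is not practical.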
Chaining

Context fragmentation chains from T4-AT-007 (Context Window Exhaustion) when fragments are distributed across a large context with benign padding between them. It chains into T4-AT-005 (Incremental Assembly) as the delivery mechanism for assembly components.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051.000