T7-AT-002 · MEDIUM

Information Fragmentation

T7 · Output Manipulation & Exfiltration
Risk score: 180
Rating: Medium
Procedures: 6
Mechanism

Safety evaluation in current LLM architectures operates atomically on individual requests, without maintaining a cumulative disclosure ledger across the conversation. Each sub-request ("what's the first step," "what temperature," "what materials") is evaluated independently and found benign. The technique breaks the architectural assumption that per-turn safety is sufficient: that assumption fails once the adversary aggregates the answers across turns.
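A toy model makes the gap concrete. The sketch below is illustrative only: `SENSITIVE_TERMS`, `risk_score`, and `THRESHOLD` are hypothetical stand-ins for a real safety classifier and its decision boundary, and none of them appear in the technique entry.

```python
# Toy illustration: per-turn evaluation vs. cumulative evaluation.
# SENSITIVE_TERMS, risk_score, and THRESHOLD are hypothetical stand-ins
# for a real safety classifier and its decision boundary.
SENSITIVE_TERMS = {"step", "temperature", "materials"}
THRESHOLD = 3  # assumed: flag when three or more distinct terms are covered


def covered_terms(text: str) -> set[str]:
    """Return the sensitive terms mentioned in the text."""
    words = {w.strip("?.,!\"'").lower() for w in text.split()}
    return SENSITIVE_TERMS & words


def risk_score(text: str) -> int:
    """Score a text by how many distinct sensitive terms it covers."""
    return len(covered_terms(text))


turns = [
    "What's the first step?",
    "What temperature?",
    "What materials?",
]

# Each turn scored in isolation stays under the threshold...
per_turn_flags = [risk_score(t) >= THRESHOLD for t in turns]
# ...while the same turns scored together reach it.
cumulative_flag = risk_score(" ".join(turns)) >= THRESHOLD
```

Under these assumptions every entry of `per_turn_flags` is `False` while `cumulative_flag` is `True`, which is the per-turn blind spot described above.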

Detection
  • Track cumulative topic coverage across conversation turns; flag when the union of disclosed fragments crosses a harm threshold even though no individual turn did
  • Implement sliding-window content aggregation: concatenate the last N turns and re-evaluate the aggregate through the safety classifier
  • Observable signal: repeated short requests on the same narrow topic, each extracting one specific parameter or step
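The sliding-window check in the list above might be sketched as follows. This is a minimal sketch, not a reference implementation: `SlidingWindowEvaluator`, its parameters, and `toy_classifier` are hypothetical names, and the classifier is assumed to be any callable that maps text to a score in [0, 1].

```python
from collections import deque
from typing import Callable, Deque


class SlidingWindowEvaluator:
    """Re-evaluates the concatenation of the last N turns on every new turn."""

    def __init__(self, classifier: Callable[[str], float],
                 window: int = 5, threshold: float = 0.5) -> None:
        self.classifier = classifier      # assumed: text -> score in [0, 1]
        self.threshold = threshold
        self.turns: Deque[str] = deque(maxlen=window)  # oldest turns drop off

    def observe(self, turn: str) -> dict:
        self.turns.append(turn)
        single = self.classifier(turn)
        aggregate = self.classifier("\n".join(self.turns))
        return {
            "single_score": single,
            "aggregate_score": aggregate,
            # Fragmentation case: the window crosses the threshold
            # although the newest turn alone does not.
            "fragmentation_flag": aggregate >= self.threshold > single,
        }


def toy_classifier(text: str) -> float:
    """Hypothetical stand-in: fraction of watched terms present in the text."""
    terms = ("step", "temperature", "materials", "quantity")
    lowered = text.lower()
    return sum(term in lowered for term in terms) / len(terms)


evaluator = SlidingWindowEvaluator(toy_classifier, window=3, threshold=0.5)
results = [
    evaluator.observe(turn)
    for turn in ("what's the first step", "what temperature", "what materials")
]
```

With the toy classifier, the first turn is not flagged, while the second and third are flagged because the window score reaches the threshold even though each turn scores 0.25 on its own. The window size trades cost for recall: a small N misses fragments spread over many turns, which is one reason to pair this check with session-level tracking.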
Mitigation
Cross-turn disclosure tracking: HIGH
Topic-persistence detection: MEDIUM
Session-level safety re-evaluation: HIGH
Turn-count throttling on sensitive topics: LOW
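As one illustration of the simplest item in this list, turn-count throttling could look like the sketch below. It assumes an upstream component that labels each turn with a sensitive topic (or `None` when the turn is not sensitive); `TopicThrottle` and `max_turns_per_topic` are hypothetical names chosen for the example.

```python
from collections import Counter
from typing import Optional


class TopicThrottle:
    """Caps how many turns in one session may touch the same sensitive topic."""

    def __init__(self, max_turns_per_topic: int = 3) -> None:
        self.max_turns = max_turns_per_topic
        self.counts: Counter = Counter()  # topic label -> turns seen so far

    def allow(self, topic: Optional[str]) -> bool:
        """Return False once a sensitive topic exceeds its turn budget."""
        if topic is None:  # not a sensitive topic: never throttled
            return True
        self.counts[topic] += 1
        return self.counts[topic] <= self.max_turns


throttle = TopicThrottle(max_turns_per_topic=2)
# Topic labels are assumed to come from a separate topic classifier.
decisions = [throttle.allow(t) for t in ("topic-a", "topic-a", None, "topic-a")]
```

Here the third request on `topic-a` is refused while the unlabeled turn passes. A turn budget only counts requests and does not inspect what was disclosed, so it is probably best treated as a complement to the tracking and re-evaluation measures listed alongside it.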
Chaining

Information fragmentation is the primary feeder for T7-AT-012 (Aggregation Attacks) — fragments obtained here are reassembled externally. Also enables T7-AT-007 (Iterative Refinement) when initial fragments serve as the seed for progressive detail extraction.

Framework mapping
OWASP LLM: LLM02
MITRE ATLAS: AML.T0024