T7-AT-002 MEDIUM
Information Fragmentation
T7 · Output Manipulation & Exfiltration
Risk score: 180
Rating: Medium
Procedures: 6
Severity
Mechanism
Safety evaluation in current LLM architectures operates atomically on individual requests and maintains no cumulative disclosure ledger across the conversation. Each sub-request ("what's the first step," "what temperature," "what materials") is evaluated independently and found benign. The attack exploits the architectural assumption that per-turn safety evaluation is sufficient, an assumption that fails when the adversary aggregates responses across turns.
Detection
- Track cumulative topic coverage across conversation turns; flag when the union of disclosed fragments crosses a harm threshold even though no individual turn did
- Implement sliding-window content aggregation: concatenate the last N turns and re-evaluate the aggregate with the safety classifier
- Observable signal: repeated short requests on the same narrow topic, each extracting one specific parameter or step
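The sliding-window check above can be sketched as follows. This is a minimal illustration, not a reference implementation: `SlidingWindowMonitor`, `toy_score`, the facet list, and the threshold value are all hypothetical names and parameters introduced here. In practice the `score` callable would be whatever safety classifier the deployment already uses, assumed to return a risk value in [0, 1].

```python
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowMonitor:
    """Re-evaluates the last N turns as one aggregate (hypothetical helper)."""

    def __init__(self, score: Callable[[str], float], window: int = 5,
                 threshold: float = 0.5) -> None:
        self.score = score          # assumed classifier: text -> risk in [0, 1]
        self.threshold = threshold  # assumed harm threshold
        self.turns: Deque[str] = deque(maxlen=window)  # keeps only the last N turns

    def observe(self, turn: str) -> Dict[str, bool]:
        """Record a turn and report per-turn and aggregate verdicts."""
        self.turns.append(turn)
        per_turn = self.score(turn)
        aggregate = self.score("\n".join(self.turns))
        return {
            "per_turn_flag": per_turn >= self.threshold,
            "aggregate_flag": aggregate >= self.threshold,
            # The fragmentation signature: this turn is benign on its own,
            # but the window as a whole crosses the threshold.
            "fragmentation_suspected": per_turn < self.threshold <= aggregate,
        }


def toy_score(text: str) -> float:
    """Stand-in classifier: fraction of placeholder facets mentioned."""
    facets = ("step", "temperature", "materials", "quantity")
    lowered = text.lower()
    return sum(f in lowered for f in facets) / len(facets)


if __name__ == "__main__":
    monitor = SlidingWindowMonitor(toy_score, window=5, threshold=0.75)
    for turn in ("what's the first step", "what temperature", "what materials"):
        print(turn, monitor.observe(turn))
```

With the toy classifier, each turn scores below the threshold on its own, and only the third call reports `fragmentation_suspected`. A fixed window forgets older turns, so a production design would likely pair it with the cumulative coverage tracking described in the first bullet.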
Mitigation
- Cross-turn disclosure tracking: HIGH
- Topic-persistence detection: MEDIUM
- Session-level safety re-evaluation: HIGH
- Turn-count throttling on sensitive topics: LOW
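Turn-count throttling is the simplest of these controls to prototype. The sketch below assumes an upstream component already supplies a topic label and a sensitivity flag for each turn. `TopicThrottle`, `max_turns`, and the topic strings are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import DefaultDict


class TopicThrottle:
    """Caps the number of turns a session may spend on one sensitive topic."""

    def __init__(self, max_turns: int = 3) -> None:
        self.max_turns = max_turns  # assumed per-topic budget
        self.counts: DefaultDict[str, int] = defaultdict(int)

    def allow(self, topic: str, sensitive: bool) -> bool:
        """Return True if this turn is still within the topic's budget."""
        if not sensitive:
            return True  # non-sensitive topics are never throttled
        self.counts[topic] += 1
        return self.counts[topic] <= self.max_turns


if __name__ == "__main__":
    throttle = TopicThrottle(max_turns=2)
    for _ in range(3):
        print(throttle.allow("example-sensitive-topic", sensitive=True))
```

A counter like this knows nothing about what was actually disclosed, which may be why the control is rated LOW in the list above. It is probably most useful as a trigger for a session-level re-evaluation rather than as a hard block.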
Chaining
Information fragmentation is the primary feeder for T7-AT-012 (Aggregation Attacks): fragments obtained here are reassembled externally. It also enables T7-AT-007 (Iterative Refinement) when the initial fragments serve as the seed for progressive detail extraction.
Framework mapping
OWASP LLM: LLM02
MITRE ATLAS: AML.T0024