T6-AT-005 HIGH

Synthetic Data Poisoning

Risk score: 235

Rating: High

Procedures: 10

Severity

Mechanism

Synthetic data now accounts for 10–30% of modern LLM training pipelines (SQ Magazine 2026) and is used across pre-training, supervised fine-tuning, RLHF, and model distillation. The Virus Infection Attack (VIA, Liang et al. 2025) demonstrated a critical vulnerability: poison injected into an upstream model propagates through synthetic data generation to downstream models trained on that model's outputs.

Detection

Synthetic data statistical profiling: compare the distribution of synthetic training data against known-good baselines for perplexity, token frequency, and topic distribution
Cross-generation consistency checks: verify that models trained on synthetic data from the same source converge to similar behaviors across independent runs
Provenance verification with cryptographic signing: require generator model attestation for all synthetic data
Recursive amplification monitoring: track poison-associated patterns across training generations
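
The statistical profiling check above can be sketched for the token-frequency signal alone. This is a minimal illustration, not a production detector: it assumes whitespace tokenization, uses Jensen-Shannon divergence as the distance measure (one reasonable choice among several), and the function names and the 0.2 threshold are hypothetical values that would need calibration against real known-good data.

```python
import math
from collections import Counter


def token_distribution(texts):
    """Relative frequency of lowercase whitespace tokens across all texts."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 = identical, 1 = disjoint."""
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}

    def kl_to_mixture(a):
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def flag_batch(baseline_texts, batch_texts, threshold=0.2):
    """Return (score, flagged). The threshold is an assumed placeholder."""
    score = js_divergence(
        token_distribution(baseline_texts),
        token_distribution(batch_texts),
    )
    return score, score > threshold
```

A flagged batch is only a prompt for review: a legitimate topic shift also raises the score, so this signal would normally be combined with the perplexity and topic-distribution checks listed above.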

Mitigation

Cryptographic provenance for all synthetic data [HIGH]

Independent validation of upstream generator models [HIGH]

Synthetic data diversity enforcement (multiple generators) [MEDIUM]

Recursive training cycle monitoring [MEDIUM]
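
One way to picture cryptographic provenance for synthetic records is the sketch below. It is an assumption-laden illustration: it uses an HMAC with a shared secret per generator as a stand-in for true generator attestation, which in practice would more likely rely on asymmetric signatures so that verifiers cannot forge tags. The record layout, `sign_record`, `verify_record`, and the `trusted_keys` mapping are all hypothetical.

```python
import hashlib
import hmac
import json


def _payload(generator_id, record):
    # Canonical JSON so signer and verifier hash identical bytes.
    body = {"generator_id": generator_id, "record": record}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def sign_record(record, generator_id, key):
    """Attach an HMAC-SHA256 tag binding the record to its generator."""
    tag = hmac.new(key, _payload(generator_id, record), hashlib.sha256).hexdigest()
    return {"generator_id": generator_id, "record": record, "tag": tag}


def verify_record(signed, trusted_keys):
    """Accept only records whose tag matches a trusted generator's key."""
    generator_id = signed.get("generator_id")
    key = trusted_keys.get(generator_id)
    if key is None:
        return False  # unknown or untrusted generator
    expected = hmac.new(
        key, _payload(generator_id, signed.get("record")), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking tag prefixes via timing.
    return hmac.compare_digest(expected, str(signed.get("tag", "")))
```

A training pipeline built this way would drop any record that fails `verify_record` before it reaches the dataset. Note that provenance only proves which generator produced a record; it does not show that the generator itself is clean, which is why independent validation of upstream models is listed separately.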

Chaining

Synthetic data poisoning chains directly to T6-AT-002 (Dataset Contamination) as a delivery mechanism, and to T6-AT-010 (Knowledge Distillation Attacks) since distillation is a primary synthetic data consumer. Recursive amplification (T6-AP-005D) can chain to T6-AT-001 (Reward Hacking) when the model generates its own reward signal in self-play.
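
Recursive amplification monitoring could look roughly like the following sketch, which tracks how often a suspected poison pattern appears in the outputs of successive training generations. It assumes the pattern is already known and expressible as a regular expression, which is often not the case in practice; the function names and the 1.5x growth factor are hypothetical.

```python
import re


def pattern_rate(samples, pattern):
    """Fraction of samples containing the suspected pattern."""
    if not samples:
        return 0.0
    rx = re.compile(pattern)
    return sum(1 for s in samples if rx.search(s)) / len(samples)


def amplification_report(generations, pattern, growth_factor=1.5):
    """Return per-generation rates and the indices where the rate jumped.

    `generations` is a list of sample lists, ordered oldest to newest.
    A generation is flagged when the pattern newly appears or its rate
    grows by at least `growth_factor` relative to the previous one.
    """
    rates = [pattern_rate(g, pattern) for g in generations]
    alerts = []
    for i in range(1, len(rates)):
        prev, cur = rates[i - 1], rates[i]
        if cur > 0 and (prev == 0 or cur / prev >= growth_factor):
            alerts.append(i)
    return rates, alerts
```

A rate that rises from one generation to the next suggests the pattern is being reinforced through self-generated data rather than fading out, which is the situation the chaining note above describes.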
