T10-AT-009 · MEDIUM

Data Poisoning Detection Bypass

T10 · Integrity & Confidentiality Breach
Risk score: 195
Rating: Medium
Procedures: 10
Mechanism

Data poisoning detection bypass targets the defenses themselves: the statistical tests, anomaly detectors, and quality filters that guard training pipelines. The attacker's goal is to inject poisoned data that survives validation while still achieving the attack objective (backdoor insertion, behavior modification, or targeted misclassification). The fundamental vulnerability is that detection systems rely on distributional assumptions about clean data. Poisoned samples that preserve the statistical properties a detector checks (mean, variance, per-feature distributions) while embedding an adversarial signal are effectively invisible to that detector.
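
To make the mechanism concrete, the following is a toy sketch, not a real detector. It assumes a two-feature dataset in which clean samples satisfy x2 ≈ x1. The function names (`per_feature_flag`, `joint_flag`), the data, and the threshold of 3.0 are all hypothetical. A per-feature z-score test accepts a sample whose individual values look normal, while a check on the relationship between the features flags it.

```python
from statistics import mean, stdev

def zscore(value, reference):
    """Absolute z-score of `value` against a reference sample."""
    return abs(value - mean(reference)) / stdev(reference)

def per_feature_flag(clean, sample, threshold=3.0):
    """Flag the sample if any single feature is a marginal outlier."""
    columns = list(zip(*clean))
    return any(zscore(v, col) > threshold for v, col in zip(sample, columns))

def joint_flag(clean, sample, threshold=3.0):
    """Flag the sample if x2 - x1 departs from the pattern seen in clean data."""
    residuals = [x2 - x1 for x1, x2 in clean]
    return zscore(sample[1] - sample[0], residuals) > threshold

# Hypothetical clean data: the two features move together (x2 is close to x1).
CLEAN = [(-2.0, -1.9), (-1.0, -1.1), (0.0, 0.1), (1.0, 0.9),
         (2.0, 2.1), (-1.5, -1.4), (1.5, 1.6), (0.5, 0.4)]

POISON = (1.5, -1.5)   # each value is ordinary on its own, the pair is not
BENIGN = (1.0, 1.1)

print(per_feature_flag(CLEAN, POISON))  # False: marginal statistics look clean
print(joint_flag(CLEAN, POISON))        # True: the feature relationship is broken
print(joint_flag(CLEAN, BENIGN))        # False
```

The same logic cuts both ways: an attacker who knows which joint statistics are checked can preserve those too, which is why a single statistical test is a weak defense.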

Detection
  • Multi-layer validation: combine statistical tests with semantic analysis and provenance verification
  • Training-time spectral analysis: detect poisoning signatures in the model's learned representation space
  • Holdout validation with trigger-pattern scanning
  • Data fingerprinting and source-integrity verification for all training data pipelines
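
As one illustration of the data fingerprinting item above, here is a minimal sketch that hashes each training record and compares it against a manifest captured when the data was last trusted. The record layout (a dict with an `id` key) and the function names `fingerprint`, `build_manifest`, and `verify` are assumptions made for this example, not part of any particular pipeline.

```python
import hashlib
import json

def fingerprint(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_manifest(records) -> dict:
    """Map record id -> fingerprint for a trusted snapshot."""
    return {r["id"]: fingerprint(r) for r in records}

def verify(records, manifest):
    """Return (changed_or_unknown_ids, missing_ids) relative to the manifest."""
    seen = set()
    suspect = []
    for r in records:
        rid = r["id"]
        seen.add(rid)
        # Unknown ids have no manifest entry, so they are reported as suspect too.
        if manifest.get(rid) != fingerprint(r):
            suspect.append(rid)
    missing = sorted(set(manifest) - seen)
    return suspect, missing

# Hypothetical usage: a flipped label and an injected record are both reported.
trusted = [{"id": "a", "text": "good movie", "label": 1},
           {"id": "b", "text": "bad movie", "label": 0}]
manifest = build_manifest(trusted)
incoming = [{"id": "a", "text": "good movie", "label": 0},
            {"id": "c", "text": "new sample", "label": 1}]
print(verify(incoming, manifest))  # (['a', 'c'], ['b'])
```

A check like this only catches changes made after the manifest was built. Data that was already poisoned at the source will fingerprint consistently, so it complements the other detection layers rather than replacing them.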
Mitigation
Certified training defenses (DPA, randomized smoothing) · MEDIUM
Multi-source data provenance tracking · HIGH
Activation clustering for backdoor detection · MEDIUM
Robust aggregation (trimmed mean, Krum) · MEDIUM
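
The robust aggregation entry can be sketched as follows, assuming a setting where several contributors each submit an update vector (for example, gradients in distributed or federated training) and at most `f` of them are malicious. The function names and the plain-list vector representation are choices made for this example. Krum is written here in its commonly described form: it scores each update by the summed squared distance to its n - f - 2 nearest neighbours, returns the lowest-scoring update, and requires n ≥ 2f + 3.

```python
from math import inf
from typing import List, Sequence

Vector = Sequence[float]

def trimmed_mean(updates: List[Vector], trim: int) -> List[float]:
    """Coordinate-wise mean after dropping the `trim` lowest and highest values."""
    n = len(updates)
    if 2 * trim >= n:
        raise ValueError("trim removes every value")
    result = []
    for coord in zip(*updates):
        kept = sorted(coord)[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result

def krum(updates: List[Vector], f: int) -> List[float]:
    """Return the single update closest to its n - f - 2 nearest neighbours."""
    n = len(updates)
    if n < 2 * f + 3:
        raise ValueError("Krum needs n >= 2f + 3 updates")
    k = n - f - 2

    def sqdist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best_idx, best_score = -1, inf
    for i, u in enumerate(updates):
        dists = sorted(sqdist(u, v) for j, v in enumerate(updates) if j != i)
        score = sum(dists[:k])
        if score < best_score:
            best_idx, best_score = i, score
    return list(updates[best_idx])

# Hypothetical updates: four honest contributors and one extreme outlier.
UPDATES = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.1], [100.0, -100.0]]
print(trimmed_mean(UPDATES, trim=1))
print(krum(UPDATES, f=1))  # [1.0, 1.1]
```

Both aggregators bound the influence of a minority of extreme updates. Neither helps against poisoned updates that stay close to the honest ones, which is the bypass scenario this technique describes.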
Chaining

Successful poisoning bypass enables persistent backdoors that can be triggered via T1 (Prompt Subversion). If the attacker also covers their tracks with T10-AT-013 (Audit Log Manipulation), the resulting compromise of model integrity becomes much harder to detect after the fact.

Framework mapping
OWASP LLM: LLM04
MITRE ATLAS: AML.T0020