T10-AT-009 · MEDIUM

Data Poisoning Detection Bypass

T10 · Integrity & Confidentiality Breach

Risk score: 195

Rating: Medium

Procedures: 10


Mechanism

Data poisoning detection bypass targets the defenses themselves: the statistical tests, anomaly detectors, and quality filters that guard training pipelines. The attacker's goal is to inject poisoned data that survives validation while still achieving the attack objective (backdoor insertion, behavior modification, or targeted misclassification). The fundamental vulnerability is that detection systems rely on distributional assumptions about clean data. Poisoned samples that preserve the statistical properties a detector checks (mean, variance, feature distributions) while embedding adversarial signals are effectively invisible to that detector.
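The point about preserved statistics can be illustrated with a toy sketch in Python (standard library only). Everything here is a hypothetical illustration rather than part of any real tool: the two-feature dataset, the `marginal_check` detector, and the `pearson` helper are assumptions made for the example. The "poisoned" batch reuses exactly the clean feature values but re-pairs them by rank, so every per-feature mean and standard deviation is unchanged while the joint structure carries a strong signal.

```python
import random
import statistics


def marginal_check(batch, reference, tol=0.05):
    """Hypothetical detector: pass if each feature's mean and standard
    deviation stay within `tol` of the reference data."""
    for col in range(len(reference[0])):
        ref = [row[col] for row in reference]
        got = [row[col] for row in batch]
        if abs(statistics.mean(got) - statistics.mean(ref)) > tol:
            return False
        if abs(statistics.pstdev(got) - statistics.pstdev(ref)) > tol:
            return False
    return True


def pearson(batch):
    """Pearson correlation between feature 0 and feature 1."""
    xs = [row[0] for row in batch]
    ys = [row[1] for row in batch]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Clean data: two independent standard-normal features.
clean = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]

# "Poisoned" data: the same values per feature, re-paired by rank.
# Marginal statistics are untouched; only the joint structure changes.
poisoned = list(zip(sorted(r[0] for r in clean), sorted(r[1] for r in clean)))

passes_marginals = marginal_check(poisoned, clean)
clean_corr = pearson(clean)
poisoned_corr = pearson(poisoned)
```

In this sketch `passes_marginals` is true even though the correlation between the two features moves from near zero to near one. A second check on joint structure catches what the per-feature check misses, which is the motivation for layering several kinds of validation.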

Detection

Multi-layer validation: combine statistical tests with semantic analysis and provenance verification
Training-time spectral analysis: detect poisoning signatures in the model's learned representation space
Holdout validation with trigger-pattern scanning
Data fingerprinting and source-integrity verification for all training data pipelines
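As a rough illustration of the spectral-analysis item above, the sketch below scores each sample by its squared projection onto the leading direction of variation in a set of representation vectors. It is a minimal sketch under stated assumptions: the vectors are synthetic stand-ins for a model's learned representations, the poisoned group is simulated as a shift along one coordinate, and the helper names `top_direction` and `spectral_scores` are hypothetical. In practice this kind of analysis is usually run per class on activations from a trained model.

```python
import random


def top_direction(rows, iters=100):
    """Power iteration for the leading eigenvector of X^T X,
    where X holds the (already centered) rows."""
    dim = len(rows[0])
    v = [1.0 / dim ** 0.5] * dim
    for _ in range(iters):
        proj = [sum(r[k] * v[k] for k in range(dim)) for r in rows]
        w = [sum(p * r[k] for p, r in zip(proj, rows)) for k in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def spectral_scores(reps):
    """Outlier score per sample: squared projection of the centered
    representation onto the top direction."""
    dim = len(reps[0])
    mean = [sum(r[k] for r in reps) / len(reps) for k in range(dim)]
    centered = [[r[k] - mean[k] for k in range(dim)] for r in reps]
    v = top_direction(centered)
    return [sum(c[k] * v[k] for k in range(dim)) ** 2 for c in centered]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
DIM, N_CLEAN, N_POISON = 5, 200, 20

# Synthetic stand-ins for learned representations (an assumption).
clean = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_CLEAN)]
# Simulated poisoned samples: shifted along the first coordinate.
poison = [[rng.gauss(0, 1) + (6.0 if k == 0 else 0.0) for k in range(DIM)]
          for _ in range(N_POISON)]

scores = spectral_scores(clean + poison)

# Flag the highest-scoring samples; indices >= N_CLEAN are the poisoned ones.
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
flagged = ranked[:N_POISON]
hits = sum(1 for i in flagged if i >= N_CLEAN)
```

The number of samples to flag is a tuning choice. Here it is set to the known poison count only because the data is synthetic; a real pipeline would need a threshold chosen without that knowledge.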

Mitigation

Certified training defenses (DPA, randomized smoothing) · MEDIUM

Multi-source data provenance tracking · HIGH

Activation clustering for backdoor detection · MEDIUM

Robust aggregation (trimmed mean, Krum) · MEDIUM
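The two robust aggregation rules named above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the update vectors are made-up numbers, and the function names and the `num_byzantine` parameter are assumptions for the example. The trimmed mean drops the extreme values in each coordinate before averaging, while Krum selects the single update whose nearest neighbours are closest to it.

```python
def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` smallest and
    `trim` largest values in each coordinate."""
    if 2 * trim >= len(updates):
        raise ValueError("trim is too large for the number of updates")
    result = []
    for k in range(len(updates[0])):
        column = sorted(u[k] for u in updates)
        kept = column[trim:len(column) - trim]
        result.append(sum(kept) / len(kept))
    return result


def krum(updates, num_byzantine):
    """Return the update with the smallest sum of squared distances to
    its n - f - 2 nearest neighbours (n updates, f assumed malicious)."""
    n = len(updates)
    closest = n - num_byzantine - 2
    if closest < 1:
        raise ValueError("need at least num_byzantine + 3 updates")

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best_index, best_score = None, float("inf")
    for i, u in enumerate(updates):
        dists = sorted(sqdist(u, v) for j, v in enumerate(updates) if j != i)
        score = sum(dists[:closest])
        if score < best_score:
            best_index, best_score = i, score
    return updates[best_index]


# Illustrative data: five honest updates near (1, 1), two poisoned ones.
honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [1.2, 1.0]]
malicious = [[100.0, -100.0], [100.0, -100.0]]
updates = honest + malicious

plain_mean = [sum(u[k] for u in updates) / len(updates) for k in range(2)]
robust_mean = trimmed_mean(updates, trim=2)
selected = krum(updates, num_byzantine=2)
```

With these numbers the plain mean is dragged far from the honest cluster, while `robust_mean` stays close to it and `selected` is one of the honest updates. Both rules depend on an assumed upper bound for the number of poisoned contributions, so they degrade if the attacker controls more sources than that bound allows.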

Chaining

Successful poisoning bypass enables persistent backdoors that can be triggered via T1 (Prompt Subversion). If the attacker also covers their tracks with T10-AT-013 (Audit Log Manipulation), the resulting compromise of model integrity becomes much harder to detect after the fact.

Framework mapping

OWASP LLM: LLM04

MITRE ATLAS: AML.T0020
