T12-AT-010 · HIGH

Feedback Loop Poisoning

T12 · RAG & Knowledge Base Manipulation

Risk score: 215

Rating: High

Procedures: 10

Severity

Mechanism

Many RAG systems include feedback mechanisms (user ratings, click-through tracking, relevance scoring, RLHF-on-retrieval) that update retrieval ranking over time. Feedback loop poisoning manipulates these signals to promote malicious documents and demote legitimate ones. The violated assumption is that aggregated user feedback reflects genuine quality: a small number of coordinated actors can dominate the signal, especially in low-traffic systems where organic feedback is sparse.
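The effect of sparse organic feedback can be illustrated with a small sketch. It assumes a deliberately naive ranker that scores each document by the plain mean of its 1-5 ratings; the function name `ranking_score` and all ratings are hypothetical, chosen only to show how a handful of coordinated votes can flip a ranking in a low-traffic system while barely moving a high-traffic one.

```python
from statistics import mean


def ranking_score(ratings):
    """Hypothetical naive score: the plain mean of all 1-5 ratings."""
    return mean(ratings) if ratings else 0.0


# Low-traffic system: only a few organic ratings per document (made-up data).
legit_doc = [4, 5, 4, 4, 5, 4]   # organic mean is about 4.33
malicious_doc = [2, 1, 2]        # organic mean is about 1.67

# Five attacker-controlled accounts rate the malicious document 5
# and the legitimate document 1.
attacker_accounts = 5
poisoned_malicious = malicious_doc + [5] * attacker_accounts
poisoned_legit = legit_doc + [1] * attacker_accounts

legit_ranks_first_before = ranking_score(legit_doc) > ranking_score(malicious_doc)
legit_ranks_first_after = ranking_score(poisoned_legit) > ranking_score(poisoned_malicious)

# The same five hostile votes against a document with 1000 organic ratings.
high_traffic_score = ranking_score([4] * 1000 + [1] * attacker_accounts)

print(legit_ranks_first_before, legit_ranks_first_after, round(high_traffic_score, 3))
```

Under these assumed numbers the legitimate document ranks first before the attack and second after it, while the high-traffic document's score stays close to 4.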

Detection

Anomaly detection on feedback patterns: detect coordinated/automated feedback from multiple accounts
Monitor for sudden ranking changes not explained by content updates
Statistical testing on feedback distributions; flag non-organic patterns
Observable signal: disproportionate feedback volume from a small number of users/accounts
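The observable signal above can be turned into a simple concentration check. This is a minimal sketch, not a complete detector: it assumes feedback is logged as `(account_id, doc_id)` pairs, and the function name `flag_concentrated_feedback` and the thresholds `top_k`, `share_threshold`, and `min_events` are illustrative assumptions that would need tuning against real traffic.

```python
from collections import Counter, defaultdict


def flag_concentrated_feedback(events, top_k=3, share_threshold=0.5, min_events=10):
    """Flag documents whose feedback comes mostly from a few accounts.

    events: iterable of (account_id, doc_id) pairs (assumed log format).
    Returns {doc_id: share of feedback contributed by the top_k accounts}.
    """
    per_doc = defaultdict(Counter)
    for account_id, doc_id in events:
        per_doc[doc_id][account_id] += 1

    flagged = {}
    for doc_id, counts in per_doc.items():
        total = sum(counts.values())
        if total < min_events:
            continue  # too little data to judge either way
        top = sum(n for _, n in counts.most_common(top_k))
        share = top / total
        if share > share_threshold:
            flagged[doc_id] = share
    return flagged


# Made-up data: doc-a has 20 events from 20 distinct accounts;
# doc-b has 16 events from two accounts plus 4 organic ones.
events = [(f"user-{i}", "doc-a") for i in range(20)]
events += [("acct-x", "doc-b")] * 8 + [("acct-y", "doc-b")] * 8
events += [(f"user-{i}", "doc-b") for i in range(4)]

flagged = flag_concentrated_feedback(events)
print(flagged)
```

A check like this only catches volume concentrated in few accounts. Feedback spread thinly across many fake accounts would pass it, which is where the statistical tests and coordination detection listed above come in.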

Mitigation

Feedback rate limiting: HIGH

Sybil-resistant identity: HIGH

Feedback anomaly detection: MEDIUM

Feedback impact dampening: MEDIUM
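Two of these mitigations, rate limiting and impact dampening, can be sketched together. The sketch assumes one plausible design rather than a prescribed one: each account counts at most once per document (its latest rating wins), and the score is shrunk toward a neutral prior so that a few votes cannot move it far. The function name `dampened_score` and the parameters `prior_mean` and `prior_weight` are hypothetical.

```python
def dampened_score(events, prior_mean=3.0, prior_weight=20.0):
    """Score one document from (account_id, rating) events.

    Rate limiting: only the latest rating per account is counted.
    Dampening: the result is a weighted average of the observed ratings
    and a neutral prior worth `prior_weight` pseudo-votes.
    """
    latest = {}
    for account_id, rating in events:
        latest[account_id] = rating  # later events overwrite earlier ones

    n = len(latest)
    total = sum(latest.values())
    return (prior_weight * prior_mean + total) / (prior_weight + n)


# One account submitting 50 five-star ratings counts as a single vote.
spam_from_one_account = [("acct-x", 5)] * 50
single_account_score = dampened_score(spam_from_one_account)

# Five distinct attacker accounts still move the score, but only modestly.
sybil_events = [(f"sybil-{i}", 5) for i in range(5)]
sybil_score = dampened_score(sybil_events)

print(round(single_account_score, 3), round(sybil_score, 3))
```

Per-account caps stop repeated voting but not votes from many fake accounts, which is why Sybil-resistant identity appears alongside rate limiting in the list above.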

Chaining

Feedback loop poisoning makes T12-AT-001 (Vector Poisoning) persistent: even if poisoned documents are removed, the ranking bias from manipulated feedback remains. It also feeds T12-AT-002 (Retrieval Manipulation) by modifying the ranking algorithm's learned preferences.

Framework mapping

OWASP LLM: LLM08

MITRE ATLAS: AML.T0020
