T12-AT-014 · HIGH

Similarity Search Hijacking

T12 · RAG & Knowledge Base Manipulation

Risk score: 210

Rating: High

Procedures: 10

Severity

Mechanism

Similarity search (exact k-NN or approximate nearest neighbor, ANN) is the core retrieval mechanism in vector RAG. Hijacking exploits mathematical properties of the similarity computation: the sensitivity of unnormalized dot-product scoring to vector magnitude (cosine similarity itself is magnitude-invariant, so the exposure arises where vectors are not normalized); approximation errors in ANN indexes such as HNSW, including those built with libraries like FAISS, which trade accuracy for speed; distance metric assumptions (L2 vs. cosine vs. dot product) that create different adversarial surfaces; and k-NN's vulnerability to adversarial nearest-neighbor injection. The Black-Hole Attack (April 2026) demonstrated that vectors near the geometric centroid of a high-dimensional space are nearest neighbors to disproportionately many queries. This is a fundamental geometric property, potentially exploitable in any similarity search system.
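The two geometric effects can be illustrated on synthetic data. The sketch below is a toy model, not a reproduction of the Black-Hole Attack: it assumes embeddings that share a common offset plus Gaussian noise, and it uses brute-force search in NumPy. The helper names (`top_k`, `hit_rates`) and all parameter values are hypothetical. It injects one vector at the corpus centroid, then separately injects a norm-inflated copy of an ordinary document, and measures how often each lands in the top-k.

```python
import numpy as np


def top_k(queries, corpus, k, metric):
    """Indices of the k best-scoring corpus rows per query (exact search)."""
    if metric == "cosine":
        queries = queries / np.linalg.norm(queries, axis=1, keepdims=True)
        corpus = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    elif metric != "dot":
        raise ValueError(f"unknown metric: {metric}")
    scores = queries @ corpus.T
    return np.argsort(-scores, axis=1)[:, :k]


def hit_rates(queries, corpus, k, metric):
    """Per-document fraction of queries whose top-k contains that document."""
    ids = top_k(queries, corpus, k, metric)
    return np.bincount(ids.ravel(), minlength=len(corpus)) / len(queries)


rng = np.random.default_rng(0)
dim, n_docs, n_queries, k = 384, 2000, 500, 5

# Toy embeddings (assumption): a shared offset plus noise, so all vectors
# sit in a cone around a common direction.
offset = np.ones(dim)
docs = offset + rng.normal(size=(n_docs, dim))
queries = offset + rng.normal(size=(n_queries, dim))

# Injection 1: a vector at the corpus centroid, appended as the last row.
with_centroid = np.vstack([docs, docs.mean(axis=0)])
centroid_rates = hit_rates(queries, with_centroid, k, "cosine")

# Injection 2: an ordinary document scaled to 10x its norm, appended as the last row.
with_inflated = np.vstack([docs, 10.0 * docs[0]])
inflated_dot = hit_rates(queries, with_inflated, k, "dot")
inflated_cos = hit_rates(queries, with_inflated, k, "cosine")

print("centroid vector hit rate (cosine):", centroid_rates[-1])
print("median ordinary doc hit rate (cosine):", np.median(centroid_rates[:-1]))
print("inflated vector hit rate (dot):", inflated_dot[-1])
print("inflated vector hit rate (cosine):", inflated_cos[-1])
```

Under these assumptions the centroid vector should appear for most queries under cosine scoring, while norm inflation only pays off when scores are unnormalized dot products. Real embedding distributions differ, so the size of the effect will vary.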

Detection

Monitor for vectors with anomalously high average similarity to the corpus (potential universal attractors)
Compare ANN results against exact k-NN for canary queries; a persistent recall gap can indicate approximation exploitation
Track retrieval result diversity; flag when a small number of documents dominate across many queries
Observable signal: documents appearing in top-k results for a disproportionate number of diverse queries
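The detection signals above can be sketched as offline checks over an exported copy of the index. This is a minimal illustration under stated assumptions: the function names, the `max_share` threshold, and the z-score cutoff of 4 are hypothetical choices, the data is the same kind of synthetic offset-plus-noise toy corpus with one injected centroid vector, and the ANN result lists passed to `recall_at_k` are hand-written stand-ins for the output of a real index.

```python
import numpy as np


def exact_top_k(queries, corpus, k):
    """Exact cosine top-k by brute force; the reference an ANN index is checked against."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return np.argsort(-(q @ c.T), axis=1)[:, :k]


def flag_dominant_docs(top_k_ids, n_docs, max_share=0.2):
    """Ids of documents returned for more than `max_share` of the queries."""
    counts = np.bincount(np.asarray(top_k_ids).ravel(), minlength=n_docs)
    return np.flatnonzero(counts > max_share * len(top_k_ids))


def mean_corpus_similarity(corpus):
    """Average cosine similarity of each vector to every other vector in the corpus."""
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ c.T
    np.fill_diagonal(sims, 0.0)  # exclude self-similarity
    return sims.sum(axis=1) / (len(corpus) - 1)


def recall_at_k(ann_ids, exact_ids):
    """Mean overlap between ANN and exact result lists for the same canary queries."""
    overlaps = [len(set(a) & set(e)) / len(e) for a, e in zip(ann_ids, exact_ids)]
    return float(np.mean(overlaps))


rng = np.random.default_rng(0)
dim, n_docs, n_queries, k = 384, 2000, 300, 5
offset = np.ones(dim)
docs = offset + rng.normal(size=(n_docs, dim))
queries = offset + rng.normal(size=(n_queries, dim))
corpus = np.vstack([docs, docs.mean(axis=0)])  # last row: injected centroid vector

# Signal 1: documents that dominate the top-k across many diverse queries.
results = exact_top_k(queries, corpus, k)
dominant = flag_dominant_docs(results, len(corpus))

# Signal 2: vectors with anomalously high average similarity to the corpus.
sim = mean_corpus_similarity(corpus)
z = (sim - sim.mean()) / sim.std()
attractors = np.flatnonzero(z > 4.0)

# Signal 3: canary recall, using hand-written ids in place of a real ANN index.
canary_recall = recall_at_k([[1, 2, 3], [4, 5, 6]], [[1, 2, 4], [4, 5, 6]])

print("dominant ids:", dominant)
print("attractor ids:", attractors)
print("canary recall:", canary_recall)
```

The all-pairs similarity matrix is quadratic in corpus size, so on a large index this check would likely run on a sample. Thresholds need tuning per corpus, since some benign documents are natural hubs.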

Mitigation

Result diversity enforcement: HIGH

Exact k-NN verification: HIGH

Centroid anomaly detection: HIGH

Multi-metric retrieval: MEDIUM
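Two of these mitigations can be sketched briefly. The code below is one possible interpretation, not the catalog's reference implementation: `multi_metric_top_k` treats multi-metric retrieval as requiring a document to rank well under both dot product and L2 distance, and `ExposureCap` treats result diversity enforcement as a rolling per-document cap on how often a document may be served. All names, window sizes, and thresholds are hypothetical, and the data is synthetic.

```python
from collections import Counter, deque

import numpy as np


def multi_metric_top_k(query, corpus, k=5, pool=50):
    """Return up to k documents that rank in the top `pool` under both dot product and L2."""
    dot_order = np.argsort(-(corpus @ query))[:pool]
    l2_order = np.argsort(np.linalg.norm(corpus - query, axis=1))[:pool]
    l2_pool = set(l2_order.tolist())
    # dot_order is already sorted by score, so the agreed list keeps that order.
    return [int(i) for i in dot_order if int(i) in l2_pool][:k]


class ExposureCap:
    """Skip documents that were served in too large a share of recent queries."""

    def __init__(self, window=200, max_share=0.2, warmup=20):
        self.recent = deque(maxlen=window)  # ids served for each recent query
        self.counts = Counter()             # how often each id was served in the window
        self.max_share = max_share
        self.warmup = warmup                # no filtering until this many queries are seen

    def filter(self, ranked_ids, k):
        n = len(self.recent)
        kept = [
            d for d in ranked_ids
            if n < self.warmup or self.counts[d] <= self.max_share * n
        ][:k]
        if n == self.recent.maxlen:
            self.counts.subtract(self.recent[0])  # oldest entry is evicted by append
        self.recent.append(kept)
        self.counts.update(kept)
        return kept


rng = np.random.default_rng(0)
dim, n_docs = 384, 2000
offset = np.ones(dim)
docs = offset + rng.normal(size=(n_docs, dim))
corpus = np.vstack([docs, 10.0 * docs[0]])  # last row: norm-inflated vector
query = offset + rng.normal(size=dim)

plain_top1 = int(np.argmax(corpus @ query))   # dot product alone
agreed = multi_metric_top_k(query, corpus)    # dot product and L2 must agree

cap = ExposureCap()
# Simulated ranked lists in which hypothetical document 99 always ranks first.
served = [cap.filter([99, i, i + 1000], k=2) for i in range(100, 300)]
share_99 = sum(99 in ids for ids in served) / len(served)

print("plain dot-product top-1:", plain_top1)
print("multi-metric result:", agreed)
print("share of queries serving doc 99:", share_99)
```

Metric agreement removes the norm-inflated vector because it is far from every query in L2 terms, but a centroid-like vector can score well under both metrics, so this check does not replace centroid anomaly detection. The exposure cap trades some relevance for diversity: a legitimately popular document is throttled the same way as a malicious attractor.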

Chaining

Similarity search hijacking is the algorithm-level implementation of T12-AT-002 (Retrieval Manipulation) and T12-AT-005 (Embedding Manipulation). By controlling what the retrieval system returns, it enables all downstream T12 techniques.

Framework mapping

OWASP LLM: LLM08

MITRE ATLAS: AML.T0043
