Similarity Search Hijacking
Similarity search (k-NN, ANN) is the core retrieval mechanism in vector RAG. Hijacking exploits mathematical properties of the similarity algorithm: the sensitivity of dot-product scoring to vector magnitude (which cosine similarity removes only when vectors are actually normalized), approximation errors in ANN indexes such as HNSW and libraries such as FAISS that trade accuracy for speed, distance metric assumptions (L2 vs. cosine vs. dot product) that create different adversarial surfaces, and k-NN's vulnerability to adversarial nearest-neighbor injection. The Black-Hole Attack (April 2026) demonstrated that vectors near the geometric centroid of a high-dimensional space are nearest neighbors to disproportionately many queries, a fundamental geometric property exploitable in any similarity search system.
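The two geometric effects named in the description can be reproduced on toy data. The following is a minimal sketch, not a reproduction of any published attack. It assumes isotropic Gaussian vectors as a stand-in for real embeddings, exact (brute-force) L2 search, and illustrative sizes (`dim`, `n_docs`, `n_queries`, `k`). It shows how scaling a vector changes its dot-product score but not its cosine score, and how a single vector placed at the corpus mean lands in the top-k for most queries.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_docs, n_queries, k = 128, 1000, 200, 5

# Assumption: isotropic Gaussian vectors stand in for real embeddings.
corpus = rng.normal(size=(n_docs, dim))
queries = rng.normal(size=(n_queries, dim))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# 1) Magnitude: scaling a stored vector by 10 multiplies its dot-product score
#    by 10 but leaves its cosine score unchanged.
q, v = queries[0], corpus[0]
dot_plain, dot_scaled = float(q @ v), float(q @ (10 * v))
cos_shift = abs(cosine(q, 10 * v) - cosine(q, v))

# 2) Centroid: inject one vector at the corpus mean, then count how often it
#    appears in the exact L2 top-k.
injected_id = n_docs
index = np.vstack([corpus, corpus.mean(axis=0)])

# Squared L2 distance via ||q||^2 + ||x||^2 - 2 q.x, shape (n_queries, n_docs + 1).
d2 = (queries**2).sum(axis=1)[:, None] + (index**2).sum(axis=1)[None, :] - 2 * queries @ index.T
topk = np.argsort(d2, axis=1)[:, :k]

hit_rate = float((topk == injected_id).any(axis=1).mean())
baseline = k / len(index)  # share of queries an average document would appear in
print(f"injected vector in top-{k} for {hit_rate:.0%} of queries (uniform baseline {baseline:.2%})")
```

On real embedding models the size of the effect depends on the metric, the normalization step, and how the corpus is distributed, so the numbers printed here should be read as an illustration of the geometry only.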
- Monitor for vectors with anomalously high average similarity to the corpus (potential universal attractors)
- Compare ANN results against exact k-NN for canary queries; a drop in recall on those queries can indicate that the approximation is being exploited
- Track retrieval result diversity; flag when a small number of documents dominate across many queries
- Observable signal: documents appearing in top-k results for a disproportionate number of diverse queries
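The monitoring signals listed above could be computed roughly as follows. This is a hedged sketch under stated assumptions: cosine similarity over an in-memory NumPy matrix, a synthetic corpus with a shared offset to mimic an anisotropic embedding space, and a median-plus-MAD threshold chosen purely for illustration. All function names (`mean_corpus_similarity`, `k_occurrence`, `recall_at_k`, `flag_high`) are hypothetical helpers, not part of any library, and the "approximate" search is simulated by searching only half the index rather than by calling a real ANN library.

```python
import numpy as np

def unit(x):
    """Row-normalize so that dot products equal cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def mean_corpus_similarity(vectors):
    """Mean cosine similarity of each vector to every other vector in the index."""
    u = unit(vectors)
    sims = u @ u.T
    np.fill_diagonal(sims, 0.0)  # ignore self-similarity
    return sims.sum(axis=1) / (len(u) - 1)

def k_occurrence(topk_ids, n_docs):
    """Number of query result lists in which each document id appears."""
    return np.bincount(np.asarray(topk_ids).ravel(), minlength=n_docs)

def recall_at_k(ann_ids, exact_ids):
    """Per-query share of the exact top-k that the approximate search also returned."""
    return np.array([len(set(a) & set(e)) / len(e) for a, e in zip(ann_ids, exact_ids)])

def flag_high(scores, z=4.0):
    """Ids scoring above median + z robust standard deviations (illustrative threshold)."""
    scores = np.asarray(scores, dtype=float)
    med = np.median(scores)
    mad = np.median(np.abs(scores - med)) + 1e-12
    return np.flatnonzero(scores > med + z * 1.4826 * mad)

# Synthetic demo. Assumption: Gaussian vectors with a shared offset.
rng = np.random.default_rng(0)
dim, n_docs, n_queries, k = 256, 1000, 500, 5
corpus = rng.normal(size=(n_docs, dim)) + 0.5
queries = rng.normal(size=(n_queries, dim)) + 0.5

# Simulated attack: one vector at the mean direction of the normalized corpus.
attractor = unit(corpus).mean(axis=0, keepdims=True)
index = np.vstack([corpus, attractor])
attractor_id = n_docs

# Exact cosine top-k for every query (brute force).
exact_topk = np.argsort(-(unit(queries) @ unit(index).T), axis=1)[:, :k]

sim_flags = flag_high(mean_corpus_similarity(index))  # universal-attractor check
hub_counts = k_occurrence(exact_topk, len(index))     # retrieval-diversity check
hub_flags = flag_high(hub_counts)

# Canary check: in practice `ann_topk` would come from the production ANN index.
# Here a search over only half the index stands in for an approximate search.
half = len(index) // 2
ann_topk = np.argsort(-(unit(queries) @ unit(index[:half]).T), axis=1)[:, :k]
canary_recall = recall_at_k(ann_topk, exact_topk)
print(f"attractor hits: {hub_counts[attractor_id]}/{n_queries}, mean canary recall: {canary_recall.mean():.2f}")
```

In a deployment the top-k ids would be taken from retrieval logs rather than recomputed, and the full pairwise similarity matrix would likely need to be replaced by sampling once the corpus grows beyond what fits in memory.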
Similarity search hijacking is the algorithm-level implementation of T12-AT-002 (Retrieval Manipulation) and T12-AT-005 (Embedding Manipulation). It enables all downstream T12 techniques by controlling what the retrieval system returns.