T10-AT-003 · HIGH

Membership Inference Attacks

T10 · Integrity & Confidentiality Breach
Risk score: 220
Rating: High
Procedures: 10
Mechanism

Membership inference exploits the measurable behavioral difference between how a model processes data it was trained on and data it has not seen. For LLMs, this manifests as a perplexity gap: the model tends to assign lower perplexity (higher confidence, lower surprise) to sequences from its training set. The attacker doesn't need to extract the data; they only need to determine whether a specific record was included.
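
The perplexity gap described above can be illustrated with a minimal sketch of a threshold-based membership test. It assumes the attacker can obtain per-token natural-log probabilities for a candidate sequence and holds a small set of sequences known not to be in the training set. The function names, the 5% quantile, and the input shape are all hypothetical choices for illustration, not a specific published attack.

```python
import math
from statistics import mean


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-mean(token_logprobs))


def calibrate_threshold(non_member_logprobs, quantile=0.05):
    """Pick a perplexity cutoff that few known non-members fall below.

    non_member_logprobs: list of log-prob lists, one per known non-member
    sequence (an assumed calibration set).
    """
    scores = sorted(perplexity(lp) for lp in non_member_logprobs)
    index = max(0, int(quantile * len(scores)) - 1)
    return scores[index]


def is_likely_member(token_logprobs, threshold):
    """Flag a candidate whose perplexity is below the calibrated cutoff."""
    return perplexity(token_logprobs) < threshold
```

A single fixed threshold is the simplest form of the test; it ignores how intrinsically predictable a sequence is, so common boilerplate text can score as a false positive.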

Detection
  • Statistical analysis of query patterns: repeated queries with minor variations on the same entity signal membership probing
  • Monitor for perplexity-differential probing patterns (same prompt template, varying the candidate data)
  • Alert on queries referencing specific documents, individuals, or records combined with meta-questions about model knowledge
  • Reference: sigma/t07-data-exfiltration.yml (adaptable for membership inference query patterns)
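
The template-probing signal in the list above can be sketched as a simple log analysis. This is an illustrative heuristic, not the referenced Sigma rule: it assumes query logs are available as (client_id, query) pairs and that candidate data appears in double quotes inside the prompt. The names `template_of`, `flag_probing`, and the `min_variants` default are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: the varying candidate text is wrapped in double quotes.
_QUOTED = re.compile(r'"[^"]*"')


def template_of(query):
    """Mask quoted candidate text and normalize case and whitespace."""
    masked = _QUOTED.sub("<CANDIDATE>", query.lower())
    return " ".join(masked.split())


def flag_probing(log, min_variants=5):
    """Return {(client_id, template): distinct_query_count} for suspects.

    A client is flagged when it sends at least min_variants distinct
    queries that collapse to the same template.
    """
    variants = defaultdict(set)
    for client_id, query in log:
        variants[(client_id, template_of(query))].add(query)
    return {key: len(qs) for key, qs in variants.items() if len(qs) >= min_variants}
```

In practice the masking step would need to match how candidate data is actually embedded in prompts, and the threshold would be tuned against normal traffic.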
Mitigation
Differential privacy during training · HIGH
Confidence score suppression · HIGH
Output perturbation (temperature > 0) · MEDIUM
Self-distillation (ACL Findings 2025) · MEDIUM
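
As one illustration of confidence score suppression, the sketch below returns only the top label and a coarse confidence bucket instead of raw probabilities, so small confidence differences between member and non-member inputs are not exposed to the caller. The function name, the three-bucket scheme, and the input shape (a label-to-probability mapping) are assumptions made for this example.

```python
def suppress_confidence(scores, buckets=("low", "medium", "high")):
    """Return the top label with a coarse bucket instead of raw scores.

    scores: dict mapping label -> probability in [0, 1] (assumed shape).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    label, p = max(scores.items(), key=lambda kv: kv[1])
    # Map the probability onto equal-width buckets, clamping p == 1.0.
    index = min(int(p * len(buckets)), len(buckets) - 1)
    return {"label": label, "confidence": buckets[index]}
```

With these buckets, inputs scored at 0.97 and 0.91 produce the same response, which removes the fine-grained signal a threshold attack relies on. Coarser buckets hide more at the cost of less useful output for legitimate callers.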
Chaining

Confirmed membership enables T10-AT-001 (Training Data Extraction) by identifying which sequences are extractable, and informs T10-AT-008 (Attribute Inference) by confirming which individuals' data informed the model.

Framework mapping
OWASP LLM: LLM02
MITRE ATLAS: AML.T0024.000