T10-AT-003 · HIGH

Membership Inference Attacks

T10 · Integrity & Confidentiality Breach

Risk score: 220

Rating: High

Procedures: 10

Severity

Mechanism

Membership inference exploits the measurable behavioral difference between how a model processes data it was trained on and data it has not seen. For LLMs, this manifests as a perplexity gap: the model tends to assign lower perplexity (higher confidence, lower surprise) to sequences from its training set. The attacker does not need to extract the data; they only need to determine whether a specific record was included.
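The perplexity gap can be illustrated with a minimal sketch. It assumes the attacker can obtain per-token log-probabilities for a candidate sequence. The log-probability values, the `THRESHOLD` constant, and the function names are hypothetical and chosen only for illustration; a real attack would calibrate the threshold against data known to be outside the training set.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp(-mean natural-log probability) over one sequence."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def infer_membership(token_logprobs: Sequence[float], threshold: float) -> bool:
    """Guess 'member' when perplexity falls below the given threshold."""
    return perplexity(token_logprobs) < threshold


# Hypothetical per-token log-probabilities, made up for illustration.
candidate_logprobs = [-0.2, -0.1, -0.3, -0.2]  # model shows little surprise
reference_logprobs = [-2.1, -1.8, -2.5, -2.0]  # model shows high surprise

THRESHOLD = 3.0  # assumed value; would be calibrated on known non-members

print(infer_membership(candidate_logprobs, THRESHOLD))  # low perplexity
print(infer_membership(reference_logprobs, THRESHOLD))  # high perplexity
```

A single fixed threshold is the simplest possible decision rule. It is used here only to make the gap concrete, not as a description of any specific published attack.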

Detection

Statistical analysis of query patterns: repeated queries with minor variations on the same entity signal membership probing
Monitor for perplexity-differential probing patterns (same prompt template, varying the candidate data)
Alert on queries referencing specific documents, individuals, or records combined with meta-questions about model knowledge
Reference: sigma/t07-data-exfiltration.yml (adaptable for membership inference query patterns)
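The first two detection signals above (repeated queries with minor variations, one template with varying candidate data) can be approximated with a simple similarity check over a query log. The sketch below is an assumption-laden illustration, not the referenced Sigma rule: the log format of `(client_id, query)` tuples, the function names, and the `min_ratio` and `min_pairs` thresholds are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def is_near_duplicate(a: str, b: str, min_ratio: float = 0.8) -> bool:
    """True when two queries are highly similar but not identical."""
    return a != b and SequenceMatcher(None, a, b).ratio() >= min_ratio


def flag_probing_clients(query_log, min_ratio=0.8, min_pairs=3):
    """Return client IDs with at least `min_pairs` near-duplicate query pairs."""
    by_client = defaultdict(list)
    for client_id, query in query_log:
        by_client[client_id].append(query)

    flagged = set()
    for client_id, queries in by_client.items():
        # Count every unordered pair of queries that looks like a template variant.
        pairs = sum(
            is_near_duplicate(queries[i], queries[j], min_ratio)
            for i in range(len(queries))
            for j in range(i + 1, len(queries))
        )
        if pairs >= min_pairs:
            flagged.add(client_id)
    return flagged


# Hypothetical log entries, made up for illustration.
log = [
    ("client-a", "Does the record 'Jane Doe, born 1984, Springfield' look familiar?"),
    ("client-a", "Does the record 'Jane Doe, born 1985, Springfield' look familiar?"),
    ("client-a", "Does the record 'Jane Doe, born 1986, Springfield' look familiar?"),
    ("client-b", "Summarize the quarterly report in three bullet points."),
    ("client-b", "What is the capital of France?"),
]

print(flag_probing_clients(log))
```

The pairwise comparison is quadratic in the number of queries per client, so a production detector would likely bucket queries by time window or by a normalized template hash first.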

Mitigation

Differential privacy during training: HIGH

Confidence score suppression: HIGH

Output perturbation (temperature > 0): MEDIUM

Self-distillation (ACL Findings 2025): MEDIUM
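As an illustration of confidence score suppression, the sketch below shows two ways a serving layer might limit the signal an attacker can read from model outputs: returning only the top label, or rounding scores to a coarse grid. The function names, the `step` parameter, and the example scores are hypothetical. This is one possible reading of the mitigation, not a prescribed implementation, and it carries no formal privacy guarantee.

```python
from typing import Dict


def suppress_confidence(scores: Dict[str, float]) -> str:
    """Return only the top label, withholding all numeric confidence."""
    return max(scores, key=scores.get)


def coarsen_confidence(scores: Dict[str, float], step: float = 0.25) -> Dict[str, float]:
    """Round each score to a coarse grid so small confidence gaps disappear."""
    return {label: round(p / step) * step for label, p in scores.items()}


# Hypothetical scores: the model is slightly more confident on a training member.
member_scores = {"positive": 0.97, "negative": 0.03}
nonmember_scores = {"positive": 0.91, "negative": 0.09}

print(suppress_confidence(member_scores))
print(coarsen_confidence(member_scores) == coarsen_confidence(nonmember_scores))
```

Coarsening trades output fidelity for privacy: a larger `step` hides more of the member versus non-member gap but makes the scores less useful to legitimate callers.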

Chaining

Confirmed membership enables T10-AT-001 (Training Data Extraction) by identifying which sequences are extractable, and informs T10-AT-008 (Attribute Inference) by confirming which individuals' data informed the model.

Framework mapping

OWASP LLM: LLM02

MITRE ATLAS: AML.T0024.000
