T5-AT-005MEDIUM

Model Fingerprinting

Risk score185

RatingMedium

Procedures1

Severity

Mechanism

LLM deployments exhibit model-specific behavioral signatures in their outputs, error messages, tokenization patterns, and response latencies that allow an attacker to determine the exact model, version, and sometimes configuration behind an API. The design assumption is that abstracting the model behind a generic API endpoint hides its identity. The gap: every model family has distinctive behavioral fingerprints — response to edge-case prompts, tokenization boundaries visible in response artifacts, vocabulary-specific probability distributions, and characteristic error message formatting.

Detection

Monitor for rapid sequential queries with high prompt diversity but minimal conversational coherence (fingerprinting pattern)
Detect known fingerprinting prompt signatures (LLMmap, ProFLingo discriminative prompts)
Alert on queries probing tokenization behavior, control character handling, or self-identification
Log and correlate multi-probe sessions that systematically explore model behavioral boundaries

Mitigation

Response normalization (standardize formatting, strip model-specific artifacts)MEDIUM

Block known fingerprinting prompt patternsLOW

Model version abstraction layer (proxy that homogenizes behavior)MEDIUM

Response perturbation (small random variations)LOW

Chaining

Model fingerprinting is the reconnaissance phase that enables targeted attacks. Identified model → known jailbreaks (T1, T2, T3), known extraction vulnerabilities (T5-AT-002), known safety alignment gaps, and version-specific CVEs.

Framework mapping

OWASP LLMLLM06

MITRE ATLASAML.T0001;AML.T0014

Open in the technique browser →