Model Fingerprinting
T5 · Model & API Exploitation
LLM deployments exhibit model-specific behavioral signatures in their outputs, error messages, tokenization patterns, and response latencies. These signatures can allow an attacker to determine the exact model, version, and sometimes configuration behind an API. The design assumption is that abstracting the model behind a generic API endpoint hides its identity. The gap: every model family has distinctive behavioral fingerprints, including its responses to edge-case prompts, tokenization boundaries visible in response artifacts, vocabulary-specific probability distributions, and characteristic error message formatting.
- Monitor for rapid sequential queries with high prompt diversity but minimal conversational coherence (fingerprinting pattern)
- Detect known fingerprinting prompt signatures (LLMmap, ProFLingo discriminative prompts)
- Alert on queries probing tokenization behavior, control character handling, or self-identification
- Log and correlate multi-probe sessions that systematically explore model behavioral boundaries
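As an illustration of the first and third detection points, here is a minimal Python sketch of a per-session monitor. It flags a session when recent prompts arrive close together with little word overlap between consecutive prompts, or when several prompts match probe patterns for self-identification, tokenization, or control characters. Everything named here (`SessionMonitor`, `PROBE_PATTERNS`, the thresholds, and the regexes) is a hypothetical assumption chosen for illustration, not a tuned or published rule set. Matching known LLMmap or ProFLingo prompt signatures is not covered, since that would require a signature list not given here.

```python
import re
from collections import deque
from dataclasses import dataclass, field

# Hypothetical probe patterns (assumptions, not a published signature set).
PROBE_PATTERNS = [
    re.compile(r"\b(which|what)\s+(model|llm|version)\b", re.I),   # self-identification
    re.compile(r"\bwho\s+(made|created|trained|built)\s+you\b", re.I),
    re.compile(r"\b(tokeniz\w*|token\s+boundar\w*)\b", re.I),      # tokenization probing
    re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]"),                   # raw control characters
]


def word_set(text):
    """Lowercased set of word-like tokens in a prompt."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class SessionMonitor:
    # All thresholds are illustrative assumptions and would need tuning.
    window_seconds: float = 60.0       # sliding window length
    min_queries: int = 5               # queries in window before judging diversity
    max_mean_similarity: float = 0.1   # at or below this, prompts look unrelated
    min_probe_hits: int = 2            # probe-pattern matches needed to alert
    events: deque = field(default_factory=deque)  # (timestamp, words, is_probe)

    def observe(self, timestamp, prompt):
        """Record one prompt and return the list of alert reasons (may be empty)."""
        is_probe = any(p.search(prompt) for p in PROBE_PATTERNS)
        self.events.append((timestamp, word_set(prompt), is_probe))
        # Drop events that have fallen out of the sliding window.
        while self.events and timestamp - self.events[0][0] > self.window_seconds:
            self.events.popleft()
        return self.assess()

    def assess(self):
        """Evaluate the current window against both heuristics."""
        reasons = []
        evs = list(self.events)
        if len(evs) >= self.min_queries:
            # Mean word overlap between consecutive prompts as a coherence proxy.
            sims = [jaccard(evs[i][1], evs[i + 1][1]) for i in range(len(evs) - 1)]
            if sum(sims) / len(sims) <= self.max_mean_similarity:
                reasons.append("rapid_diverse_queries")
        if sum(1 for e in evs if e[2]) >= self.min_probe_hits:
            reasons.append("probe_patterns")
        return reasons
```

In use, one `SessionMonitor` would be kept per API key or session, with `observe(timestamp, prompt)` called on each request and any non-empty result logged for correlation. Word-overlap similarity is a deliberately crude stand-in for conversational coherence; an embedding-based measure would likely separate benign topic changes from systematic probing more reliably.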
Model fingerprinting is the reconnaissance phase that enables targeted attacks. Once the model is identified, an attacker can select known jailbreaks (T1, T2, T3), known extraction vulnerabilities (T5-AT-002), known safety alignment gaps, and version-specific CVEs that apply to it.