Expertise Assumption
T3 · Reasoning & Constraint Exploitation →Models adjust response depth and specificity based on perceived audience expertise — RLHF training rewards providing expert-level detail to experts and simplified explanations to novices. When a user claims professional credentials, the model's audience model shifts to expect and reward higher technical specificity, which directly conflicts with safety constraints that limit operational detail. The model cannot verify claimed credentials, creating an *unverifiable trust claim* that the safety evaluation must process probabilistically.
- Credential claims in prompts: detect "I am a [professional role]" patterns co-occurring with restricted content requests
- Credential-payload consistency: flag when claimed expertise doesn't match the type of knowledge requested (pharmacist requesting clandestine synthesis, bomb squad requesting construction)
- Unverifiable authority claims: any claim of professional authorization in a public AI interface is inherently unverifiable
Expertise claims persist across turns and compound with T3-AT-002 (Academic Pretense) — a claimed credential followed by academic framing creates a dual credibility layer. Chains into T3-AT-012 (Capability Testing) where claimed expertise justifies testing the model's knowledge depth.