T1-AT-004HIGH

Instruction Prefix/Suffix

T1 · Prompt & Context Subversion →
Risk score235
RatingHigh
Procedures6
Severity
Mechanism

Exploits the model's processing of pseudo-code and command-line syntax. important`, and `//OVERRIDE` as authority-escalation primitives in their original domains. The attack hypothesis is that these patterns carry residual authority weight in the model's attention when they appear in natural-language context — the model may partially activate the "elevated privilege" association even though the user has no actual system access.

Detection
  • Regex detection of command-line, code, and configuration syntax in natural-language prompts: SUDO, --force, !important, //OVERRIDE, #AUTHORIZED, BEGIN_*_BLOCK
  • Unicode-aware detection (attacker may use homoglyphs or zero-width characters to evade regex)
  • YARA rule: yara/t01-prompt-injection.yar
Mitigation
Input sanitization (strip known code-authority tokens before model processing)MEDIUM
Constitutional ClassifiersHIGH
Training-time: reduce authority weight of code-syntax tokens in natural-language contextHIGH
Chaining

Direct ancestor of Policy Puppetry (T2 encoding techniques). When prefix/suffix attacks fail individually, they chain with T2 (Semantic Evasion) encoding to create compound payloads.

Framework mapping
OWASP LLMLLM01
MITRE ATLASAML.T0051.001
Open in the technique browser →