T2-AT-003 · MEDIUM

Encoding and Obfuscation

T2 · Semantic & Linguistic Evasion
Risk score: 190
Rating: Medium
Procedures: 10
Mechanism

This technique exploits the gap between what the safety classifier sees (encoded tokens) and what the model understands (decoded meaning). Models trained on internet data can decode Base64, ROT13, leetspeak, hexadecimal, Morse code, and other encoding schemes because these appear in their training corpora. Safety classifiers typically operate on the literal input tokens, not the decoded content, so an encoded payload can pass classification and still be understood by the model.
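
The gap can be illustrated with a minimal sketch. It assumes a hypothetical substring-based filter (`naive_filter`) and a benign placeholder phrase in `BLOCKLIST`; neither comes from this entry, and real safety classifiers are far more sophisticated. The point is only that the literal encoded text no longer contains the phrase the filter looks for.

```python
import base64
import codecs

# Hypothetical blocklist with a benign placeholder phrase (an assumption).
BLOCKLIST = {"secret payload"}

def naive_filter(text: str) -> bool:
    """Return True if the literal text contains a blocked phrase."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

plain = "secret payload"

# The same phrase under three of the encodings named above.
encoded_variants = {
    "base64": base64.b64encode(plain.encode("utf-8")).decode("ascii"),
    "rot13": codecs.encode(plain, "rot13"),
    "hex": plain.encode("utf-8").hex(),
}

print("plain", naive_filter(plain))
for name, value in encoded_variants.items():
    # The filter inspects only the literal characters, so each variant passes.
    print(name, value, naive_filter(value))
```

The filter flags the plain phrase but none of the encoded variants, even though each variant decodes back to the same text.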

Detection
  • Decode all known encodings (Base64, ROT13, hex, URL-encoding, leetspeak) before safety classification
  • Detect encoding signatures: Base64 padding (==), hex patterns (0x, consecutive hex pairs), NATO alphabet word sequences
  • YARA rule: yara/t02-encoding-evasion.yar
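
The signature checks listed above could be sketched as follows. This is an illustrative heuristic, not the contents of the referenced YARA rule: the regular expressions, the length thresholds (16 characters, 8 hex pairs, 3 NATO words), and the function names are all assumptions chosen for the example, and they will produce false positives on inputs such as long decimal numbers.

```python
import base64
import binascii
import re

# Assumed thresholds: runs shorter than these are ignored to limit noise.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_RUN = re.compile(r"0x[0-9a-fA-F]+|(?:[0-9a-fA-F]{2}[ :]?){8,}")
NATO = {
    "alfa", "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliett", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "x-ray", "yankee", "zulu",
}

def looks_like_base64(text: str) -> bool:
    """True if some run decodes as Base64 into printable UTF-8 text."""
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        if len(candidate) % 4:
            continue
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, ValueError):
            continue  # not valid Base64, or not valid UTF-8 after decoding
        if decoded.isprintable():
            return True
    return False

def longest_nato_run(text: str) -> int:
    """Length of the longest run of consecutive NATO alphabet words."""
    longest = current = 0
    for word in re.findall(r"[a-z-]+", text.lower()):
        current = current + 1 if word in NATO else 0
        longest = max(longest, current)
    return longest

def encoding_signatures(text: str) -> set:
    """Return the names of the encoding signatures found in the text."""
    found = set()
    if looks_like_base64(text):
        found.add("base64")
    if HEX_RUN.search(text):
        found.add("hex")
    if longest_nato_run(text) >= 3:
        found.add("nato")
    return found
```

A signature hit is a reason to decode and re-classify the input, not a verdict by itself, since ordinary text can match these patterns by chance.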
Mitigation
Input normalization (decode all known encodings before classification) · HIGH
Multi-layer decoding (detect and decode nested encodings) · HIGH
Constitutional Classifiers · HIGH
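
Input normalization with multi-layer decoding might look like the sketch below. It assumes a caller-supplied `classifier` callable that returns True for content that should be blocked. The decoder set (Base64, hex, URL-encoding, ROT13), the helper names, and the depth limit of 3 are illustrative choices, not a prescribed implementation. Each decoder is applied to every intermediate result, so nested encodings such as hex-wrapped Base64 are unwrapped layer by layer.

```python
import base64
import codecs
from urllib.parse import unquote

def _try_base64(text):
    stripped = text.strip()
    if len(stripped) < 8 or len(stripped) % 4:
        return None
    try:
        return base64.b64decode(stripped, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None

def _try_hex(text):
    compact = text.strip().replace(" ", "")
    if len(compact) < 8:
        return None
    try:
        return bytes.fromhex(compact).decode("utf-8")
    except ValueError:
        return None

def _try_url(text):
    decoded = unquote(text)
    return decoded if decoded != text else None

def _rot13(text):
    # ROT13 never fails, so it always yields one more candidate view.
    return codecs.decode(text, "rot13")

DECODERS = (_try_base64, _try_hex, _try_url, _rot13)

def candidate_views(text, max_depth=3):
    """Return the input plus every decoding reachable within max_depth layers."""
    seen = {text}
    frontier = [text]
    for _ in range(max_depth):
        next_frontier = []
        for view in frontier:
            for decoder in DECODERS:
                decoded = decoder(view)
                if decoded is not None and decoded not in seen:
                    seen.add(decoded)
                    next_frontier.append(decoded)
        frontier = next_frontier
    return seen

def is_blocked(text, classifier, max_depth=3):
    """Block the input if any decoded view of it is flagged by the classifier."""
    return any(classifier(view) for view in candidate_views(text, max_depth))
```

The depth limit bounds the work per request. Because ROT13 always produces output, the number of views grows with depth, so a small limit is a deliberate trade-off between coverage and cost.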
Chaining

Core building block for compound attacks. Chains with T1-AT-006 (Template Injection) when encoded payloads are embedded in template structures.

Framework mapping
OWASP LLM: LLM01
MITRE ATLAS: AML.T0051.001