T2-AT-009 · MEDIUM
Code-Switching Attacks
T2 · Semantic & Linguistic Evasion
Risk score: 195
Rating: Medium
Procedures: 1
Severity
Mechanism
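The attacker rapidly alternates between languages within a single message. Most safety classifiers are trained on monolingual inputs, so code-switched text falls outside their training distribution. The model itself usually handles code-switching well, since it is common in multilingual training data, but the classifier may not, so a request the classifier fails to flag can still be understood by the model.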
Detection
- Per-token language detection
- Code-switching aware classifiers
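As a rough illustration of per-token detection, the sketch below labels each token by its dominant Unicode script and measures how often the label changes between adjacent tokens. This is an assumption-laden stand-in, not a production detector: script is only a crude proxy for language, so switches between languages that share a script (for example English and Spanish) are missed. The names `token_script`, `switch_rate`, and `is_code_switched`, and the 0.3 threshold, are hypothetical and chosen for illustration only.

```python
import unicodedata


def token_script(token):
    """Return the dominant script label of a token, or 'UNKNOWN'.

    Assumption: the first word of the Unicode character name
    (e.g. 'LATIN', 'CYRILLIC', 'CJK') is used as a script label.
    """
    counts = {}
    for ch in token:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, symbols
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0] if name else "UNKNOWN"
        counts[script] = counts.get(script, 0) + 1
    if not counts:
        return "UNKNOWN"
    return max(counts, key=counts.get)


def switch_rate(text):
    """Fraction of adjacent token pairs whose script labels differ."""
    scripts = [token_script(t) for t in text.split()]
    scripts = [s for s in scripts if s != "UNKNOWN"]
    if len(scripts) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(scripts, scripts[1:]) if a != b)
    return switches / (len(scripts) - 1)


def is_code_switched(text, threshold=0.3):
    """Flag text whose switch rate meets a hypothetical threshold."""
    return switch_rate(text) >= threshold
```

A flagged message is only a signal for routing to a code-switching aware classifier, not evidence of an attack: ordinary multilingual users code-switch too.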
Mitigation
Multilingual safety classifiers trained on code-switched data · HIGH
Framework mapping
OWASP LLM: LLM01