T6 · Core domain
Training & Feedback Poisoning
Corrupt training data and feedback
Techniques: 15
Avg risk: 234
Max risk: 270
Domain: Core
ID          Technique                          Procedures  Risk  Severity
T6-AT-003   Backdoor Insertion                 1 proc      270   CRITICAL
T6-AT-002   Dataset Contamination              10 proc     260   CRITICAL
T6-AT-001   Reward Hacking                     10 proc     250   CRITICAL
T6-AT-008   Model Update Hijacking             10 proc     245   HIGH
T6-AT-004   Fine-Tuning Attacks                10 proc     240   HIGH
T6-AT-011   Reinforcement Signal Manipulation  10 proc     240   HIGH
T6-AT-005   Synthetic Data Poisoning           10 proc     235   HIGH
T6-AT-007   Preference Learning Corruption     10 proc     230   HIGH
T6-AT-014   Self-Supervised Poisoning          10 proc     230   HIGH
T6-AT-006   Annotation Manipulation            10 proc     225   HIGH
T6-AT-013   Active Learning Exploitation       10 proc     225   HIGH
T6-AT-009   Evaluation Set Contamination       10 proc     220   HIGH
T6-AT-015   Few-Shot Learning Attacks          10 proc     220   HIGH
T6-AT-010   Knowledge Distillation Attacks     10 proc     215   HIGH
T6-AT-012   Curriculum Learning Exploitation   10 proc     210   HIGH