T6-AT-013 · HIGH
Active Learning Exploitation
T6 · Training & Feedback Poisoning
Risk score: 225
Rating: High
Procedures: 10
Severity
Mechanism
Active learning systems select which unlabeled examples to query for human annotation, typically choosing high-uncertainty or high-information-gain examples. This creates a distinct attack surface: the adversary does not need to poison the labels (T6-AT-006) or the data (T6-AT-002). Instead, the adversary targets the *selection process* that determines which data the model learns from. By biasing the query strategy, the attacker controls the model's effective training distribution without modifying any training data directly.
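To make the selection step concrete, the following is a minimal sketch of entropy-based uncertainty sampling in plain Python. The names `entropy` and `select_queries` and the toy `pool_probs` values are illustrative assumptions, not taken from any particular active learning library.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs, budget):
    """Return indices of the `budget` most uncertain pool examples.

    pool_probs: list of per-example class-probability lists (hypothetical
    model outputs for the unlabeled pool).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Toy pool: examples 1 and 3 sit closest to the decision boundary.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.90, 0.10],
    [0.50, 0.50],
]
selected = select_queries(pool_probs, budget=2)
print(selected)
```

The ranking depends only on the model's predicted probabilities, so anything that shifts those scores also shifts which examples reach annotators. That dependency is the surface this technique targets.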
Detection
- Query distribution monitoring: track the distribution of actively-queried examples and flag over-representation of specific data regions
- Committee member auditing: independently evaluate each committee member for systematic bias
- Information gain calibration: cross-validate information gain estimates against held-out data
- Budget allocation tracking: monitor annotation budget expenditure across data categories
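Query distribution monitoring and budget allocation tracking can be sketched as a comparison between each region's share of the queried examples and its share of the unlabeled pool. The region labels, the function name `overrepresented_regions`, and the `max_ratio` and `min_queries` thresholds below are assumptions for illustration. How regions are defined (clusters, metadata categories, and so on) is left to the deployment.

```python
from collections import Counter


def overrepresented_regions(pool_regions, queried_regions,
                            max_ratio=2.0, min_queries=5):
    """Flag regions whose share of queries far exceeds their share of the pool.

    pool_regions / queried_regions: one region label per example.
    Returns {region: query_share / pool_share} for flagged regions.
    """
    if not pool_regions or not queried_regions:
        return {}
    pool_counts = Counter(pool_regions)
    query_counts = Counter(queried_regions)
    n_pool, n_query = len(pool_regions), len(queried_regions)

    flagged = {}
    for region, q in query_counts.items():
        if q < min_queries:
            continue  # too few queries to judge
        pool_share = pool_counts.get(region, 0) / n_pool
        if pool_share == 0:
            # Queried examples from a region absent from the pool.
            flagged[region] = float("inf")
            continue
        ratio = (q / n_query) / pool_share
        if ratio > max_ratio:
            flagged[region] = ratio
    return flagged


# Toy data: region "c" is 10% of the pool but 60% of the queries.
pool_regions = ["a"] * 700 + ["b"] * 200 + ["c"] * 100
queried_regions = ["a"] * 10 + ["b"] * 10 + ["c"] * 30
flagged = overrepresented_regions(pool_regions, queried_regions)
print(sorted(flagged))
```

Uncertainty-driven strategies legitimately over-sample some regions, so a flag here is a prompt for review rather than proof of manipulation. The threshold would need tuning against a known-clean baseline.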
Mitigation
- Reserved annotation budget for safety-critical regions: HIGH
- Independent query strategy auditing: MEDIUM
- Multi-strategy ensemble (combine multiple selection criteria): MEDIUM
- Pool integrity verification: MEDIUM
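As one possible way to combine the reserved budget and multi-strategy ensemble mitigations, the sketch below sets aside a fraction of the annotation budget for safety-critical examples, sampled independently of any query strategy, and fills the remainder round-robin across several strategies' rankings. The function `allocate_queries`, its parameters, and the example indices are hypothetical. A real system would choose its own reservation policy and way of combining strategies.

```python
import random


def allocate_queries(rankings, safety_critical, budget,
                     reserved_fraction=0.2, seed=0):
    """Pick `budget` pool indices to send for annotation.

    rankings: list of index lists, one per query strategy, best first.
    safety_critical: set of pool indices that must keep receiving labels.
    """
    rng = random.Random(seed)

    # 1) Reserved slice: sampled uniformly, so no query strategy
    #    (compromised or not) can starve these regions of labels.
    n_reserved = min(int(budget * reserved_fraction), len(safety_critical))
    chosen = rng.sample(sorted(safety_critical), n_reserved)
    seen = set(chosen)

    # 2) Remainder: round-robin over strategies, so a single biased
    #    strategy controls at most its turn-share of the budget.
    iterators = [iter(r) for r in rankings]
    while len(chosen) < budget and iterators:
        for it in list(iterators):
            if len(chosen) >= budget:
                break
            for idx in it:
                if idx not in seen:
                    chosen.append(idx)
                    seen.add(idx)
                    break
            else:
                iterators.remove(it)  # this strategy has nothing left
    return chosen


rankings = [[0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0]]  # two toy strategies
safety_critical = {6, 7, 8, 9}
chosen = allocate_queries(rankings, safety_critical, budget=5,
                          reserved_fraction=0.4)
print(chosen)
```

In this toy run, two of the five slots go to safety-critical examples regardless of what either strategy ranks highly, and the remaining three alternate between the two rankings.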
Chaining
Active learning exploitation chains to T6-AT-006 (Annotation Manipulation): once examples are selected for annotation, the annotation itself can be further poisoned. Oracle manipulation (T6-AP-013D) is a specific instance of T6-AT-006.