T6-AT-013 · HIGH

Active Learning Exploitation

T6 · Training & Feedback Poisoning
Risk score: 225
Rating: High
Procedures: 10
Mechanism

Active learning systems select which unlabeled examples to query for human annotation, typically choosing high-uncertainty or high-information-gain examples. This creates a distinct attack surface: the adversary does not need to poison the labels (T6-AT-006) or the data (T6-AT-002). Instead, the adversary targets the *selection process* that determines which data the model learns from. By biasing the query strategy, the attacker controls the model's effective training distribution without modifying any training data directly.
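The mechanism can be illustrated with a toy example. The sketch below assumes a fixed one-dimensional model and plain uncertainty sampling; `predict_proba`, `select_queries`, the pool contents, and the budget are all hypothetical names and values chosen for illustration, not part of any specific active learning library. It shows how a handful of pool entries placed near the decision boundary can absorb most of a small annotation budget, even though no label and no existing example is altered.

```python
import math

def predict_proba(x):
    """Toy 1-D model, P(y=1 | x) = sigmoid(2x); a hypothetical stand-in for the learner."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))

def uncertainty(x):
    """Binary entropy of the prediction; largest where p is close to 0.5."""
    p = predict_proba(x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def select_queries(pool, budget):
    """Uncertainty sampling: return the `budget` most uncertain pool items."""
    return sorted(pool, key=lambda item: uncertainty(item["x"]), reverse=True)[:budget]

def injected_share(queries):
    """Fraction of selected queries that came from the injected entries."""
    return sum(q["source"] == "injected" for q in queries) / len(queries)

# Clean pool spread evenly over [-3, 3] (illustrative data).
clean_pool = [{"x": x / 10.0, "source": "clean"} for x in range(-30, 31, 3)]
# Entries an adversary adds to the pool, clustered near the decision boundary.
injected = [{"x": 0.01 * i, "source": "injected"} for i in range(1, 9)]

baseline = select_queries(clean_pool, budget=5)
attacked = select_queries(clean_pool + injected, budget=5)

print("injected share, clean pool:   ", injected_share(baseline))
print("injected share, poisoned pool:", injected_share(attacked))
```

In this toy setting the selection rule itself is untouched; the adversary only shapes what the rule sees. Real query strategies and pools are far larger, so the effect size here should not be read as representative.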

Detection
  • Query distribution monitoring: track the distribution of actively queried examples and flag over-representation of specific data regions
  • Committee member auditing: independently evaluate each committee member for systematic bias
  • Information gain calibration: cross-validate information gain estimates against held-out data
  • Budget allocation tracking: monitor annotation budget expenditure across data categories
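As a minimal sketch of query distribution monitoring, the function below compares each region's share of queried examples with its share of the unlabeled pool and flags regions that are heavily over-represented. The function name `flag_overrepresented`, the region labels, and the `max_ratio` and `min_queries` thresholds are assumptions made for illustration. Because active learning over-samples uncertain regions by design, a practical threshold would probably need to be calibrated against a trusted baseline run rather than fixed in advance.

```python
from collections import Counter

def flag_overrepresented(pool_regions, queried_regions, max_ratio=2.0, min_queries=5):
    """Return {region: ratio} for regions whose share of queries exceeds
    `max_ratio` times their share of the pool.

    pool_regions / queried_regions: one region label per example.
    Regions with fewer than `min_queries` queries are ignored to limit noise.
    """
    if not queried_regions:
        return {}
    pool_counts = Counter(pool_regions)
    query_counts = Counter(queried_regions)
    n_pool, n_query = len(pool_regions), len(queried_regions)

    flagged = {}
    for region, n in query_counts.items():
        if n < min_queries:
            continue
        pool_share = pool_counts.get(region, 0) / n_pool if n_pool else 0.0
        query_share = n / n_query
        # A queried region missing from the pool snapshot is flagged outright.
        ratio = float("inf") if pool_share == 0 else query_share / pool_share
        if ratio > max_ratio:
            flagged[region] = ratio
    return flagged

# Illustrative data: region "c" is 10% of the pool but 60% of the queries.
pool = ["a"] * 50 + ["b"] * 40 + ["c"] * 10
queried = ["a"] * 4 + ["b"] * 4 + ["c"] * 12
print(flag_overrepresented(pool, queried))
```

The same share comparison could serve budget allocation tracking if the region labels are replaced with data categories and the counts with annotation spend.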
Mitigation
Reserved annotation budget for safety-critical regions · HIGH
Independent query strategy auditing · MEDIUM
Multi-strategy ensemble (combine multiple selection criteria) · MEDIUM
Pool integrity verification · MEDIUM
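One way to picture the reserved-budget mitigation is sketched below. It assumes that safety-critical examples can be identified by a predicate (`is_critical`, hypothetical) and that the reserved portion is drawn uniformly at random, independently of the query strategy's scores, so that a biased strategy cannot starve those regions. The function name, the 20% reserve fraction, and the fixed seed are illustrative choices rather than recommended values.

```python
import random

def select_with_reserve(pool, scores, budget, is_critical, reserve_fraction=0.2, seed=0):
    """Split an annotation budget into a reserved part and a strategy-driven part.

    pool:   list of examples
    scores: parallel list of query-strategy scores (higher = more informative)
    Returns (reserved_indices, strategy_indices) into `pool`.
    """
    rng = random.Random(seed)
    critical_idx = [i for i, item in enumerate(pool) if is_critical(item)]

    # Reserved slots are sampled at random from critical examples only,
    # ignoring the (possibly manipulated) strategy scores.
    n_reserved = min(int(budget * reserve_fraction), len(critical_idx))
    reserved = rng.sample(critical_idx, n_reserved)

    # The remaining slots follow the query strategy as usual.
    taken = set(reserved)
    ranked = sorted(
        (i for i in range(len(pool)) if i not in taken),
        key=lambda i: scores[i],
        reverse=True,
    )
    return reserved, ranked[: budget - n_reserved]

# Illustrative data: the first five examples are critical, and the strategy
# (here assumed to be biased) scores them lowest.
pool = [{"id": i, "critical": i < 5} for i in range(20)]
scores = [0.0] * 5 + [1.0 + i for i in range(15)]

reserved, by_strategy = select_with_reserve(
    pool, scores, budget=10, is_critical=lambda item: item["critical"]
)
print("reserved:", sorted(reserved))
print("strategy:", sorted(by_strategy))
```

A multi-strategy ensemble could be layered onto the second stage by ranking on a combination of several scores instead of a single `scores` list.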
Chaining

Active learning exploitation chains to T6-AT-006 (Annotation Manipulation): once examples are selected for annotation, the annotation itself can be further poisoned. Oracle manipulation (T6-AP-013D) is a specific instance of T6-AT-006.