T6-AT-013 (HIGH)

Active Learning Exploitation

Risk score: 225

Rating: High

Procedures: 10

Mechanism

Active learning systems select which unlabeled examples to query for human annotation, typically choosing high-uncertainty or high-information-gain examples. This creates a distinct attack surface: the adversary does not need to poison the labels (T6-AT-006) or the data (T6-AT-002). It is enough to bias the *selection process* that determines which data the model learns from. By biasing the query strategy, the attacker controls the model's effective training distribution without modifying any training data directly.
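The effect can be illustrated with a minimal sketch, assuming entropy-based uncertainty sampling over a binary classifier. The pool contents, region names, and the additive `bonus` used to model a tampered query strategy are all hypothetical, chosen only to show that the pool stays untouched while the selected set shifts.

```python
import math

def entropy(p):
    """Binary entropy (in bits) of a predicted positive-class probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def honest_score(item):
    """Plain uncertainty sampling: score an item by its predictive entropy."""
    return entropy(item[2])

def make_biased_score(target_region, bonus):
    """Hypothetical tampered strategy: inflate the score of one region."""
    def score(item):
        return entropy(item[2]) + (bonus if item[1] == target_region else 0.0)
    return score

def select_queries(pool, budget, score):
    """Return the `budget` highest-scoring (id, region, probability) items."""
    return sorted(pool, key=score, reverse=True)[:budget]

def region_counts(selected):
    """Count how many selected items fall in each region."""
    counts = {}
    for _, region, _ in selected:
        counts[region] = counts.get(region, 0) + 1
    return counts

# Hypothetical unlabeled pool; it is identical in both runs below.
pool = [
    ("a1", "region_a", 0.55), ("a2", "region_a", 0.90), ("a3", "region_a", 0.40),
    ("b1", "region_b", 0.52), ("b2", "region_b", 0.10), ("b3", "region_b", 0.95),
]

honest = region_counts(select_queries(pool, 3, honest_score))
biased = region_counts(select_queries(pool, 3, make_biased_score("region_b", 1.0)))
```

With the honest score the three queries are spread over both regions; with the biased score every query lands in `region_b`, even though no example or label in the pool was changed.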

Detection

Query distribution monitoring: track the distribution of actively-queried examples and flag over-representation of specific data regions
Committee member auditing: where the query strategy is query-by-committee, independently evaluate each committee member for systematic bias
Information gain calibration: cross-validate information gain estimates against held-out data
Budget allocation tracking: monitor annotation budget expenditure across data categories
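Query distribution monitoring might be sketched as follows: compare each region's share of the queried examples with its share of the unlabeled pool and flag large ratios. The function name `flag_overrepresented`, the region labels, and the `max_ratio` threshold are assumptions for illustration. Active learning concentrates queries by design, so any real threshold would need tuning against a known-good baseline.

```python
from collections import Counter

def flag_overrepresented(pool_regions, queried_regions, max_ratio=2.0):
    """Return {region: ratio} for regions whose query share exceeds
    `max_ratio` times their share of the unlabeled pool."""
    if not pool_regions or not queried_regions:
        return {}
    pool_counts = Counter(pool_regions)
    query_counts = Counter(queried_regions)
    flagged = {}
    for region, queried in query_counts.items():
        query_share = queried / len(queried_regions)
        pool_share = pool_counts.get(region, 0) / len(pool_regions)
        # A region that is queried but absent from the pool is always flagged.
        ratio = float("inf") if pool_share == 0 else query_share / pool_share
        if ratio > max_ratio:
            flagged[region] = ratio
    return flagged

# Hypothetical data: region "c" is 10% of the pool but 60% of the queries.
pool_regions = ["a"] * 50 + ["b"] * 40 + ["c"] * 10
queried_regions = ["a"] * 2 + ["b"] * 2 + ["c"] * 6

flagged = flag_overrepresented(pool_regions, queried_regions)
```

The same comparison applies to budget allocation tracking if the inputs are annotation spend per data category rather than example counts.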

Mitigation

Reserved annotation budget for safety-critical regions (HIGH)

Independent query strategy auditing (MEDIUM)

Multi-strategy ensemble (combine multiple selection criteria) (MEDIUM)

Pool integrity verification (MEDIUM)
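A reserved annotation budget could look roughly like the sketch below. It assumes that a fixed fraction of each round's budget is drawn uniformly at random from safety-critical regions, independently of the query strategy, so a biased strategy cannot starve those regions. The names `select_with_reserve`, `reserve_fraction`, and `region_s` are hypothetical.

```python
import random

def select_with_reserve(pool, budget, score, critical_regions,
                        reserve_fraction=0.25, seed=0):
    """Pick `budget` items from `pool` (tuples of (example_id, region)).

    A reserved share is sampled at random from `critical_regions`,
    bypassing `score`; the rest is filled by the highest-scoring items.
    """
    rng = random.Random(seed)
    reserve = int(budget * reserve_fraction)
    critical = [item for item in pool if item[1] in critical_regions]
    reserved = rng.sample(critical, min(reserve, len(critical)))
    reserved_ids = {item[0] for item in reserved}
    # Fill the remaining budget with the (possibly biased) query strategy.
    remaining = [item for item in pool if item[0] not in reserved_ids]
    ranked = sorted(remaining, key=score, reverse=True)
    return reserved + ranked[: budget - len(reserved)]

# Hypothetical pool and a strategy that only ever favors region_b.
pool = [("s1", "region_s"), ("s2", "region_s"), ("s3", "region_s"),
        ("b1", "region_b"), ("b2", "region_b"), ("b3", "region_b"),
        ("b4", "region_b"), ("b5", "region_b")]

def biased_score(item):
    return 1.0 if item[1] == "region_b" else 0.0

selected = select_with_reserve(pool, budget=4, score=biased_score,
                               critical_regions={"region_s"},
                               reserve_fraction=0.5)
```

Even with a strategy that scores `region_s` at zero, half of the four queries still come from `region_s`. A multi-strategy ensemble follows the same pattern, with the budget split across several scoring functions instead of a random reserve.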

Chaining

Active learning exploitation chains to T6-AT-006 (Annotation Manipulation): once examples are selected for annotation, the annotations themselves can be poisoned as well. Oracle manipulation (T6-AP-013D) is a specific instance of T6-AT-006.
