Where should we apply active learning?
- We have very little data or a huge amount of data.
- Annotating the unlabeled dataset costs human effort, time and money.
- We have access to limited computing power.
Example
A certain planet has fruits of different sizes (1-5). Some of them are poisonous and others are not, and the only criterion that decides whether a fruit is poisonous is its size. Our task is to train a classifier that predicts whether a given fruit is poisonous. The only information we have is that fruits of size 1 are not poisonous, fruits of size 5 are poisonous, and beyond a certain size all fruits are poisonous.
The first approach is to check every fruit size one by one (linear search), which consumes time and resources.
The second approach is to apply binary search and find the transition point (the decision boundary). This approach needs fewer labeled examples and gives the same result as linear search.
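The binary-search idea can be sketched in a few lines. This is only an illustration: the `is_poisonous` oracle and its threshold of 3 are made-up stand-ins for tasting a fruit, and `find_transition` is a hypothetical helper name.

```python
def is_poisonous(size, threshold=3):
    """Hypothetical oracle: we assume fruits of size >= threshold are poisonous."""
    return size >= threshold


def find_transition(lo, hi, oracle):
    """Return (smallest poisonous size, number of oracle queries).

    Assumes oracle(lo) is False, oracle(hi) is True, and the label
    switches from False to True exactly once between lo and hi.
    """
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if oracle(mid):
            hi = mid  # the transition is at mid or below
        else:
            lo = mid  # the transition is above mid
    return hi, queries


boundary, n_queries = find_transition(1, 5, is_poisonous)
print(boundary, n_queries)
```

With sizes 1 to 5, the search labels at most two fruits, whereas checking sizes 2, 3 and 4 one by one may need three labels. The gap grows quickly as the range of sizes gets larger.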
General Algorithm:
1. Train the classifier on the initial training dataset.
2. Calculate the accuracy.
3. While (accuracy < desired accuracy):
4.     Select the most valuable data points (in general, points close to the decision boundary).
5.     Query those data point(s) from the human oracle (ask for a label).
6.     Add those data point(s) to the initial training dataset.
7.     Re-train the model.
8.     Re-calculate the accuracy.
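A minimal sketch of this loop, reusing the fruit example with a wider range of sizes. Everything concrete here is an assumption made for illustration: sizes run from 1 to 100, the hidden rule is "size >= 37 is poisonous", the classifier is a simple threshold, and the function names are hypothetical.

```python
def oracle(size, true_threshold=37):
    """Hypothetical human oracle (assumed rule: size >= 37 is poisonous)."""
    return size >= true_threshold


def train(labeled):
    """Place the boundary halfway between the largest safe and smallest poisonous size."""
    max_safe = max(s for s, y in labeled if not y)
    min_poisonous = min(s for s, y in labeled if y)
    return (max_safe + min_poisonous) / 2


def accuracy(boundary, test_set):
    """Fraction of test fruits whose predicted label matches the true label."""
    return sum((s >= boundary) == y for s, y in test_set) / len(test_set)


def active_learning(labeled, pool, test_set, desired_accuracy=1.0):
    """Run the loop; note that `labeled` and `pool` are modified in place."""
    boundary = train(labeled)                                  # step 1
    acc = accuracy(boundary, test_set)                         # step 2
    queries = 0
    while acc < desired_accuracy and pool:                     # step 3
        point = min(pool, key=lambda s: abs(s - boundary))     # step 4
        pool.remove(point)
        label = oracle(point)                                  # step 5
        labeled.append((point, label))                         # step 6
        queries += 1
        boundary = train(labeled)                              # step 7
        acc = accuracy(boundary, test_set)                     # step 8
    return boundary, acc, queries


test_sizes = list(range(3, 100, 4))                # held-out sizes 3, 7, 11, ...
test_set = [(s, oracle(s)) for s in test_sizes]
labeled = [(1, oracle(1)), (100, oracle(100))]     # the two labels known up front
pool = [s for s in range(2, 100) if s not in test_sizes]

boundary, acc, queries = active_learning(labeled, pool, test_set)
print(boundary, acc, queries)
```

The extra `and pool` condition is a design choice not present in the steps above: it stops the loop when there is nothing left to query, even if the desired accuracy was never reached.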
Approaches to the active learning algorithm
1. Query Synthesis
- Usually this approach is used when we have a very small dataset.
- In this approach, we select any uncertain point from the given n-dimensional space. We do not care whether that point actually exists in the dataset.
- Sometimes it is difficult for a human oracle to annotate the queried data point.
Here, query synthesis can select any (valuable) point from the 3 × 3 2-D plane.
These are some queries generated by the query synthesis approach for a model trained for handwriting recognition. It is very difficult to annotate these queries.
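One simple way to picture query synthesis on the 3 × 3 plane is to generate candidate points from scratch and keep the one the current model is least sure about. The sketch below assumes a linear classifier with made-up weights; `synthesize_query` and `decision_value` are hypothetical names, and random candidate generation is just one possible strategy, not a standard recipe.

```python
import random


def decision_value(point, weights, bias):
    """Signed score of a linear classifier; 0 means exactly on the boundary."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def synthesize_query(weights, bias, low=0.0, high=3.0, n_candidates=1000, seed=0):
    """Generate points anywhere in the square [low, high] x [low, high] and
    return the one whose score is closest to 0 (the most uncertain one).
    The returned point does not have to exist in any dataset."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(low, high), rng.uniform(low, high))
                  for _ in range(n_candidates)]
    return min(candidates, key=lambda p: abs(decision_value(p, weights, bias)))


# Assumed current model: the decision boundary is the line x + y = 3.
query = synthesize_query(weights=(1.0, 1.0), bias=-3.0)
print(query)
```

Because the point is invented rather than drawn from real data, it may correspond to nothing a human can recognize, which is exactly the annotation problem described above.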
2. Sampling
- This approach is used when we have a large dataset.
- In this approach, we split our dataset into three parts: the training set, the test set and the unlabeled pool (ironically), at 5%, 25% and 70% of the data respectively.
- The training set is our initial dataset and is used for the initial training of our model.
- This approach selects valuable/uncertain points from the unlabeled pool, which ensures that every query can be recognized by the human oracle.
The black dots represent the unlabeled pool, and the red and green dots together represent the training dataset.
Here is an active learning model that selects valuable points based on the probability that a point belongs to a class.
Output:
Accuracy by active model: 80.7
Accuracy by random sampling: 79.5
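A minimal sketch of such a model follows. It is not the code that produced the figures above, so its numbers will differ. The assumptions are: synthetic 1-D data, a small hand-written logistic regression instead of a library, and an uncertainty band of 0.47 to 0.53 around probability 0.5 (an arbitrary choice). The hidden labels of the pool play the role of the human oracle.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, lr=0.1, epochs=500):
    """Fit p(y=1 | x) = sigmoid(w * x + b) with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in data]
        w -= lr * sum(e * x for e, (x, _) in zip(errors, data)) / n
        b -= lr * sum(errors) / n
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(w * x + b)


def accuracy(model, data):
    hits = sum((predict_proba(model, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


# Synthetic, noisy data (an assumption for illustration only).
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(200)]
data = [(x, int(x + rng.gauss(0, 1) > 0)) for x in xs]

# 5% training set, 25% test set, 70% "unlabeled" pool.
train, test, pool = data[:10], data[10:60], data[60:]
model = fit_logistic(train)

# Select the pool points the model is unsure about and "ask the oracle",
# which here simply means revealing the label stored with each point.
uncertain = [(x, y) for x, y in pool if 0.47 <= predict_proba(model, x) <= 0.53]
active_model = fit_logistic(train + uncertain)

# Baseline: add the same number of randomly chosen pool points instead.
random_extra = rng.sample(pool, len(uncertain))
random_model = fit_logistic(train + random_extra)

print("Accuracy by active model:", accuracy(active_model, test))
print("Accuracy by random sampling:", accuracy(random_model, test))
```

On a small synthetic set like this, the band may capture only a few points or none at all, so the two accuracies can end up very close. Widening the band or repeating the select-and-retrain step are the obvious knobs to try.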
There are several models for choosing the most valuable points. Some of them are:
- Query by committee
- Query synthesis and nearest neighbor search
- Large margin heuristic
- Posterior probability-based heuristic
Reference: Active Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, by Burr S.