Electricity transmission operators are responsible for the safe and reliable operation of overhead line pylons. Individual routes can comprise thousands of pylons with a heterogeneous mixture of components. Images of pylons collected during routine maintenance and inspection activity can be analysed automatically using computer vision techniques; however building the image classification algorithms requires labelled data to serve as a supervisory signal for training. For a large dataset, manual labelling to build the training dataset can quickly become onerous. This report describes an uncertainty-guided approach for active learning. Given a classifier trained on a subset of images, our method identifies unlabelled images for which the deep learning method is most uncertain. These uncertain images can then be labelled by a human labeller and fed into the next round of training. We demonstrate that this strategy is effective in electrical pylon asset management, with important implications for efficient human-in-the-loop training.
Our hypothesis is that images for which a classifier is uncertain will be useful in the next round of training, and that asking a human labeller to annotate these uncertain images is an effective strategy for minimising human labelling effort. To explore this hypothesis on the electrical pylon dataset, we applied a Deep Active Learning strategy to two tasks: construction classification and configuration classification. Taking construction as an example, the pylon construction dataset consists of five categories and 5440 images, denoted S = {I_m}, where m = 1, …, 5440 is the image index. We denote the annotated subset of S as S_L = {I_m^n}, where n ∈ {1, …, 5} is the image's label among the five categories, and the remaining unlabelled data as S_U.
We divide the Deep Active Learning process into 10 stages, indexed k = 1, …, 10. In each stage, we train ten models and use the Deep Ensemble [3] method to estimate the uncertainty of the classification results. Specifically, we collect the probability outputs of the 10 models on the same unlabelled image, and take the mean and variance of these outputs as the final prediction and the uncertainty estimate, respectively.
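As a concrete illustration, the following is a minimal sketch of this ensemble prediction step, assuming PyTorch classifiers that output logits. The choice to sum the per-class variances into a single scalar uncertainty score is our assumption; the report does not specify how the variance is reduced to one number.

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, image):
    """Return the ensemble-mean class probabilities (the prediction)
    and the variance across ensemble members (the uncertainty)."""
    probs = []
    with torch.no_grad():
        for model in models:
            model.eval()
            logits = model(image.unsqueeze(0))       # shape: (1, num_classes)
            probs.append(F.softmax(logits, dim=1))
    probs = torch.cat(probs, dim=0)                  # shape: (n_models, num_classes)
    mean_probs = probs.mean(dim=0)                   # final prediction output
    uncertainty = probs.var(dim=0).sum()             # scalar score: summed per-class variance
    return mean_probs, uncertainty
```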

Fig. 1 shows the pipeline of our uncertainty-guided active learning strategy. The training data used at stage k is denoted S_L^k. When k = 1, we randomly select 10% of the data from S_U^1 for annotation, which forms S_L^1; the remaining data forms S_U^2. We then train 10 models on S_L^1. Taking the uncertainty query strategy as an example, for 1 < k ≤ 10 we use the models trained at stage k − 1 to select the 10% of S_U^k with the highest uncertainty for labelling and add these images to S_L^{k−1}, yielding S_L^k and S_U^{k+1}. We retrain the stage-(k − 1) models on S_L^k to obtain the stage-k models, and the prediction results of the k-th stage are produced by these models.
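The stage loop can be sketched as follows, reusing the ensemble_predict function above. Here labeled and unlabeled are dictionaries mapping image indices to images, and label_fn (a stand-in for the human annotator) and retrain_fn are hypothetical hooks; the report does not fix these interfaces.

```python
def query_most_uncertain(models, unlabeled, n_query):
    """Score every unlabelled image with the stage k-1 ensemble and
    return the indices of the n_query most uncertain images."""
    scores = [(ensemble_predict(models, img)[1].item(), idx)
              for idx, img in unlabeled.items()]
    scores.sort(reverse=True)                        # highest uncertainty first
    return [idx for _, idx in scores[:n_query]]

def run_stage(models, labeled, unlabeled, n_query, label_fn, retrain_fn):
    """One stage k > 1: query, annotate, grow S_L, shrink S_U, retrain."""
    for idx in query_most_uncertain(models, unlabeled, n_query):
        labeled[idx] = (unlabeled[idx], label_fn(idx))   # human supplies the label
        del unlabeled[idx]
    return retrain_fn(models, labeled)               # models for stage k
```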
We compare this uncertainty-guided approach with two other strategies for selecting samples to label in the next round of training. The first is random selection. The second ranks the images in S_U^k by their average cosine similarity to the images in S_L^{k−1}, averaged over the 10 models.
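A sketch of the similarity-based baseline is below. We assume per-model feature embeddings (e.g. penultimate-layer activations) are extracted beforehand; which layer is used, and whether the most- or least-similar images are then queried, is not specified in the report.

```python
import torch
import torch.nn.functional as F

def average_cosine_similarity(unlabeled_feats, labeled_feats):
    """unlabeled_feats and labeled_feats are lists (one entry per ensemble
    member) of feature matrices of shapes (n_unlabeled, d) and (n_labeled, d).
    Returns one score per unlabelled image: its cosine similarity to the
    labelled set, averaged over labelled images and over the 10 models."""
    per_model = []
    for u, l in zip(unlabeled_feats, labeled_feats):
        u = F.normalize(u, dim=1)
        l = F.normalize(l, dim=1)
        sim = u @ l.t()                        # (n_unlabeled, n_labeled) cosine similarities
        per_model.append(sim.mean(dim=1))      # average over the labelled set
    return torch.stack(per_model).mean(dim=0)  # average over ensemble members
```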
By applying the aforementioned Deep Active Learning strategies, we obtained Fig. 2 for construction and Fig. 3 for configuration. Note that most of these curves stop at 60% of the full training set, because the uncertainty-guided approach has already reached the upper-bound performance at that point, which already shows that it is more efficient than the other methods.


The upper-bound lines represent the performance of the 10-model ensemble trained for 200 epochs on all of the images in the full training set S. In both plots, only 40% of the training set is needed to reach the upper-bound performance when uncertainty is used as the query criterion. This demonstrates that uncertainty identifies more valuable images than the random or cosine similarity strategies. In addition, comparing the orange lines in Fig. 2 and Fig. 3 shows that training for more epochs improves performance for construction but not for configuration. This is because the construction task uses a long-tailed dataset containing a rare category, "composite_pole", and the model only learns the features of this rare class when trained for more epochs. As Fig. 4 shows, models trained on 60% of the training set for 50 epochs still cannot correctly predict the rare category "composite_pole".

However, if the models are trained for 200 epochs, only 40% of the training set is needed for them to predict the rare category "composite_pole" correctly. Hence, if the dataset contains rare categories, the model needs to be trained for more epochs to perform well on them. Regardless of category, training for more epochs also makes the model's performance more stable. In Fig. 3, however, we observe that all of the active learning lines are better than the upper-bound line. We retested the upper bound and completed testing the orange line for all 10 stages, and this observation persists. We speculate that the active learning strategy may prevent the models from falling into local optima. Going from 10% to 20% of the data in Fig. 2 and Fig. 3, the classifiers improve substantially. To visualise some of the most uncertain images, Fig. 5 and Fig. 6 show high-uncertainty examples for construction and configuration, respectively, while Fig. 7 shows images with relatively low uncertainty for both the configuration and construction classes.



Conclusion
This report investigates an application of an uncertainty-guided active learning strategy to a pylon dataset. Specifically, we divide the training process into ten stages, each of which uses a Deep Ensemble to estimate the uncertainty of the models' predictions for each sample and expands the training data by selecting the most informative samples based on that uncertainty. The experimental results show that only 40% of the training set is needed to match the performance of training on the whole dataset when the uncertainty query strategy is used, and that this strategy outperforms both random selection and the average cosine similarity query strategy. We also speculate that training for more epochs enables the model to learn the features of rare samples in long-tailed data, and that active learning may help prevent the model from falling into local optima. More experiments are needed to verify these hypotheses.
References
[1] P. Ren, Y. Xiao, X. Chang, et al., "A survey of deep active learning," ACM Computing Surveys (CSUR), vol. 54, no. 9, pp. 1-40, 2021.
[2] K. Wang, D. Zhang, Y. Li, et al., "Cost-effective active learning for deep image classification," IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 12, pp. 2591-2600, 2016.
[3] F. K. Gustafsson, M. Danelljan, and T. B. Schön, "Evaluating scalable Bayesian deep learning methods for robust computer vision," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 318-319.