A Novel Method of Small Object Detection in UAV Remote Sensing Images Based on Feature Alignment of Candidate Regions

  1. Wang, Jinkang
  2. Shao, Faming
  3. He, Xiaohui
  4. Lu, Guanlin
  5. González Aguilera, Diego 1

  1. Universidad de Salamanca, Salamanca, Spain (ROR: https://ror.org/02f40zc51)

Journal: Drones

ISSN: 2504-446X

Year of publication: 2022

Volume: 6

Issue: 10

Pages: 292

Type: Article

DOI: 10.3390/DRONES6100292 (open access)

Abstract

To address the low detection accuracy of small objects in UAV optical remote sensing images caused by low contrast, dense distribution, and weak features, this paper proposes a small object detection method for remote sensing images based on feature alignment of candidate regions. First, AFA-FPN (Attention-based Feature Alignment FPN) defines the correspondence between feature maps, corrects the misregistration of features between adjacent levels, and improves the recognition of small objects by aligning and fusing shallow spatial features with deep semantic features. Second, the PHDA (Polarization Hybrid Domain Attention) module captures local regions containing small-object features through parallel channel-domain and spatial-domain attention, assigning larger weights to these regions to suppress background noise. Then, a rotation branch uses RRoI to rotate the horizontal boxes produced by the RPN, which avoids missed detections of densely distributed, arbitrarily oriented small objects and prevents feature mismatch between objects and candidate regions. Finally, the loss function is improved to better reflect the difference between predictions and ground truth. Experiments on a self-built dataset show that the proposed method reaches an mAP of 82.04% and a detection speed of 24.3 FPS, significantly outperforming state-of-the-art methods; ablation experiments verify the contribution of each module.
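The abstract does not give the exact PHDA formulation, but the general pattern it describes (a channel-domain branch and a spatial-domain branch computed in parallel, with their reweighted outputs fused) can be sketched in plain NumPy. The function name, pooling choices, and fusion by summation below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def parallel_hybrid_attention(feat, eps=1e-6):
    """Toy sketch of parallel channel-domain + spatial-domain attention.

    feat: feature map of shape (C, H, W).
    Channel branch: global-average-pooled descriptor -> softmax weight per channel.
    Spatial branch: channel-mean map -> sigmoid weight per pixel.
    The two attended maps are summed, mimicking parallel fusion.
    """
    # Channel-domain attention: one weight per channel.
    chan_desc = feat.mean(axis=(1, 2))                 # (C,)
    chan_w = np.exp(chan_desc - chan_desc.max())
    chan_w = chan_w / (chan_w.sum() + eps)             # softmax over channels
    chan_out = feat * chan_w[:, None, None]

    # Spatial-domain attention: one weight per location.
    spat_desc = feat.mean(axis=0)                      # (H, W)
    spat_w = 1.0 / (1.0 + np.exp(-spat_desc))          # sigmoid gate
    spat_out = feat * spat_w[None, :, :]

    # Parallel fusion: sum the two reweighted maps so that locations
    # favored by either branch keep a large response.
    return chan_out + spat_out
```

In a real detector both branches would be learned (e.g. small convolutions before the softmax/sigmoid); here the weights are derived directly from the input purely to show the data flow.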
