A Novel Multi-Scale Transformer for Object Detection in Aerial Scenes

  1. Lu, Guanlin
  2. He, Xiaohui
  3. Wang, Qiang
  4. Shao, Faming
  5. Wang, Hongwei
  6. Wang, Jinkang
  7. González Aguilera, Diego 1
  1. 1 Universidad de Salamanca, Salamanca, Spain (ROR https://ror.org/02f40zc51)

Journal:
Drones

ISSN: 2504-446X

Year of publication: 2022

Volume: 6

Issue: 8

Article number: 188

Type: Article

DOI: 10.3390/DRONES6080188 (open access)

Abstract

Deep learning has advanced research on object detection in aerial scenes. However, most existing networks are limited by the large variation in object scale and by confusion between category features. To overcome these limitations, this paper proposes a novel aerial object detection framework called DFCformer. DFCformer consists of three main parts. The backbone network, DMViT, introduces deformation patch embedding and multi-scale adaptive self-attention to capture sufficient object features. FRGC guides feature interaction layer by layer to break the barriers between feature layers and to improve the discrimination and processing of multi-scale critical features. CAIM adopts an attention mechanism to fuse multi-scale features, performing hierarchical reasoning on the relationships between levels and fully utilizing the complementary information in multi-scale features. In extensive experiments on the FAIR1M dataset, DFCformer achieves the highest scores and shows stronger scene adaptability.
