A Novel Multi-Scale Transformer for Object Detection in Aerial Scenes
- Lu, Guanlin
- He, Xiaohui
- Wang, Qiang
- Shao, Faming
- Wang, Hongwei
- Wang, Jinkang
- González Aguilera, Diego (Universidad de Salamanca)
ISSN: 2504-446X
Year of publication: 2022
Volume: 6
Issue: 8
Pages: 188
Type: Article
Journal: Drones
Abstract
Deep learning has advanced research on object detection in aerial scenes. However, most existing networks are limited by large variations in object scale and by confusion between category features. To overcome these limitations, this paper proposes a novel aerial object detection framework called DFCformer. DFCformer consists of three main parts. The backbone network, DMViT, introduces deformation patch embedding and multi-scale adaptive self-attention to capture sufficient object features. FRGC guides feature interaction layer by layer, breaking the barriers between feature layers and improving the discrimination and processing of multi-scale critical features. CAIM adopts an attention mechanism to fuse multi-scale features, performing hierarchical reasoning on the relationships between different levels and fully exploiting the complementary information in multi-scale features. Extensive experiments were conducted on the FAIR1M dataset, in which DFCformer achieves the highest scores and shows stronger scene adaptability.
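The attention-based fusion of multi-scale features mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the CAIM module of DFCformer, whose details are not given in this record; it only shows the general idea of weighting feature levels by a softmax attention score. The function names, the `query` vector (standing in for a learned parameter), and the nearest-neighbour upsampling are all assumptions made for illustration.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def softmax(z, axis=0):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_multiscale(features, query):
    """Fuse feature maps from several scales with per-pixel attention weights.

    features: list of (C, H_i, W_i) arrays; each H_i must divide the largest H,
              and every map must share the same aspect ratio (assumption).
    query:    (C,) vector, a hypothetical stand-in for a learned parameter.
    Returns the fused (C, H, W) map and the (L, H, W) attention weights.
    """
    target_h = max(f.shape[1] for f in features)
    # Bring every level to the finest resolution: shape (L, C, H, W).
    aligned = np.stack(
        [upsample_nearest(f, target_h // f.shape[1]) for f in features]
    )
    channels = aligned.shape[1]
    # Scaled dot product between the query and each pixel's channel vector.
    scores = np.einsum("c,lchw->lhw", query, aligned) / np.sqrt(channels)
    # Normalise across levels so the weights at each pixel sum to 1.
    weights = softmax(scores, axis=0)
    fused = (weights[:, None] * aligned).sum(axis=0)
    return fused, weights


rng = np.random.default_rng(0)
feats = [rng.normal(size=(8, s, s)) for s in (16, 8, 4)]
fused, weights = fuse_multiscale(feats, rng.normal(size=8))
print(fused.shape, weights.shape)
```

In this sketch each output pixel is a convex combination of the corresponding pixels across levels, so a level contributes more wherever its features align with the query.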
Funding information
Funders
- National Natural Science Foundation of China: 61671470
- Key Research and Development Program of China: 2016YFC0802900