China Safety Science Journal, 2025, Vol. 35, Issue (5): 195-203. doi: 10.16265/j.cnki.issn1003-3033.2025.05.1654

• Safety Engineering Technology •

Multimodal fusion-based obstacle detection in low-visibility open-pit mines

YANG Fengzhan1,2, GU Qinghua1,2, LI Shaobo1,2, YANG Jianchun3

  1. School of Resources Engineering, Xi'an University of Architecture and Technology, Xi'an Shaanxi 710055, China
    2. Xi'an Key Laboratory of Intelligent Industry Perception Computing and Decision Making, Xi'an University of Architecture and Technology, Xi'an Shaanxi 710055, China
    3. Hami Hexiang Industry and Trade Co., Ltd., Hami Xinjiang 839200, China
  • Received: 2024-12-10; Revised: 2025-02-13; Published: 2025-05-28
  • About the authors:

    YANG Fengzhan (born 2001), male, from Zibo, Shandong; master's degree candidate; research focus: perception and early warning for autonomous mining haulage. E-mail:

    GU Qinghua, professor

    YANG Jianchun, engineer

  • Supported by:
    National Natural Science Foundation of China (52374135); National Natural Science Foundation of China (52074205); Shaanxi Provincial Innovation Team for Theory and Technology of Intelligent Mining in Metal Mines (2023-CX-TD-12)

Multimodal fusion-based obstacle detection in low-visibility open-pit mines

YANG Fengzhan1,2, GU Qinghua1,2, LI Shaobo1,2, YANG Jianchun3

  1. School of Resources Engineering, Xi'an University of Architecture and Technology, Xi'an Shaanxi 710055, China
    2. Xi'an Key Laboratory of Intelligent Industry Perception Computing and Decision Making, Xi'an University of Architecture and Technology, Xi'an Shaanxi 710055, China
    3. Hami Hexiang Industry and Trade Co., Ltd., Hami Xinjiang 839200, China
  • Received: 2024-12-10; Revised: 2025-02-13; Published: 2025-05-28

Abstract: To address the perception errors of unmanned mining trucks toward obstacles in their path in low-visibility, low-illumination open-pit mine environments and to reduce collision risk, a multimodal fusion-based obstacle detection method was proposed. First, the LightGlue image registration algorithm was used to spatially align the thermal infrared and visible-light images of the two modalities, avoiding spatial misalignment and geometric distortion before fusion. Second, in the modality feature extraction and fusion stage, a dual-modality feature fusion (DMFF) module was introduced into an improved dual-branch backbone network, strengthening the extraction of dual-modality features and completing their fusion through feature compression and cross-modal feature enhancement. Then, an iterative learning method was introduced to thoroughly match the complementary information between modalities and obtain dual-modality feature maps, improving multimodal detection performance. Finally, the fused feature maps at each scale were fed into the detection head, where bounding-box regression and classification prediction were combined for accurate detection. The results show that the method detects obstacles well in complex scenes such as low visibility, reaching a mean average precision of 90.8% at a threshold of 0.5 (mAP@0.5) and an F1 score of 0.887, with higher accuracy and speed than existing methods as well as lower false-alarm and missed-detection rates, effectively improving the detection accuracy and safety of unmanned mining trucks in complex environments.
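The registration step summarized above can be illustrated with a short sketch. This is a minimal, hypothetical example rather than the authors' implementation: it assumes matched keypoint pairs between the thermal and visible images have already been produced by a matcher such as LightGlue, and it uses OpenCV's RANSAC homography estimation (an assumed choice here) to warp the thermal frame into the visible camera's coordinate system so the two modalities are spatially aligned before fusion.

import cv2
import numpy as np

def align_thermal_to_visible(thermal, visible, kp_thermal, kp_visible):
    """Warp a thermal image into the visible image's coordinate frame.

    kp_thermal, kp_visible: (N, 2) arrays of matched keypoint coordinates,
    e.g. produced by a LightGlue-style matcher (assumed to be given here).
    """
    src = np.asarray(kp_thermal, dtype=np.float32).reshape(-1, 1, 2)
    dst = np.asarray(kp_visible, dtype=np.float32).reshape(-1, 1, 2)

    # Robustly estimate a homography; RANSAC discards outlier matches.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Resample the thermal image onto the visible image grid.
    h, w = visible.shape[:2]
    aligned = cv2.warpPerspective(thermal, H, (w, h))
    return aligned, inlier_mask

A planar homography is only an approximation for a general 3D scene, so the achievable alignment quality depends on the camera baseline and scene geometry.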

Keywords: open-pit mine, low visibility, unmanned mining truck, multimodal fusion, obstacle detection, perception and early warning

Abstract:

To address the perception inaccuracies of autonomous mining trucks in open-pit mines under low-visibility and low-illumination conditions, which can lead to collisions with obstacles, this paper proposed a multimodal fusion-based obstacle detection method to enhance detection accuracy and operational safety in complex environments. Firstly, LightGlue (local feature matching at light speed) was employed to achieve spatial alignment between thermal infrared and visible-light images, thereby avoiding spatial misalignment and geometric distortion prior to fusion. Secondly, in the modality feature extraction and fusion stage, a dual-modality feature fusion (DMFF) module was incorporated into the improved dual-branch backbone network; through feature compression and cross-modal feature enhancement, it strengthened the extraction of dual-modality features and performed their fusion. An iterative learning method was then introduced to match the complementary information between modalities effectively, generating fused dual-modality feature maps and improving multimodal detection performance. Finally, the fused feature maps at multiple scales were fed into the detection head and combined with bounding-box regression and classification prediction for precise detection. Experimental results demonstrate that the proposed method achieves strong obstacle detection performance in challenging low-visibility scenarios, with a mean average precision (mAP@0.5) of 90.8% and an F1 score of 0.887, outperforming existing methods in both accuracy and speed. The proposed approach also exhibits lower false-positive and missed-detection rates, effectively supporting the safe operation of autonomous mining trucks in complex environments.
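As an illustration of the fusion idea described above (feature compression followed by cross-modal enhancement), the following PyTorch sketch shows one plausible form such a dual-modality fusion block could take. The class name DMFFBlock, the channel-attention gating, and all layer choices are assumptions made for illustration only; this is not the paper's DMFF module.

import torch
import torch.nn as nn

class DMFFBlock(nn.Module):
    """Illustrative dual-modality fusion block (not the paper's exact design).

    Each modality's feature map is compressed with a 1x1 convolution, gated by
    channel attention computed from the other modality, and the two enhanced
    maps are then merged into a single fused feature map.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.compress_rgb = nn.Conv2d(channels, channels, kernel_size=1)
        self.compress_ir = nn.Conv2d(channels, channels, kernel_size=1)
        # Squeeze-and-excitation style channel attention, one per direction.
        self.attn_from_ir = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.attn_from_rgb = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        rgb = self.compress_rgb(feat_rgb)
        ir = self.compress_ir(feat_ir)
        # Cross-modal enhancement: each branch is re-weighted by the other modality.
        rgb_enh = rgb * self.attn_from_ir(ir)
        ir_enh = ir * self.attn_from_rgb(rgb)
        return self.merge(torch.cat([rgb_enh, ir_enh], dim=1))

For feature maps of shape (B, 256, H, W) from the two backbone branches, fused = DMFFBlock(256)(rgb_feat, ir_feat) would produce a fused map of the same shape, which could then be passed to a detection head at that scale.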

Key words: open-pit mine, low visibility, unmanned mining truck, multimodal fusion, obstacle detection, perception and early warning

CLC number: