Identification model of miners' unsafe behaviors in the mine conveyor belt transportation area

doi:10.16265/j.cnki.issn1003-3033.2025.10.1778

Abstract

To improve the accuracy and real-time performance of identifying miners' unsafe behaviors in the mine conveyor belt transportation area, and to address the poor real-time performance and high false detection rate of existing manual monitoring, a dual-stream spatiotemporal fusion network (DS-SFNet) that fuses image features and human skeleton features is proposed. First, to cope with low illumination and dust interference underground, a sub-pixel convolutional block attention module (SPCBAM) is designed, which uses sub-pixel convolution and depthwise separable convolution to optimize feature representation. Second, to reduce the high computational cost of the OpenPose model, its backbone feature extraction network is rebuilt with MobileNet v3, and dilated convolutions and cross-layer connections are introduced. Finally, a hierarchical feature fusion module incorporating an attention mechanism is constructed to deeply fuse image features and skeletal trajectory features through spatiotemporal alignment and complementarity modeling. The results show that DS-SFNet achieves recognition accuracies of 76.4% on the Human Motion Database 51 (HMDB51) and 97.9% on the University of Central Florida 101 (UCF101) dataset, which are 1.5% and 1.1% higher than those of the SlowFast model, respectively. On a self-built coal mine dataset covering four unsafe behaviors (climbing, crossing, leaning, and hand-leaning), the average recognition accuracy reaches 92.3%. The MobileNet v3-based OpenPose model has only 11.5% of the parameters of the original Visual Geometry Group 19 (VGG19) version and is more than 3 times faster at inference. The complete model has a single-frame processing time of 38.7 ms and 57.3 M parameters.

Key words: conveyor belt transportation, unsafe behavior, attention mechanism, OpenPose model, feature fusion
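
The abstract names depthwise separable convolution as one ingredient of SPCBAM but gives no layer sizes. As a rough illustration of why this kind of convolution is popular in lightweight networks, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The channel counts and kernel size are hypothetical values chosen for the example, not taken from the paper, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is the usual argument for the parameter savings.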

CLC number: X936

HAO Qinxia, ZHANG Jiaqian. Identification model of miners' unsafe behaviors in coal mine conveyor belt[J]. China Safety Science Journal, 2025, 35(10): 98-105.

Figures and tables (13): Fig. 1 to Fig. 7, Table 1, Table 2, Fig. 8, Table 3, Table 4, Table 5


Model	Accuracy/%	Precision/%	Recall/%	F1-score/%
No attention mechanism	80.2	78.7	78.6	78.6
SE	82.1	81.5	81.7	81.6
CBAM	83.8	83.8	83.1	83.5
SPCBAM	85.6	85.4	85.3	85.3
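
As a consistency check on the attention-module comparison above, the sketch below recomputes the F1-score as the harmonic mean of the reported precision and recall. It assumes the F1 column was derived this way. If the authors averaged per-class F1 values instead, small differences are expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the table above.
rows = {
    "No attention mechanism": (78.7, 78.6, 78.6),
    "SE": (81.5, 81.7, 81.6),
    "CBAM": (83.8, 83.1, 83.5),
    "SPCBAM": (85.4, 85.3, 85.3),
}

f1_diffs = {}
for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    f1_diffs[name] = abs(computed - reported)
    print(f"{name}: computed {computed:.2f}, reported {reported}")
```

Under this assumption every recomputed value falls within 0.1 of the reported figure, which is consistent with rounding to one decimal place.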

Feature extraction network	Precision/%	FPS/(frames·s^-1)	Parameters/M
VGG19-OpenPose	87.06	9~11	151
MobileNet-V1-OpenPose	88.23	21~24	18.7
MobileNet-V2-OpenPose	91.35	23~27	17.9
MobileNet-V3-OpenPose	93.68	30~34	17.4
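
The abstract's figures for the lightweight OpenPose backbone (11.5% of the parameters, more than 3 times the inference speed) can be related to the rows above with simple ratios. The sketch below does that arithmetic. Using the midpoint of each FPS range is an assumption made here for illustration, since the source does not say how the speedup was computed.

```python
# Values copied from the backbone comparison table above.
vgg19_params_m = 151.0
mobilenet_v3_params_m = 17.4
vgg19_fps = (9, 11)          # reported FPS range
mobilenet_v3_fps = (30, 34)  # reported FPS range


def midpoint(fps_range: tuple) -> float:
    """Midpoint of a (low, high) FPS range; an illustrative choice."""
    low, high = fps_range
    return (low + high) / 2


param_ratio = mobilenet_v3_params_m / vgg19_params_m
speedup_mid = midpoint(mobilenet_v3_fps) / midpoint(vgg19_fps)
speedup_low = mobilenet_v3_fps[0] / vgg19_fps[1]   # most pessimistic pairing
speedup_high = mobilenet_v3_fps[1] / vgg19_fps[0]  # most optimistic pairing

print(f"parameter ratio: {param_ratio * 100:.1f}%")
print(f"speedup: {speedup_low:.2f}x to {speedup_high:.2f}x (midpoint {speedup_mid:.2f}x)")
```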

Image feature extraction network	Human skeleton feature extraction network	SPCBAM attention mechanism	Feature fusion module	Accuracy on HMDB51/%	Accuracy on UCF101/%
√	×	×	×	63.8	87.2
√	×	√	×	65.4	88.9
×	√	×	×	50.7	79.6
×	√	√	×	52.3	81.6
√	√	×	×	64.5	87.4
√	√	√	×	71.2	92.9
√	√	√	√	76.4	97.9

Model	Average accuracy/%	Climbing/%	Crossing/%	Leaning/%	Hand-leaning/%
C3D	78.5	75.0	80.2	76.3	82.5
3DResNet50	84.1	82.5	86.8	81.7	85.4
OpenPose+ST-GCN	85.7	84.3	87.5	83.6	87.4
SlowFast	89.1	87.2	90.6	85.8	92.9
DS-SFNet	92.3	90.8	94.1	91.5	92.8
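
The average accuracy column above appears to be the unweighted mean of the four per-behavior accuracies. The sketch below checks that reading. It is an assumption, since the source does not state whether the average is weighted by the number of samples per class.

```python
from statistics import mean

# (reported average %, [climbing, crossing, leaning, hand-leaning] %)
# copied from the table above.
results = {
    "C3D": (78.5, [75.0, 80.2, 76.3, 82.5]),
    "3DResNet50": (84.1, [82.5, 86.8, 81.7, 85.4]),
    "OpenPose+ST-GCN": (85.7, [84.3, 87.5, 83.6, 87.4]),
    "SlowFast": (89.1, [87.2, 90.6, 85.8, 92.9]),
    "DS-SFNet": (92.3, [90.8, 94.1, 91.5, 92.8]),
}

avg_diffs = {}
for model, (reported, per_class) in results.items():
    computed = mean(per_class)  # unweighted mean over the four behaviors
    avg_diffs[model] = abs(computed - reported)
    print(f"{model}: computed {computed:.2f}, reported {reported}")
```

With an unweighted mean, each recomputed average matches the reported value to within rounding at one decimal place.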

Model	Single-frame processing time/ms	FPS/(frames·s^-1)	Parameters/M	Computational complexity/GFLOPs
C3D	35.1	28.6	78.4	400
3DResNet50	45.3	22.2	46.2	150
OpenPose+ST-GCN	40.6	24.8	138.7	280
SlowFast	55.2	18.3	60.5	210
DS-SFNet	38.7	25.8	57.3	136
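
Single-frame processing time and FPS in the table above are related by FPS ≈ 1000 / (time in ms), assuming frames are processed one after another with no batching or pipelining. The sketch below applies that conversion to the reported times. The small gaps from the reported FPS values may come from rounding or from how the measurements were taken, which the source does not describe.

```python
def fps_from_ms(ms_per_frame: float) -> float:
    """Frames per second implied by a per-frame latency in milliseconds."""
    return 1000.0 / ms_per_frame


# (single-frame time in ms, reported FPS) copied from the table above.
timings = {
    "C3D": (35.1, 28.6),
    "3DResNet50": (45.3, 22.2),
    "OpenPose+ST-GCN": (40.6, 24.8),
    "SlowFast": (55.2, 18.3),
    "DS-SFNet": (38.7, 25.8),
}

fps_diffs = {}
for model, (ms, reported_fps) in timings.items():
    implied = fps_from_ms(ms)
    fps_diffs[model] = abs(implied - reported_fps)
    print(f"{model}: implied {implied:.1f} FPS, reported {reported_fps} FPS")
```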