China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (8): 84-92. doi: 10.16265/j.cnki.issn1003-3033.2025.08.0084

• Safety Engineering Technology •

Multimodal information fusion decision-making strategy for personnel behavior in industrial scene

WANG Haiquan1, YU Haowei2, YANG Yueyi1, XU Xiaobin3, BU Xiangzhou4, KURKOVA P5

    1 College of Intelligent Sensing and Instrumentation, Zhongyuan University of Technology, Zhengzhou, Henan 450007, China
    2 School of Automation and Electrical Engineering, Zhongyuan University of Technology, Zhengzhou, Henan 450007, China
    3 College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
    4 Henan Hongbo Measurement and Control Co., Ltd., Zhengzhou, Henan 450040, China
    5 School of Electronic and Laser Instrument, Saint Petersburg State University of Aerospace Instrumentation, Saint Petersburg 14-51, Russia
  • Received: 2025-03-01 Revised: 2025-05-13 Published: 2025-08-28
  • About the authors:

    WANG Haiquan (b. 1981), male, from Zhengzhou, Henan; PhD, professor, and master's supervisor. His research focuses on multimodal data mining and fault diagnosis. E-mail:

    YANG Yueyi, lecturer

    XU Xiaobin, professor

    BU Xiangzhou, senior engineer

    KURKOVA P, professor

  • Funding:
    Henan Key R&D Special Project (251111211600); Henan Province Science and Technology Research Project (242102320215); Henan Province High-End Foreign Expert Introduction Program (HNGD2024032); Zhongyuan University of Technology Discipline Strength Enhancement Program (GG202412)

Abstract:

To prevent accidents caused by workers' unsafe operating behaviors in industrial scenes, and to address the poor performance of vision-only action recognition under interference such as poor lighting, limited field of view, and occlusion, a decision-making method for unsafe personnel behavior is proposed. The method is based on a self-adaptive evidential reasoning (S-ER) strategy and fuses video information with inertial measurement unit (IMU) information. First, an attention-based multi-task convolutional 3D (M-C3D) model analyzes the video information, and a one-dimensional convolutional neural network (1D-CNN) fused with an attention mechanism processes the IMU information. Second, an evidential reasoning (ER) strategy performs decision-level fusion, and the firefly optimization algorithm builds an optimized set of evidence weights and reliabilities for different environmental conditions, so that the weights of the video and sensor modalities adapt to the environment. Finally, the model is validated on the public Multimodal Human Action Dataset from the University of Texas at Dallas (UTD-MHAD) and the self-built Multimodal Human Action Dataset from Zhongyuan University of Technology (ZUT-MHAD). The results show that, in industrial scenes with interference, the recognition accuracy of S-ER reaches up to 98.53%, which is 17.52% higher than the best result among traditional multimodal fusion methods and single-modality recognition methods.
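The abstract does not give the S-ER equations, so the following is only a minimal sketch of decision-level fusion under stated assumptions. It assumes each branch (video and IMU) outputs a probability vector over the same action classes, and it uses a two-source form of the evidential reasoning rule restricted to single-class beliefs, in which each source's contribution is scaled by its weight and by the other source's unreliability. The names `er_fuse`, `fuse_for_condition`, and `ENV_PARAMS`, the condition labels, and all numeric weights and reliabilities are hypothetical; in the paper such values are obtained with the firefly optimization algorithm.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[float, float]

# Hypothetical lookup of (weights, reliabilities) for (video, IMU) per
# environmental condition. These numbers are illustrative assumptions,
# not values reported by the paper.
ENV_PARAMS: Dict[str, Tuple[Pair, Pair]] = {
    "normal": ((0.8, 0.2), (0.8, 0.8)),
    "low_light": ((0.2, 0.8), (0.3, 0.9)),
}


def er_fuse(
    p_video: Sequence[float],
    p_imu: Sequence[float],
    weights: Pair,
    reliabilities: Pair,
) -> List[float]:
    """Fuse two class-probability vectors with an ER-style rule.

    Assumed form for beliefs placed on single classes only:
    score_k = (1 - r2) * w1 * p1_k + (1 - r1) * w2 * p2_k + w1 * w2 * p1_k * p2_k
    followed by normalization so the fused beliefs sum to 1.
    """
    if len(p_video) != len(p_imu):
        raise ValueError("both sources must cover the same classes")
    w1, w2 = weights
    r1, r2 = reliabilities
    scores = [
        (1 - r2) * w1 * a + (1 - r1) * w2 * b + w1 * w2 * a * b
        for a, b in zip(p_video, p_imu)
    ]
    total = sum(scores)
    if total <= 0:
        raise ValueError("fused scores are all zero; check inputs")
    return [s / total for s in scores]


def fuse_for_condition(
    condition: str, p_video: Sequence[float], p_imu: Sequence[float]
) -> List[float]:
    """Pick the parameter set for the current condition, then fuse."""
    weights, reliabilities = ENV_PARAMS[condition]
    return er_fuse(p_video, p_imu, weights, reliabilities)


if __name__ == "__main__":
    video = [0.6, 0.3, 0.1]  # video branch favors class 0
    imu = [0.2, 0.7, 0.1]    # IMU branch favors class 1
    for condition in ENV_PARAMS:
        print(condition, fuse_for_condition(condition, video, imu))
```

With these illustrative parameters, the same pair of branch outputs is resolved toward the video prediction under the "normal" setting and toward the IMU prediction under "low_light", which is the adaptive behavior the abstract describes.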

Key words: industrial scene, multimodal information, information fusion, action recognition, evidential reasoning (ER) theory

CLC number: