China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (8): 84-92.doi: 10.16265/j.cnki.issn1003-3033.2025.08.0084

• Safety engineering technology •

Multimodal information fusion decision-making strategy for personnel behavior in industrial scene

WANG Haiquan1, YU Haowei2, YANG Yueyi1, XU Xiaobin3, BU Xiangzhou4, KURKOVA P5

    1 College of Intelligent Sensing and Instrumentation, Zhongyuan University of Technology, Zhengzhou Henan 450007, China
    2 School of Automation and Electrical Engineering, Zhongyuan University of Technology, Zhengzhou Henan 450007, China
    3 College of Automation, Hangzhou Dianzi University, Hangzhou Zhejiang 310018, China
    4 Henan Hongbo Measurement and Control Co., Ltd., Zhengzhou Henan 450040, China
    5 School of Electronic and Laser Instrument, Saint Petersburg State University of Aerospace Instrumentation, Saint Petersburg 14-51, Russia
  • Received: 2025-03-01 Revised: 2025-05-13 Online: 2025-08-28 Published: 2026-02-28

Abstract:

To reduce accidents in industrial scenarios caused by workers' unsafe operation behaviors, and to improve the performance of vision-based action recognition methods in industrial scenes with poor lighting, limited fields of view, and occlusions, an improved decision-making strategy based on self-adaptive evidential reasoning (S-ER) was introduced in this paper. This strategy effectively integrates video information with inertial measurement unit (IMU) information. It first analyzed the video and IMU information with an attention-mechanism-based multi-task convolutional 3D (M-C3D) model and a one-dimensional convolutional neural network (1D-CNN) fused with an attention mechanism, respectively; ER theory was then introduced to achieve decision-level fusion, where the evidence weights and reliabilities under different environmental conditions were optimized by the firefly optimization algorithm to improve the recognition accuracy and robustness of the model. The effectiveness of the proposed algorithm was verified on the public Multimodal Human Action Dataset from the University of Texas at Dallas (UTD-MHAD) and the self-built Multimodal Human Action Dataset from Zhongyuan University of Technology (ZUT-MHAD). The results show that the recognition accuracy of S-ER for workers' unsafe behaviors in complex industrial scenarios can reach up to 98.53%, which is 17.52% higher than the best result of the traditional multimodal fusion methods and single-modality recognition methods compared.
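The decision-level fusion described above can be illustrated with a minimal sketch: each modality's classifier emits a class-probability vector, each vector is discounted by a reliability weight (leaving the remainder as unassigned ignorance), and the discounted evidence is combined. This is a simplified Dempster-style combination, not the authors' S-ER implementation; the function name, the fixed weights, and the three-class example are hypothetical stand-ins (in the paper, the weights and reliabilities are tuned per environment by the firefly optimization algorithm).

```python
import numpy as np

def fuse_decisions(p_video, p_imu, w_video=0.6, w_imu=0.4):
    """Fuse two classifiers' class-probability vectors at decision level
    using a simplified weighted evidence combination (Dempster-style).

    w_video / w_imu model each modality's reliability: discounting leaves
    1 - w * sum(p) of the mass unassigned (ignorance), so an unreliable
    sensor (e.g. a camera under poor lighting) influences the result less.
    """
    # Discount each probability vector by its modality weight.
    m_video = w_video * np.asarray(p_video, dtype=float)
    m_imu = w_imu * np.asarray(p_imu, dtype=float)
    ign_video = 1.0 - m_video.sum()  # mass left unassigned by video
    ign_imu = 1.0 - m_imu.sum()      # mass left unassigned by IMU
    # Combine: agreement between singleton masses, plus each source's
    # mass backed by the other source's ignorance.
    fused = m_video * m_imu + m_video * ign_imu + m_imu * ign_video
    fused /= fused.sum()  # normalize out conflict and residual ignorance
    return fused

# Example: video weakly favors class 0 (poor lighting, low reliability),
# IMU strongly favors class 1; the fused decision follows the IMU.
print(fuse_decisions([0.5, 0.3, 0.2], [0.1, 0.8, 0.1],
                     w_video=0.3, w_imu=0.7))
```

With the reliability weights reversed (a trustworthy camera, a noisy IMU), the same inputs would instead yield class 0, which is the behavior the environment-dependent weight optimization exploits.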

Key words: industrial scene, multimodal information, information fusion, action recognition, evidence reasoning (ER) theory
