China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (3): 115-124. doi: 10.16265/j.cnki.issn1003-3033.2025.03.0863

• Safety engineering technology •

Lightweight neural network combined with depth camera for miner target detection and localization

ZHANG Miao1,2, WANG Xiaojun1,2, LEI Jingfa1,2,3,**, ZHAO Ruhai1,2, LI Yongling1,2

  1 School of Mechanical and Electrical Engineering, Anhui Jianzhu University, Hefei Anhui 230601, China
  2 Key Laboratory of Intelligent Manufacturing of Construction Machinery, Anhui Education Department, Hefei Anhui 230601, China
  3 Sichuan Provincial Key Laboratory of Process Equipment and Control Engineering, Zigong Sichuan 643000, China
  • Received: 2024-10-23 Revised: 2024-12-25 Online: 2025-03-28 Published: 2025-09-28
  • Contact: LEI Jingfa

Abstract:

To prevent miners from mistakenly entering dangerous areas, a lightweight underground miner detection model based on YOLOv5s-MPD was proposed and combined with a depth camera to locate miner targets and detect in real time whether miners had entered dangerous areas. First, the MobileNetv3 lightweight neural network was used as the backbone feature extraction network to significantly reduce the model size. Second, a Polarized Self-Attention (PSA) module was introduced to enhance the perception of targets. Finally, Deformable Convolution Network v2 (DCNv2) was used to replace the standard convolution in the C3 module of the feature fusion layer, addressing the partial loss of feature information in conventional convolution. The improved model was applied to the color images obtained by the depth camera to detect miner targets and obtain the spatial three-dimensional coordinates of the target center points. The results show that, compared with YOLOv5s, the improved model reduces the number of parameters and the computation by 83.54% and 77.03%, respectively. The model size is only 3.4 MB, and the detection speed is 70.2 f/s, an increase of 54.97%. The mean average precision is 0.825. Compared with mainstream object detection models, the improved model achieves a better balance among the number of parameters, computation, model size, detection speed, and mean average precision. In the actual positioning accuracy test, within a range of 1-8 meters, the average absolute error and average relative error of the distance between the camera and the miner target were 0.11 meters and 1.74%, respectively, and the maximum absolute error and maximum relative error were 0.25 meters and 2.96%, respectively. In the dynamic detection test, the miner target could be detected and its location information output, with a detection success rate of 97.5%.
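
The localization step described in the abstract (detect a miner in the color image, then read the three-dimensional coordinates of the box center from the depth camera) can be illustrated with a minimal sketch. The abstract does not state the camera model, the SDK, or how a dangerous area is represented, so everything here is an assumption: an ideal pinhole camera without lens distortion, hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and a dangerous area modeled as an axis-aligned box in the camera frame. The function names are illustrative and are not taken from the paper.

```python
import math


def box_center(x1, y1, x2, y2):
    """Center pixel (u, v) of a detection box given its corner coordinates."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in meters to camera-frame (X, Y, Z).

    Assumes an ideal pinhole model; fx, fy, cx, cy are hypothetical intrinsics.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance(point):
    """Euclidean distance from the camera origin to a 3D point, in meters."""
    return math.sqrt(sum(c * c for c in point))


def in_danger_zone(point, zone_min, zone_max):
    """True if the point lies inside an axis-aligned box (an assumed zone shape)."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, zone_min, zone_max))


def distance_errors(measured_m, actual_m):
    """Absolute error (meters) and relative error (percent) of one measurement."""
    abs_err = abs(measured_m - actual_m)
    return abs_err, abs_err / actual_m * 100.0


# Example with made-up values: a detection box and the depth read at its center.
u, v = box_center(300, 200, 340, 280)
target = deproject(u, v, 4.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
alarm = in_danger_zone(target, zone_min=(-1.0, -1.0, 3.0), zone_max=(1.0, 1.0, 5.0))
```

In practice a depth camera SDK usually provides its own calibrated deprojection routine, and the depth at the box center is often averaged over a small window to reduce noise; the sketch omits both for brevity.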

Key words: lightweight, neural network, depth camera, target detection, target localization, security warning

CLC Number: