China Safety Science Journal ›› 2023, Vol. 33 ›› Issue (6): 20-26. doi: 10.16265/j.cnki.issn1003-3033.2023.06.1416

• Safety Social Science and Safety Management •

Research on the classification of coal mine accident causes based on NLP

ZHANG Jiangshi, LI Yongtun**, MAO Xiangning, HU Xinyue, PAN Yu, WANG Ziyi

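  1. School of Emergency Management and Safety Engineering, China University of Mining and Technology-Beijing, Beijing 100083
  • Received: 2023-01-14 Revised: 2023-04-06 Published: 2023-08-07
  • Corresponding author:
    **LI Yongtun (born 1995), male, from Changzhi, Shanxi; doctoral candidate; main research interests include intelligent safety management, accident text mining, and accident data mining. E-mail:
  • About the author:

    ZHANG Jiangshi (born 1973), male, from Sanmenxia, Henan; PhD, professor; mainly engaged in research on safety management and coal mine dust. E-mail: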

Study on classification of coal mine accident causes based on NLP

ZHANG Jiangshi, LI Yongtun**, MAO Xiangning, HU Xinyue, PAN Yu, WANG Ziyi

  1. School of Emergency Management and Safety Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China
  • Received: 2023-01-14 Revised: 2023-04-06 Published: 2023-08-07

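Abstract (translated from Chinese):

To improve the efficiency of analyzing and processing coal mine accident texts, natural language processing (NLP) techniques were combined with an accident causation model to build an automated framework for classifying accident causes. First, taking the "2-4" accident causation model (24Model) as the classification basis, 87 coal mine accident investigation reports were analyzed to obtain a classification framework for coal mine accident causes, and a corpus was built for each category of cause. NLP techniques were then used to process the cause texts of each category in the corpus, and the processed texts were used to train a fastText model that automatically identifies and classifies accident cause text. Finally, the classification performance of the fastText model was compared with that of three other classic models, including TextCNN. The results show that 21 categories of accident causes and 6 684 training samples were obtained; the trained fastText model reached a recognition accuracy of 98.92% for coal mine accident cause classification, and its overall performance was better than that of the other three classification models. The accident text mining system developed on the basis of 24Model and NLP can quickly analyze and process accident text information and further refine the causes given in accident investigation reports, which facilitates accident case learning and statistical analysis.

Keywords: natural language processing (NLP), accident cause classification, "2-4" model (24Model)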

Abstract:

To improve the efficiency of analyzing and processing coal mine accident text, natural language processing (NLP) was combined with an accident causation model to build an automatic classification framework for accident causes. Based on 24Model, 87 typical coal mine accident investigation reports were analyzed to obtain a classification framework for coal mine accident causes, and a corpus was constructed for each type of accident cause. NLP was used to process the cause text of each type in the corpus, and the processed text was used to train a fastText model for automatic recognition and classification of accident cause text. The proposed method was then compared with TextCNN and two other classical models. The results show that 21 types of accident causes and 6 684 training corpus entries were obtained, that the accuracy of the trained fastText model reaches 98.92%, and that its comprehensive performance is better than that of the other three models. The accident text mining system developed on the basis of 24Model and NLP can quickly analyze and process accident text information and further refine the causes stated in accident investigation reports, which facilitates accident case study and statistical analysis.

Key words: natural language processing (NLP), classification of accident causes, "2-4" model (24Model), fastText
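
The abstract describes building a labeled corpus for each cause category, training a fastText classifier on it, and reporting recognition accuracy. The sketch below is not the authors' code; it only illustrates, under stated assumptions, how labeled cause texts could be written in the `__label__` line format that fastText's supervised mode reads, and how accuracy could be computed from predictions. The label names, sample sentences, and function names are hypothetical, and the texts are assumed to be already tokenized (the paper's Chinese reports would first need a word segmenter, which is not shown).

```python
from pathlib import Path
import tempfile

# Default label prefix used by fastText's supervised training format.
LABEL_PREFIX = "__label__"


def to_fasttext_line(label, tokens):
    """Format one labeled sample as '__label__<label> tok1 tok2 ...'."""
    return LABEL_PREFIX + label + " " + " ".join(tokens)


def write_corpus(samples, path):
    """Write (label, tokens) pairs to a training file, one sample per line."""
    lines = [to_fasttext_line(label, tokens) for label, tokens in samples]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def accuracy(predicted, gold):
    """Fraction of samples whose predicted label equals the gold label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical samples; the 21 categories used in the paper are not listed
# in the abstract, so these labels are placeholders only.
SAMPLES = [
    ("unsafe_act", ["worker", "entered", "sealed", "area", "without", "permission"]),
    ("unsafe_condition", ["ventilation", "fan", "was", "out", "of", "service"]),
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "causes.train.txt"
        n = write_corpus(SAMPLES, out)
        print(n, "lines written")
        print(out.read_text(encoding="utf-8"), end="")
        # With the official `fasttext` Python package installed, a file in
        # this format could be passed to fasttext.train_supervised(input=...);
        # that call is intentionally not executed in this sketch.
```

Keeping the formatting step separate from training makes it easy to inspect the corpus by category before any model is fitted.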