China Safety Science Journal ›› 2022, Vol. 32 ›› Issue (6): 103-108. doi: 10.16265/j.cnki.issn1003-3033.2022.06.2262

• Safety Engineering Technology •

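  • About the authors:

    SHANG Linyu (尚麟宇; b. 1988), male, from Shenyang, Liaoning; master's degree; assistant research fellow; works mainly on railway communication and signaling technology. E-mail:

    YIN Ming (尹明), senior engineer

    XIAO Chang (肖畅), engineer

    CHENG Jun (程君), associate research fellow

  • Funding:
    CHN Energy Group Science and Technology Innovation Project (GJNY-20-231); Research Project of China Academy of Railway Sciences Corporation Limited (2020YJ044)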

Research on text classification of railway safety incidents based on BLS

SHANG Linyu1, YIN Ming2, XIAO Chang3, CHENG Jun1

  1 Signal & Communication Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China
  2 CHN Energy Shuohuang Railway Development Co., Ltd., Cangzhou, Hebei 062350, China
  3 China Railway Society, Beijing 100844, China
  • Received: 2022-01-12 Revised: 2022-04-13 Online: 2022-06-28 Published: 2022-12-28


Abstract:

To help prevent railway safety incidents, text mining techniques and the broad learning system (BLS) were used to classify railway safety incidents into four categories: equipment problems, construction problems, operation problems, and external environment problems. A total of 314 text records were cleaned and structured, and Chinese word segmentation was performed using Jieba with a custom dictionary and a general-purpose stop word list. Then, 223 feature words were selected based on the chi-square test, and their weights were calculated using term frequency-inverse document frequency (TF-IDF). Finally, incident causes were classified using BLS, and three BLS-based classification methods were designed. The results show that the system can build an effective classification model by mining the text of railway safety incident reports. By exploiting the low computational cost of BLS and adding feature enhancement nodes, classification accuracy can be improved, which in turn can improve the level of industry management.
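The feature-selection and weighting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokens and labels are made up, the documents are assumed to be already segmented (for example by Jieba), and the TF-IDF variant used here (relative term frequency times `log(N / df)`) is only one of several common definitions. The function names `chi_square` and `tf_idf` are hypothetical.

```python
import math
from collections import Counter

# Toy, already-segmented incident reports (hypothetical tokens and labels).
docs = [
    (["signal", "failure", "relay"], "equipment"),
    (["signal", "cable", "fault"], "equipment"),
    (["construction", "crane", "intrusion"], "construction"),
    (["construction", "signal", "cutover"], "construction"),
]

def chi_square(term, label, docs):
    """Chi-square statistic of the 2x2 table: term present? x document in class?"""
    a = sum(1 for toks, y in docs if term in toks and y == label)
    b = sum(1 for toks, y in docs if term in toks and y != label)
    c = sum(1 for toks, y in docs if term not in toks and y == label)
    d = sum(1 for toks, y in docs if term not in toks and y != label)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def tf_idf(term, tokens, docs):
    """Relative term frequency in one document times log inverse document frequency."""
    tf = Counter(tokens)[term] / len(tokens)
    df = sum(1 for toks, _ in docs if term in toks)
    return tf * math.log(len(docs) / df) if df else 0.0

# Rank the vocabulary by its best chi-square score over all classes;
# the top-k terms would serve as feature words.
vocab = sorted({t for toks, _ in docs for t in toks})
labels = sorted({y for _, y in docs})
ranked = sorted(vocab, key=lambda t: -max(chi_square(t, y, docs) for y in labels))
```

Terms that occur in only one class (such as "construction" here) score highest under the chi-square criterion, while a term spread across many documents (such as "signal") receives a lower TF-IDF weight than a rarer term in the same document.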

Key words: broad learning system (BLS), railway safety incident, text classification, term frequency-inverse document frequency (TF-IDF), text mining
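For readers unfamiliar with BLS, the following is a minimal NumPy sketch of the general idea: randomly generated feature nodes, nonlinear enhancement nodes built on top of them, and output weights obtained in closed form by ridge regression. It is an illustration under simplifying assumptions, not one of the three methods designed in the paper. The class name `BroadLearningSystem`, the node counts, the regularization value, and the synthetic data are all hypothetical; real inputs would be TF-IDF vectors over the selected feature words.

```python
import numpy as np

class BroadLearningSystem:
    """Simplified BLS: random feature nodes + tanh enhancement nodes + ridge readout."""

    def __init__(self, n_feature=20, n_enhance=40, reg=1e-2, seed=0):
        self.n_feature, self.n_enhance, self.reg = n_feature, n_enhance, reg
        self.rng = np.random.default_rng(seed)

    def _nodes(self, X):
        Z = X @ self.We + self.be             # feature nodes (random linear mapping)
        H = np.tanh(Z @ self.Wh + self.bh)    # enhancement nodes
        return np.hstack([Z, H])              # broad layer A = [Z | H]

    def fit(self, X, y):
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        Y = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot targets
        d = X.shape[1]
        self.We = self.rng.standard_normal((d, self.n_feature))
        self.be = self.rng.standard_normal(self.n_feature)
        self.Wh = self.rng.standard_normal((self.n_feature, self.n_enhance))
        self.bh = self.rng.standard_normal(self.n_enhance)
        A = self._nodes(X)
        # Ridge solution W = (A^T A + reg * I)^-1 A^T Y; no iterative training.
        self.W = np.linalg.solve(A.T @ A + self.reg * np.eye(A.shape[1]), A.T @ Y)
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._nodes(X) @ self.W, axis=1)]

# Synthetic stand-in for TF-IDF vectors: 4 classes, each "owning" 3 of 12 terms.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(4), 10)
centers = np.kron(np.eye(4), np.ones(3))      # shape (4, 12)
X = centers[y] + 0.05 * rng.standard_normal((40, 12))

model = BroadLearningSystem().fit(X, y)
acc = float(np.mean(model.predict(X) == y))   # training accuracy on the toy data
```

Because the output weights come from a single linear solve rather than gradient descent, fitting is cheap, which is the computational advantage the abstract refers to. In this sketch the capacity of the model is changed by adjusting `n_feature` and `n_enhance`; the incremental update formulas that BLS uses to add nodes without refitting are not shown.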