China Safety Science Journal ›› 2019, Vol. 29 ›› Issue (7): 170-176. doi: 10.16265/j.cnki.issn1003-3033.2019.07.027

• Public Safety •

Research on associated early-warning and visualization of hidden danger in enterprise production based on TF-IDF

HU Jinqiu (Professor), ZHANG Xiyue**, WU Zhiqiang

  1. School of Safety and Ocean Engineering, China University of Petroleum (Beijing), Beijing, 102249, China
  • Received: 2019-03-07 Revised: 2019-05-17 Online: 2019-07-28 Published: 2020-10-21
  • Corresponding author: ** ZHANG Xiyue (born 1996), female, from Chengdu, Sichuan; master's student; research interests: safety monitoring and intelligent diagnosis. E-mail: zhangxiyue0518@163.com.
  • About the author: HU Jinqiu (born 1983), female, from Nanjing, Jiangsu; Ph.D., professor; mainly engaged in research on intelligent safety sensing of oil and gas production systems and on big-data analysis, diagnosis, and early warning. E-mail: hujq@cup.edu.cn.
  • Funding:
    National Key R&D Program of China (2017YFC0805801); Beijing Nova Program (Z181100006218048).

Abstract: To make effective use of the large number of production hidden-danger records that enterprises accumulate in their day-to-day HSE management, to realize early warning of hidden dangers, and to overcome the low efficiency and strong subjectivity of manual data analysis, a visual model for associated early warning of enterprise production hidden dangers was built in combination with term frequency-inverse document frequency (TF-IDF). First, the Apriori association rule algorithm was applied to mine the potential associations among hidden dangers and to extract the value hidden in the records. Then, the TF-IDF algorithm was introduced to optimize and sort the association rules and to identify the critical rules among hidden dangers. Finally, visualization technology was used to display the mining results intuitively. The results show that the visual model can realize early warning of hidden dangers quickly and accurately, that optimizing the association rules resolves the Apriori algorithm's strong dependence on support, and that the mining results give enterprise safety managers direction and a basis for rectification.
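As a rough illustration of the pipeline summarized in the abstract, the following Python sketch counts support and confidence for pairwise rules (the two-item case of Apriori-style mining, not a full Apriori implementation) and then re-ranks the rules with an inverse-document-frequency weight. The sample records, the function names `pair_rules`, `idf_weights`, and `rank_rules`, and in particular the scoring formula are assumptions made for illustration; the abstract does not state how the authors combine TF-IDF with the rule metrics.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical inspection records: each set holds the hazard labels
# found during one inspection. These are invented, not data from the paper.
records = [
    {"valve leak", "missing guard"},
    {"valve leak", "missing guard", "blocked exit"},
    {"valve leak", "blocked exit"},
    {"missing guard"},
]


def pair_rules(records, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(records)
    item_counts = Counter(item for r in records for item in r)
    pair_counts = Counter(
        pair for r in records for pair in combinations(sorted(r), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # drop pairs below the support threshold
        # Each frequent pair yields one rule in each direction.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


def idf_weights(records):
    """Inverse document frequency of each hazard label across all records."""
    n = len(records)
    doc_freq = Counter(item for r in records for item in r)
    return {item: math.log(n / count) for item, count in doc_freq.items()}


def rank_rules(records, min_support=0.5):
    """Sort rules by confidence times the mean IDF of both items.

    This score is a hypothetical choice for illustration only.
    """
    weights = idf_weights(records)
    scored = [
        (a, b, conf * (weights[a] + weights[b]) / 2)
        for a, b, _support, conf in pair_rules(records, min_support)
    ]
    return sorted(scored, key=lambda rule: rule[2], reverse=True)


if __name__ == "__main__":
    for antecedent, consequent, score in rank_rules(records):
        print(f"{antecedent} -> {consequent}: {score:.3f}")
```

With a score of this kind, a rule involving a rarely recorded hazard can outrank a rule between two common hazards, which is one possible way to reduce reliance on support alone.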

Key words: term frequency-inverse document frequency (TF-IDF), Apriori association analysis, optimized sorting, early warning of hidden dangers, text visualization

CLC number: