China Safety Science Journal ›› 2017, Vol. 27 ›› Issue (6): 61-66. doi: 10.16265/j.cnki.issn1003-3033.2017.06.011

• Safety Engineering Technology Science •

Study on predicting spontaneous combustion in caving zones under unbalanced data

SHAO Liangshan (Professor), LI Xiangchen

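  1. System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Received: 2017-02-24 Revised: 2017-04-10
  • About the author: SHAO Liangshan (b. 1961), male, born in Lingyuan, Liaoning; Ph.D., professor, and doctoral supervisor, working mainly on mining systems engineering. E-mail: 2215338497@qq.com.
  • Supported by:
    National Natural Science Foundation of China (71371091).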

Prediction of coal spontaneous combustion in caving zone with unbalanced data

SHAO Liangshan, LI Xiangchen   

  1. System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Received: 2017-02-24 Revised: 2017-04-10

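Summary: To improve the prediction accuracy for minority-class spontaneous combustion samples under unbalanced data, a caving-zone spontaneous combustion prediction model based on K-means-Relief-HSMOTE-SVM is established. First, the K-means method is used to optimize the Relief method for screening spontaneous combustion indexes, compensating for the tendency of Relief-based index screening to assign excessively large weights to combustion features. Second, to address problems of the synthetic minority over-sampling technique (SMOTE) on unbalanced data, such as overfitting caused by an overly small interpolation space, an over-sampling algorithm based on an h-dimensional space (HSMOTE) is proposed to bring the spontaneous combustion data set toward balance. A support vector machine (SVM) is then applied to predict on the dimension-reduced, balanced combustion data. Finally, measured samples from the Xuandong No. 2 coal mine in Zhangjiakou are used in 50 trials, and the proposed model is compared with other models. The results show that the proposed model extracts the key feature factors, overcomes the defects of SMOTE, and effectively improves the SVM's prediction accuracy and geometric mean accuracy for minority-class combustion samples under unbalanced data.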

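Keywords: unbalanced data, spontaneous combustion in caving zone, support vector machine (SVM), prediction, h-dimensional-space over-sampling algorithm (HSMOTE)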
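
The interpolation-space limitation of SMOTE mentioned in the summary can be illustrated with a minimal sketch of classic SMOTE. This is the standard technique, not the authors' HSMOTE, and the function name `smote`, the parameter `k`, and the toy `minority` points are all assumptions chosen for illustration. Each synthetic sample lies on the line segment between a minority sample and one of its nearest minority neighbours, so new points never leave the region already spanned by the existing minority samples.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by classic SMOTE interpolation.

    Illustrative sketch only: `minority` is a list of feature vectors
    (lists of floats) belonging to the minority class.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        # Random position along the segment from base to neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority samples, not data from the paper
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing samples, heavy over-sampling tends to pile points onto a few segments, which is one common explanation for the overfitting the summary refers to.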

Abstract: To improve the prediction accuracy for minority-class spontaneous combustion samples under unbalanced data, a prediction model based on K-means-Relief-HSMOTE-SVM was built. First, the K-means method was used to optimize the traditional Relief method for index selection, correcting its tendency to assign unreasonably high weights to combustion indexes when features are extracted from unbalanced data. Then, to address problems such as overfitting that arise when the SMOTE method is applied to unbalanced data, the idea of an h-dimensional spherical space was introduced: a clustering algorithm was used to determine the center and establish the spherical space, yielding an improved algorithm, HSMOTE, for balancing the spontaneous combustion data. Next, the SVM was used to predict the spontaneous combustion data. Measured samples from the Xuandong No. 2 coal mine were used in 50 experiments, and the results of the proposed model were compared with those of the traditional SVM and other models. The results indicate that K-means-Relief-HSMOTE-SVM effectively extracts key feature factors and overcomes the defects of SMOTE, and that, compared with the other models, it more effectively improves the prediction accuracy and geometric mean accuracy of the traditional SVM for minority-class combustion samples under unbalanced data.

Key words: unbalanced data, spontaneous combustion in caving zone, support vector machine (SVM), prediction, h-dimensional synthetic minority over-sampling technique (HSMOTE)
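
The geometric mean accuracy named in the abstract is not defined on this page. A minimal sketch follows, assuming the definition commonly used for unbalanced classification: the square root of the product of sensitivity (recall on the minority class) and specificity (recall on the majority class). The function names and the toy labels are hypothetical and are not taken from the paper's experiments.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels: 1 = combustion (minority class), 0 = no combustion
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_negative = [0] * 10                   # never predicts combustion
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]    # one hit, one miss, one false alarm

acc_all_negative = accuracy(y_true, all_negative)
gm_all_negative = g_mean(y_true, all_negative)
gm_mixed = g_mean(y_true, mixed)
```

In this toy case a classifier that never predicts combustion still reaches high plain accuracy while its geometric mean is zero, which is why a metric of this kind is often reported alongside accuracy for unbalanced data.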

CLC Number: