China Safety Science Journal ›› 2018, Vol. 28 ›› Issue (3): 74-78. doi: 10.16265/j.cnki.issn1003-3033.2018.03.013

• Safety Engineering Technology Science •

PCA-AdaBoost model for predicting coal spontaneous combustion in caving zone with imbalanced data

ZHAO Linlin1 (Lecturer), WEN Guofeng1 (Professor), SHAO Liangshan2 (Professor)

  1. School of Management Science and Engineering, Shandong Technology and Business University, Yantai, Shandong 264005, China;
  2. System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125000, China
  • Received: 2017-12-29 Revised: 2018-02-09 Online: 2018-03-28 Published: 2020-11-09
  • About the author: ZHAO Linlin (born 1988), female, from Yushu, Jilin; Ph.D., lecturer; mainly engaged in research and teaching on intelligent prediction of mine production disasters and data mining. E-mail: linlinzhao1204@126.com.
  • Supported by:
    National Natural Science Foundation of China (71771111, 71371091).

Abstract: To improve the accuracy of predicting coal spontaneous combustion in the caving zone under imbalanced data, indicators such as O2 concentration were selected, and principal component analysis (PCA) was used to extract their principal components. The principal components served as the inputs of the adaptive boosting (AdaBoost) algorithm and the combustion status as its output, giving a PCA-AdaBoost prediction model for imbalanced data. Taking the Xuandong No. 2 coal mine in Zhangjiakou as a case study, the model was trained on 20 groups of measured data and evaluated by the area under the receiver operating characteristic curve. The trained model was then used to predict 15 groups of test samples, and the results were compared with those of a particle swarm optimization-support vector machine (PSO-SVM) model. The results show that, under imbalanced data: the three principal components extracted by PCA retain 86.831% of the information in the six original indicators, which reduces both the correlation among the indicators and the dimensionality; temperature and CH4 concentration have a greater influence on combustion than the other indicators; and the predictions of the PCA-AdaBoost model agree with the actual situation, with the model outperforming the PSO-SVM model in prediction accuracy and convergence speed.

Key words: spontaneous combustion, imbalanced data set, principal component analysis (PCA), adaptive boosting algorithm (AdaBoost), particle swarm optimization-support vector machine (PSO-SVM)
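
The pipeline summarized in the abstract (PCA on the indicators, AdaBoost on the principal components, evaluation by area under the ROC curve) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The data are synthetic stand-ins, not the mine measurements; the 16/4 and 12/3 class splits, the decision-stump weak learner, the 10 boosting rounds, and the power-iteration PCA solver are all assumptions chosen to keep the example self-contained.

```python
import math
import random

def fit_pca(X, k, rng, iters=1000):
    """Standardize columns, then get the top-k components by power iteration."""
    n, d = len(X), len(X[0])
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
            for c, m in zip(cols, means)]
    Z = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in X]
    C = [[sum(r[i] * r[j] for r in Z) / (n - 1) for j in range(d)]
         for i in range(d)]
    total = sum(C[i][i] for i in range(d))  # total variance of the indicators
    comps, ratios = [], []
    for _ in range(k):
        v = [rng.uniform(-1, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * C[i][j] * v[j] for i in range(d) for j in range(d))
        comps.append(v)
        ratios.append(lam / total)
        # Deflate so the next pass converges to the next-largest component.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return means, stds, comps, ratios

def transform(X, means, stds, comps):
    """Project raw indicator rows onto the fitted principal components."""
    Z = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in X]
    return [[sum(z * c for z, c in zip(r, comp)) for comp in comps] for r in Z]

def stump(row, j, t, s):
    """Threshold rule: predict s when feature j exceeds t, else -s."""
    return s if row[j] > t else -s

def fit_adaboost(X, y, rounds=10):
    """Discrete AdaBoost over decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None  # stump with the lowest weighted training error
        for j in range(len(X[0])):
            for t in sorted({r[j] for r in X}):
                for s in (1, -1):
                    err = sum(wi for r, yi, wi in zip(X, y, w)
                              if stump(r, j, t, s) != yi)
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, s))
        # Raise the weights of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump(r, j, t, s))
             for r, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def score(model, row):
    """Weighted vote of all stumps; larger means more likely combustion."""
    return sum(a * stump(row, j, t, s) for a, j, t, s in model)

def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked pairs."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def make_samples(n_neg, n_pos, rng):
    """Synthetic data (assumption): six indicators driven by one latent level."""
    signs = [1, 1, -1, 1, -1, 1]
    X, y = [], []
    for label, n, lo, hi in ((-1, n_neg, 0.0, 0.4), (1, n_pos, 0.6, 1.0)):
        for _ in range(n):
            h = rng.uniform(lo, hi)
            X.append([sg * h + rng.gauss(0, 0.05) for sg in signs])
            y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_samples(16, 4, rng)  # 20 training samples, 4 in the minority class
X_test, y_test = make_samples(12, 3, rng)    # 15 test samples
means, stds, comps, ratios = fit_pca(X_train, 3, rng)
model = fit_adaboost(transform(X_train, means, stds, comps), y_train)
test_scores = [score(model, r) for r in transform(X_test, means, stds, comps)]
print("cumulative explained variance:", round(sum(ratios), 3))
print("test AUC:", round(auc(test_scores, y_test), 3))
```

The test samples are projected with the means, standard deviations, and components fitted on the training samples only, so no test information leaks into the model. The area under the ROC curve is used instead of plain accuracy because, with few positive samples, a classifier that always predicts "no combustion" would still score a high accuracy.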
