China Safety Science Journal ›› 2024, Vol. 34 ›› Issue (11): 163-171. doi: 10.16265/j.cnki.issn1003-3033.2024.11.0368

• Public safety •

Prediction of urban sewage pipeline defect probability based on XGBoost

MA Hui, HE Yingxia**, CHEN Yangyang

  1. School of Economics and Management, Tianjin Urban Construction University, Tianjin 300384, China
  • Received: 2024-06-12 Revised: 2024-08-15 Online: 2024-11-28 Published: 2025-01-04
  • Contact: HE Yingxia

Abstract:

To make defect detection in urban sewage pipelines more efficient, reduce the resources wasted on indiscriminate inspection, and mitigate environmental safety risks, an XGBoost model was used to predict the probability of defects in urban sewage pipelines. First, the causes of sewage pipe defects were analyzed statistically to identify key indicators that characterize pipeline defects, and these indicators served as the inputs to the XGBoost model. Second, appropriate objective functions and base learner parameters were selected. The model was then trained and optimized with a grid search algorithm to determine the key parameters of the base learner. Finally, the prediction performance of the XGBoost model was validated on an area of the sewage pipeline network in Zhongshan, Guangdong Province. In addition, the main factors and paths affecting defect probability were investigated based on the model output, and the defect probability of the sewage pipe network in the area was divided into 4 levels for visualization. The results indicated that the average area under the curve (AUC) of the XGBoost model was 0.97 under 10-fold cross-validation, with a prediction accuracy of 93%. Pipeline depth, slope, and length had the greatest impact on the probability of pipeline defects. As pipe length increases, the defect probability rises when the slope is steeper and the burial depth is shallower.

Key words: eXtreme Gradient Boosting (XGBoost), urban sewage pipelines, defect probability, decision tree, prediction model
