China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (4): 211-218. doi: 10.16265/j.cnki.issn1003-3033.2025.04.0774


Causal analysis of highway accidents considering filling in missing values based on RF-Apriori algorithm

XUE Le1, YU Lu1,**, JIN Longzhe2, LI Bo1, SHEN Wenjin1

  1. School of Transportation Engineering, Dalian Jiaotong University, Dalian, Liaoning 116028, China
  2. Research Institute of Macro-Safety Science, University of Science and Technology Beijing, Beijing 100083, China
  • Received: 2024-11-14 Revised: 2025-01-08 Online: 2025-04-28 Published: 2025-10-28
  • Contact: YU Lu

Abstract:

To improve highway safety, 26,320 highway traffic accident records from France covering 2018 to 2022 were selected as the research object. Three representative algorithms were used to impute missing values in the data: the random forest (RF) algorithm, the expectation-maximization (EM) algorithm, and the K-nearest neighbors (KNN) algorithm. The effect of each imputation algorithm on data stability was compared using the change in variable variance before and after imputation. The Apriori association rule algorithm was then applied to the completed dataset to analyze the causes of highway accidents at different severity levels. The results indicate that the RF algorithm yields the most stable data after missing-value imputation. Compared with the model trained on the original data, accuracy improves by 5.66%, recall by 9.22%, and the F1 score by 9.91%. Passenger vehicles are more likely to cause property-damage accidents; motorcycles are prone to cause injury accidents on roads with lower speed limits and fatal accidents on roads with higher speed limits. The use of safety equipment is significantly related to accident severity.
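
The association-rule step described above can be illustrated with a minimal pure-Python sketch of Apriori. This is not the authors' implementation: the six toy records, the item names (`vehicle=...`, `speed_limit=...`, `severity=...`), and the thresholds 0.3 and 0.6 are all hypothetical and chosen only to show how frequent itemsets and rules with a severity consequent might be mined from an already imputed dataset.

```python
from itertools import combinations

# Hypothetical toy records, NOT taken from the paper's dataset.
# Each accident is a set of "attribute=value" items.
records = [
    {"vehicle=motorcycle", "speed_limit=high", "severity=fatal"},
    {"vehicle=motorcycle", "speed_limit=high", "severity=fatal"},
    {"vehicle=motorcycle", "speed_limit=low", "severity=injury"},
    {"vehicle=passenger_car", "speed_limit=high", "severity=property_damage"},
    {"vehicle=passenger_car", "speed_limit=low", "severity=property_damage"},
    {"vehicle=passenger_car", "speed_limit=low", "severity=injury"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets."""
    singletons = {frozenset([i]) for t in transactions for i in t}
    current = {s for s in singletons if support(s, transactions) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c, transactions) >= min_support}
    return frequent


def severity_rules(frequent, min_confidence):
    """Rules of the form antecedent -> severity item, with support and confidence."""
    found = []
    for itemset, sup in frequent.items():
        for item in itemset:
            if item.startswith("severity=") and len(itemset) > 1:
                antecedent = itemset - {item}
                # The antecedent is frequent by downward closure, so it is in the dict.
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, item, sup, confidence))
    return found


frequent_itemsets = apriori(records, min_support=0.3)
for antecedent, consequent, sup, conf in severity_rules(frequent_itemsets, 0.6):
    print(sorted(antecedent), "->", consequent, f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Restricting the consequent to severity items keeps the output focused on accident causes by severity level; on real data the thresholds would need tuning, and a lift filter is a common additional criterion.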

Key words: random forest (RF), Apriori algorithm, missing value, highway, accident cause, data filling, association rules

CLC Number: