China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (7): 192-200. doi: 10.16265/j.cnki.issn1003-3033.2025.07.1025

• Public safety •

Cluster analysis of autonomous driving traffic accidents based on K-means and LCA

QIAO Jianfeng1, WANG Yanan1, LYU Shuran1, WANG Ting1, XIA Xuefeng2

  1 School of Management Engineering, Capital University of Economics and Business, Beijing 100070, China
  2 International Branch, China Petroleum Corporation Logging Limited Company, Beijing 100101, China
  • Received:2025-03-04 Revised:2025-05-09 Online:2025-08-21 Published:2026-01-28

Abstract:

Statistical analysis of individual accident description factors alone is insufficient to reveal the underlying patterns of road traffic accidents involving autonomous vehicles (AV); the comprehensive latent categories that emerge from the interaction of multiple factors must also be identified. Because AV accident data contain both structured information and narrative text, an approach to accident type identification is proposed that combines K-means clustering with latent class analysis (LCA). Specifically, K-means is used to extract key information from the narrative text, and this information is then fed into the LCA model, overcoming the limitation that LCA alone can use only the structured information in existing accident reports. The effectiveness of the combined approach was verified using 437 AV traffic accidents in California, USA. The results show that AV accidents fall mainly into four comprehensive types, and that combining K-means with LCA enables efficient cluster analysis of accident reports that pair structured information with narrative text.
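
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the narrative text is vectorized with TF-IDF (the abstract does not say how the text is represented), uses a small hand-written EM routine as a stand-in for whatever LCA software the study used, and runs on hypothetical toy records. The names `text_clusters` and `fit_lca` and the two structured fields are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical narrative snippets; not taken from the study's data.
narratives = [
    "AV stopped at red light and was rear-ended by a passenger car",
    "Vehicle rear-ended the AV while it was stopped at an intersection",
    "AV in autonomous mode was rear-ended while yielding to a pedestrian",
    "AV changing lanes sideswiped by a truck on the highway",
    "A truck sideswiped the AV during a lane change",
    "Sideswipe contact while the AV was merging into the left lane",
]
# Hypothetical structured fields, integer-coded: (driving mode, weather).
structured = np.array([[0, 0], [0, 1], [0, 0], [1, 0], [1, 1], [1, 0]])


def text_clusters(texts, k, seed=0):
    """Stage 1: K-means on TF-IDF vectors (TF-IDF is an assumption here)."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(tfidf)


def fit_lca(X, n_classes, n_iter=200, seed=0):
    """Stage 2: EM for a latent class model with categorical indicators."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    levels = [int(X[:, j].max()) + 1 for j in range(m)]
    pi = np.full(n_classes, 1.0 / n_classes)  # class prevalences
    # theta[j][c, v] = P(indicator j takes value v | class c)
    theta = [rng.dirichlet(np.ones(L), size=n_classes) for L in levels]
    for _ in range(n_iter):
        # E-step: posterior class membership for each record.
        log_r = np.log(pi + 1e-12)[None, :]
        for j in range(m):
            log_r = log_r + np.log(theta[j][:, X[:, j]]).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update prevalences and item-response probabilities.
        pi = r.mean(axis=0)
        for j in range(m):
            onehot = np.eye(levels[j])[X[:, j]]
            counts = r.T @ onehot + 1e-6  # small smoothing avoids log(0)
            theta[j] = counts / counts.sum(axis=1, keepdims=True)
    return pi, theta, r


# The text cluster label becomes one more categorical indicator for LCA.
text_label = text_clusters(narratives, k=2)
X = np.column_stack([structured, text_label])
pi, theta, resp = fit_lca(X, n_classes=2)
classes = resp.argmax(axis=1)
print("text cluster per record:", text_label)
print("latent class per record:", classes)
```

In this sketch the only link between the two stages is the extra column in `X`, which mirrors the abstract's point that K-means lets the narrative text enter an LCA model that otherwise sees only structured fields. The cluster and class counts here are arbitrary toy values, not the four types reported in the study.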

Key words: K-means, latent class analysis (LCA), autonomous driving, cluster analysis, autonomous vehicles (AV), traffic accidents

CLC Number: