China Safety Science Journal, 2022, Vol. 32, Issue (6): 109-114. doi: 10.16265/j.cnki.issn1003-3033.2022.06.2732

• Safety engineering technology •

Information extraction method for railway equipment accidents based on multi-dimensional character feature representation

ZHANG Pengxiang

  1. Standards & Metrology Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China
  • Received: 2022-01-14 Revised: 2022-04-11 Online: 2022-06-28 Published: 2022-12-28

Abstract:

To address the difficulty of analyzing data in investigation reports of railway equipment accidents, an accident information extraction method based on multi-dimensional character feature representation is proposed. First, in the data preprocessing stage, a subject pattern matching method extracts the subject paragraphs to which named entities belong. Then, for text feature representation, a multi-dimensional feature representation method transforms the text into feature vectors, and a named entity recognition model is trained using a bidirectional long short-term memory (BiLSTM) + conditional random field (CRF) neural network. Finally, accident investigation reports are used for experimental verification. The results show that preprocessing with subject pattern matching improves the comprehensive evaluation index of the multi-dimensional character + BiLSTM+CRF model by 22.86%. Compared with word2vec feature representation, the multi-dimensional representation improves the evaluation index of the BiLSTM+CRF model by 4.89%.

Key words: multi-dimensional character feature, railway equipment accident, information extraction, subject pattern matching, named entity recognition.
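
The subject pattern matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual patterns, so the heading names in `SUBJECT_PATTERNS` and the function `extract_subject_paragraphs` are hypothetical, and the reports are assumed to carry section headings on their own lines. This shows only the general idea of grouping paragraphs by subject before named entity recognition, not the author's implementation.

```python
import re
from typing import Dict, List, Optional

# Hypothetical subject headings; the abstract does not list the real patterns.
SUBJECT_PATTERNS = {
    "overview": re.compile(r"^\s*(?:\d+[.)]\s*)?Accident overview\b", re.IGNORECASE),
    "cause": re.compile(r"^\s*(?:\d+[.)]\s*)?Cause analysis\b", re.IGNORECASE),
    "measures": re.compile(r"^\s*(?:\d+[.)]\s*)?Corrective measures\b", re.IGNORECASE),
}


def match_subject(line: str) -> Optional[str]:
    """Return the subject whose heading pattern matches the line, if any."""
    for name, pattern in SUBJECT_PATTERNS.items():
        if pattern.match(line):
            return name
    return None


def extract_subject_paragraphs(report: str) -> Dict[str, List[str]]:
    """Group non-empty report lines under the most recent subject heading."""
    sections: Dict[str, List[str]] = {name: [] for name in SUBJECT_PATTERNS}
    current: Optional[str] = None
    for line in report.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        subject = match_subject(stripped)
        if subject is not None:
            current = subject  # a heading switches the active subject
            continue
        if current is not None:
            sections[current].append(stripped)  # text before any heading is dropped
    return sections
```

Each list in the returned dictionary could then be passed to the named entity recognition model for the entity types expected in that subject, which narrows the text the model has to label.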