China Safety Science Journal, 2026, Vol. 36, Issue (1): 26-34. doi: 10.16265/j.cnki.issn1003-3033.2026.01.0840

• Safety Science Theories and Methods •

Construction and application of intelligent question-answering model for accident investigation reports based on DeepSeek and RAG

LI Hua1, WU Lizhou1,2, LI Xinhong1,**, ZHANG Yue1, FENG Yao1, QIN Ziyun1

  1. School of Resources Engineering, Xi'an University of Architecture and Technology, Xi'an, Shaanxi 710055, China
  2. College of Safety Science and Engineering, Xi'an University of Science and Technology, Xi'an, Shaanxi 710054, China
  • Received: 2025-09-14 Revised: 2025-11-21 Online: 2026-02-08 Published: 2026-07-28
  • Contact: LI Xinhong

Abstract:

To address the limited corpus resources, restricted input capacity, and data-privacy constraints that hinder the application of large language models (LLMs) in safety engineering, a locally deployed accident question-answering model integrating DeepSeek with a retrieval-augmented generation (RAG) mechanism was constructed. The model provides intelligent parsing and knowledge services for complex texts, thereby supporting safety management decision-making. A semantic-feature corpus was built from accident investigation reports and from laws and regulations released by government emergency management systems, and PaddleOCR, LayoutLMv3, and YOLOv8 were used for document structure reconstruction and semantic modeling. The model comprised four stages (document parsing, semantic alignment, knowledge-base construction, and hybrid retrieval) and was designed to support causal-chain extraction, regulation matching, and semantic mapping. The results indicated that, compared with the DeepSeek-r1:32b model without the RAG mechanism, the enhanced model improved by 7.7% in automated scoring and by 17.6% in human evaluation, and it also scored higher than the baseline model on the response-speed and stability metrics. Model performance was still limited by the parameter scale of the local deployment and by the knowledge-updating mechanism, but the experimental findings show that the model fulfills the functions intended in this study.
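The hybrid retrieval stage named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the retrievers, the fusion rule, or the prompt format. The sketch assumes a weighted sum of a lexical keyword-overlap score and a cosine similarity over term counts (a stand-in for a dense-embedding similarity); the function names, the weight `alpha`, and the sample passages are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, passage):
    """Lexical signal: fraction of distinct query terms found in the passage."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(passage))) / len(q_terms)


def cosine_score(query, passage):
    """Assumed stand-in for dense similarity: cosine over term-count vectors."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def hybrid_retrieve(query, passages, alpha=0.5, top_k=2):
    """Rank passages by a weighted sum of the two scores (alpha is hypothetical)."""
    scored = [
        (alpha * keyword_score(query, p) + (1 - alpha) * cosine_score(query, p), p)
        for p in passages
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


def build_prompt(query, hits):
    """Assemble retrieved passages into a context block for the language model."""
    context = "\n".join(f"[{i + 1}] {p}" for i, (_, p) in enumerate(hits))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


# Hypothetical knowledge-base passages, invented for illustration only.
KNOWLEDGE_BASE = [
    "The investigation found that the gas leak was caused by a corroded valve.",
    "Regulations require employers to inspect pressure vessels at fixed intervals.",
    "The canteen menu changes every week.",
]

if __name__ == "__main__":
    question = "What caused the gas leak?"
    print(build_prompt(question, hybrid_retrieve(question, KNOWLEDGE_BASE)))
```

In a RAG setup of this kind, the assembled prompt would then be passed to the locally deployed model, so that the answer is grounded in retrieved report and regulation text rather than in the model's parameters alone.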

Key words: DeepSeek, large language model (LLM), retrieval augmented generation, accident investigation report, knowledge base

CLC Number: