China Safety Science Journal ›› 2024, Vol. 34 ›› Issue (7): 123-131. doi: 10.16265/j.cnki.issn1003-3033.2024.07.0228

• Safety engineering technology •

Integrated avionics system safety optimization method based on deep reinforcement learning

ZHAO Changxiao1,2, LI Daojun1, SUN Yixuan1, JING Peng1, TIAN Yi1,2,**

  1 School of Safety Engineering and Science, Civil Aviation University of China, Tianjin 300300, China
  2 Key Laboratory of Civil Aviation Airworthiness Certification Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received:2024-01-18 Revised:2024-04-21 Online:2024-07-28 Published:2025-01-28
  • Contact: TIAN Yi

Abstract:

Traditional safety design methods based on manual inspection struggle to cope with the explosion of candidate hosting schemes caused by the large-scale integration of avionics systems. To address this problem, an avionics system partition model, a task model and a safety criticality level quantification model were constructed, and the integrated design optimization considering safety was formulated as a Markov decision process (MDP). An optimization method using the Soft Actor-Critic (SAC) algorithm, built on the Actor-Critic framework, was proposed. To characterize how parameter selection affects the training results of the SAC algorithm, a sensitivity analysis of the algorithm parameters was carried out. To verify the advantage of the SAC-based method in optimizing the integrated design considering safety, comparison experiments were conducted against the Deep Deterministic Policy Gradient (DDPG) algorithm and the traditional allocation algorithm. The results show that, under the optimal parameter combination, the maximum reward of the SAC algorithm after convergence increases by nearly 8% compared with other parameter combinations, and the convergence time is shortened by nearly 16.6%. Compared with the DDPG algorithm and the traditional allocation algorithm, the SAC-based optimization method improves by approximately 62%, 7464%, 8370%, 2123% and 775% in terms of maximum reward, cumulative constraint violation rate, partition balance risk effect, partition resource utilization and solution time.
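
As a rough illustration of what casting task-to-partition allocation as an MDP can look like, the following minimal Python sketch defines a toy environment with a reset/step interface that an agent such as SAC or DDPG could interact with. It is not the model from the paper: the names `Task`, `AllocationMDP` and `rollout`, the state layout, and the reward weights (+1 per feasible placement, -10 per capacity violation, a terminal penalty equal to the criticality imbalance between partitions) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Task:
    load: float       # hypothetical: share of a partition's capacity the task needs
    criticality: int  # hypothetical: safety criticality weight (higher = more critical)


@dataclass
class AllocationMDP:
    """Toy MDP: at each step, place the next task into one partition."""
    tasks: List[Task]
    capacities: List[float]
    step_index: int = 0
    loads: List[float] = field(default_factory=list)
    risk: List[float] = field(default_factory=list)

    def reset(self) -> Tuple[int, Tuple[float, ...]]:
        self.step_index = 0
        self.loads = [0.0] * len(self.capacities)
        self.risk = [0.0] * len(self.capacities)
        return self._state()

    def _state(self) -> Tuple[int, Tuple[float, ...]]:
        # State = index of the next task plus the current load of each partition.
        return (self.step_index, tuple(self.loads))

    def step(self, action: int):
        """Action = index of the partition that receives the current task."""
        task = self.tasks[self.step_index]
        self.loads[action] += task.load
        self.risk[action] += task.criticality
        violated = self.loads[action] > self.capacities[action]
        reward = -10.0 if violated else 1.0  # assumed reward weights
        self.step_index += 1
        done = self.step_index == len(self.tasks)
        if done:
            # Assumed terminal penalty: imbalance of accumulated criticality.
            reward -= max(self.risk) - min(self.risk)
        return self._state(), reward, done


def rollout(env: AllocationMDP, actions: Sequence[int]) -> float:
    """Apply a fixed action sequence and return the total reward."""
    env.reset()
    total = 0.0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    demo_tasks = [Task(0.5, 3), Task(0.4, 1), Task(0.3, 2)]
    env = AllocationMDP(tasks=demo_tasks, capacities=[1.0, 1.0])
    print("balanced allocation:", rollout(env, [0, 1, 1]))
    print("everything in partition 0:", rollout(env, [0, 0, 0]))
```

In this toy setting, a balanced allocation earns a higher total reward than packing every task into one partition, which is the kind of signal a learning agent would exploit. A real formulation would need the paper's own partition, task and criticality models in place of these placeholders.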

Key words: deep reinforcement learning, integrated modular avionics, safety, Markov decision process (MDP), integrated design

CLC Number: