China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (3): 204-211. doi: 10.16265/j.cnki.issn1003-3033.2025.03.0223

• Public Safety •

Gas knowledge bidirectional encoder representations from transformers model based on knowledge injection

LIU Xiaoyu, ZHUANG Yufeng**, ZHAO Xinghao, WANG Kefan, ZHANG Guokai

  1. School of Intelligent Engineering and Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2024-10-14 Revised: 2024-12-18 Published: 2025-03-28
  • Corresponding author:
    ** ZHUANG Yufeng (1972—), male, a native of Shanghai, Ph.D., professor, mainly engaged in research on intelligent control and equipment safety. E-mail:
  • About the authors:
    LIU Xiaoyu (2000—), female, a native of Tianjin, master's student, whose main research interests are text mining and logistics information technology and their engineering applications. E-mail:
  • Supported by:
    National Natural Science Foundation of China (52478123)

Abstract:

In order to enhance emergency management in the field of gas pipeline networks, a gas knowledge bidirectional encoder representations from transformers (Gas-kBERT) model was proposed. The model combined gas pipeline network domain data augmented with the Chat Generative Pre-trained Transformer (ChatGPT) with the Chinese Gas Language Understanding subject-predicate-object triplet dataset (CGLU-Spo) and related corpora constructed for this field. By altering the model's masking (MASK) mechanism, domain knowledge was successfully injected into the model. Considering the professionalism and specificity of the gas pipeline network field, Gas-kBERT was pre-trained on corpora of different scales and contents, and fine-tuned on named entity recognition and text classification tasks within this field. Experimental results showed that, compared with the general bidirectional encoder representations from transformers (BERT) model, Gas-kBERT achieved significant F1-score improvements on text mining tasks in the gas pipeline network field: the F1-score increased by 29.55% on named entity recognition, and by up to 83.33% on text classification. These results demonstrate that the Gas-kBERT model performs exceptionally well on text mining tasks in the gas pipeline network field.
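The abstract gives no implementation details of the modified MASK mechanism. As a minimal, purely illustrative sketch of one common way such knowledge injection is realized (assuming entity-level masking driven by domain terms harvested from the CGLU-Spo triplets; the entity list, function names, and probabilities below are hypothetical and are not the authors' code):

    import random

    # Hypothetical domain terms, e.g. entities harvested from the
    # CGLU-Spo subject-predicate-object triplets (illustrative values only).
    DOMAIN_ENTITIES = {"燃气管网", "调压器", "泄漏报警"}

    MASK = "[MASK]"

    def entity_level_mask(tokens, base_prob=0.15, entity_boost=2.0, rng=random):
        """Mask pre-segmented tokens for masked-language-model pre-training.

        Unlike vanilla BERT, which masks sub-tokens independently, a domain
        entity here is (a) kept as a single maskable unit and (b) masked with
        a higher probability, so the model must predict whole gas-domain terms
        from context -- one plausible reading of the changed MASK mechanism.
        """
        masked_tokens, labels = [], []
        for tok in tokens:
            prob = base_prob * entity_boost if tok in DOMAIN_ENTITIES else base_prob
            if rng.random() < prob:
                masked_tokens.append(MASK)
                labels.append(tok)        # prediction target for the MLM loss
            else:
                masked_tokens.append(tok)
                labels.append(None)       # ignored by the MLM loss
        return masked_tokens, labels

    if __name__ == "__main__":
        random.seed(7)
        sample = ["巡检", "发现", "燃气管网", "存在", "泄漏"]
        print(entity_level_mask(sample, base_prob=0.3))

Under this scheme a gas-domain term is never partially masked, so recovering it requires domain knowledge rather than reassembly from unmasked fragments.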

Key words: gas pipeline networks, gas knowledge bidirectional encoder representations from transformers (Gas-kBERT) model, natural language processing (NLP), knowledge injection, bidirectional encoder representations from transformers (BERT) model

CLC number: