China Safety Science Journal, 2024, Vol. 34, Issue (12): 213-222. doi: 10.16265/j.cnki.issn1003-3033.2024.12.0308

• Emergency Technology and Management •

Knowledge-prompted few-shot relation extraction for emergency plan texts

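ZHANG Kai1, CHEN Qiang1,**, NI Kai2, ZHANG Yujin1

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  2. Science and Technology Research and Development Office, Shanghai Institute of Work Safety Science, Shanghai 201620, China
  • Received: 2024-07-15 Revised: 2024-09-22 Published: 2024-12-28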
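  • Corresponding author:
    **CHEN Qiang (b. 1965), male, from Jingmen, Hubei; Ph.D. (postdoctoral), professor. His research covers software engineering, earth exploration and information technology, and machine learning. E-mail:
  • About the authors:

    ZHANG Kai (b. 1997), male, from Zhengzhou, Henan; master's student. His research interests are natural language processing and knowledge graph construction for the emergency domain. E-mail:

    NI Kai, professor-level senior engineer.

    ZHANG Yujin, associate professor.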

  • Funding:
    Major Project of the Ministry of Science and Technology (2020AAA0109302)

Knowledge-prompted few-shot relation extraction for emergency plan texts

ZHANG Kai1, CHEN Qiang1,**, NI Kai2, ZHANG Yujin1

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  2. Science and Technology Research and Development Office, Shanghai Institute of Work Safety Science, Shanghai 201620, China
  • Received: 2024-07-15 Revised: 2024-09-22 Published: 2024-12-28

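Abstract:

To extract relations accurately and quickly from few-shot emergency plan texts, a K-nearest neighbor relation extraction model based on knowledge prompts (KMKP) is proposed. First, a prompt template is built from learnable entity type markers that incorporate relation semantics, strengthening the prompt's guidance of the pre-trained language model (PLM). Second, a boundary loss function is used to optimize training so that the PLM learns dependency relations specific to the emergency domain, which places structured constraints on its prediction of the [MASK] token. Third, a gradient-free emergency knowledge store is created from the training data and combined with the K-nearest neighbor (KNN) algorithm to form a knowledge query mechanism, which captures feature connections between training and prediction data and corrects the PLM's predictions without gradient updates. Finally, experiments are carried out on four public datasets under few-shot settings (1-, 8-, and 16-shot). The results show that, compared with the best baseline model, KnowPrompt, KMKP raises the F1 score by an average of 2.1%, 2.8%, and 1.9%, respectively. In a few-shot (16-shot) test on emergency plan instances, KMKP reaches a relation extraction accuracy of 91.02% and effectively mitigates catastrophic forgetting and overfitting in few-shot scenarios.

Keywords: knowledge prompt, few-shot, emergency plan, relation extraction, data augmentation, K-nearest neighbor (KNN) relation extraction model (KMKP)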

Abstract:

To extract relations accurately and quickly from few-shot emergency plan texts, a K-nearest neighbor (KNN) relation extraction model based on knowledge prompts (KMKP) is proposed. First, a prompt template is constructed from learnable typed entity markers that incorporate relation semantics, which strengthens the guidance the input gives to the pre-trained language model (PLM). Second, a boundary loss function is used to optimize training, so that the PLM learns dependency relationships specific to the emergency domain and its [MASK] predictions are subject to structured constraints. Third, a gradient-free emergency knowledge store is created from the training data, and a knowledge retrieval mechanism is built on it with the KNN algorithm. This mechanism captures the feature connections between training and prediction data and corrects the PLM's predictions in a gradient-free manner. Finally, experiments are carried out on four public datasets under few-shot settings (1-, 8-, and 16-shot). The results show that, compared with the state-of-the-art model KnowPrompt, KMKP raises the F1 score by an average of 2.1%, 2.8%, and 1.9%, respectively. In a 16-shot test on emergency plan instances, KMKP reaches a relation extraction accuracy of 91.02%, and it effectively mitigates catastrophic forgetting and overfitting in few-shot scenarios.

Key words: knowledge-prompted, few-shot, emergency plan, relation extraction, data augmentation, K-nearest neighbor (KNN) relation extraction model based on knowledge prompts (KMKP)
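
The abstract describes the retrieval step only at a high level, so the following is a minimal sketch of one common way such a gradient-free correction can work, not the authors' implementation. It assumes that the knowledge store holds one feature vector and one relation label per training example, that neighbors are ranked by squared Euclidean distance, and that the KNN distribution is mixed with the PLM distribution through a fixed weight `lam`. The function names, the distance metric, the temperature, and the value of `lam` are all illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def knn_distribution(query, keys, labels, num_labels, k=2, temperature=1.0):
    """Turn the k nearest stored examples into a distribution over relation labels.

    keys/labels play the role of the knowledge store built from training data
    (assumed layout: one feature vector and one integer label per example).
    """
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, key)), label)
        for key, label in zip(keys, labels)
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Closer neighbors get larger weights; no gradients are involved.
    weights = softmax([-dist / temperature for dist, _ in nearest])
    probs = [0.0] * num_labels
    for weight, (_, label) in zip(weights, nearest):
        probs[label] += weight
    return probs


def correct_prediction(plm_probs, knn_probs, lam=0.5):
    """Interpolate the PLM distribution with the KNN distribution (assumed scheme)."""
    return [lam * kp + (1.0 - lam) * pp for pp, kp in zip(plm_probs, knn_probs)]


# Toy knowledge store: two relation labels, two stored examples each.
keys = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = [0, 0, 1, 1]

query = [4.8, 5.2]        # hypothetical [MASK] feature vector at test time
plm_probs = [0.55, 0.45]  # hypothetical PLM output that slightly prefers label 0

knn_probs = knn_distribution(query, keys, labels, num_labels=2, k=2)
final = correct_prediction(plm_probs, knn_probs, lam=0.5)
predicted = max(range(len(final)), key=final.__getitem__)
print(predicted)
```

In this toy case the PLM alone would pick label 0, while both nearest stored examples carry label 1, so the interpolated prediction switches to label 1. In practice the feature vectors would come from the PLM's hidden state at the [MASK] position, and `k`, `temperature`, and `lam` would need tuning on held-out data.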

CLC number: