Funding
National Key Research and Development Program (2023YFC3341200)
Application of Chinese large language models to the automatic labelling of subject knowledge graphs: Taking Morality and Law and Mathematics as examples
Received date: 2024-07-05
Accepted date: 2024-07-05
Online published: 2024-09-23
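KOU Sijia, YAN Fengyun, MA Jing. Application of Chinese large language models to the automatic labelling of subject knowledge graphs: Taking Morality and Law and Mathematics as examples[J]. Journal of East China Normal University (Natural Science), 2024, 2024(5): 81-92. DOI: 10.3969/j.issn.1000-5641.2024.05.008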
With the rapid development of artificial intelligence, large language models (LLMs) have demonstrated strong capabilities in natural language processing and a range of knowledge applications. This study examines the application of Chinese LLMs to the automatic labelling of subject knowledge graphs for primary and secondary schools, taking Morality and Law in compulsory education and high school Mathematics as examples. In education, knowledge graphs are important for organizing and systematizing subject knowledge. However, traditional construction methods suffer from inefficient data labelling and high manual labor costs. This study aims to address these problems with LLMs, thereby raising the level of automation and intelligence in knowledge graph construction. First, the research background and significance are discussed. Second, the current state of Chinese LLMs and of automatic labelling techniques for subject knowledge graphs is reviewed. In the methods and model section, an automatic labelling method based on Chinese LLMs is explored, with the aim of improving their application to subject knowledge graphs; a manual labelling approach for subject knowledge graphs is also described as a baseline for evaluating the practical effect of the automatic method. In the experiment and analysis section, automatic labelling experiments on Morality and Law and Mathematics show that automatic labelling achieved high accuracy and efficiency in both subjects, and comparison with the manual labelling results supports the effectiveness and accuracy of the proposed method. Finally, future research directions are discussed. Overall, this study offers a new approach to the automatic labelling of subject knowledge graphs, which is expected to promote further development in related fields.