Journal of East China Normal University (Natural Science) ›› 2019, Vol. 2019 ›› Issue (5): 53-65. doi: 10.3969/j.issn.1000-5641.2019.05.004

• Data-Driven Computational Education •

Research on knowledge point relationship extraction for elementary mathematics

YANG Dong-ming, YANG Da-wei, GU Hang, HONG Dao-cheng, GAO Ming, WANG Ye

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2019-07-29 Online: 2019-09-25 Published: 2019-10-11
  • Corresponding author: WANG Ye, male, research fellow; research interests include Web data management, massive data mining, and distributed systems. E-mail: ywang@dase.ecnu.edu.cn
  • About the author: YANG Dong-ming, male, master's student; research interests include big data systems for new hardware. E-mail: y1271752959m2@yahoo.com
  • Supported by:
    National Key Research and Development Program of China (2016YFB1000905); National Natural Science Foundation of China (U1811264, 61672234, 61502236, 61877018, 61977025); Shanghai Science and Technology for Agriculture Promotion Project (T20170303)

Abstract: With the development of Internet technology, online education has changed the way students learn. However, for lack of a complete knowledge system, online education suffers from a low degree of intelligence and the problem of "information disorientation". Building a knowledge system has therefore become a core technology of online education platforms. Extracting relationships between knowledge points is one of the main tasks of knowledge system construction, and the more efficient relationship extraction algorithms at present are mainly supervised. However, such methods are limited by low text quality, a scarce corpus, labeled data that is hard to obtain, inefficient feature engineering, and difficulty in extracting directional relationships. This paper therefore studies relationship extraction between knowledge points based on an encyclopedic corpus and the idea of distant supervision. An attention mechanism based on relation representation is proposed, which can extract directional relationship information between knowledge points. Combining the advantages of GCN and LSTM, GCLSTM is proposed, which better extracts multipoint information in sentences. Based on the Transformer architecture and the relation-representation-based attention mechanism, the BTRE model, suited to directional relationship extraction, is proposed, which reduces model complexity. A knowledge point relationship extraction system is also designed and implemented. Three sets of comparative experiments verify the performance and efficiency of the models.

Key words: knowledge system construction, relation extraction, attention mechanism, distant supervision, Transformer
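
The distant supervision idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the seed pairs, the relation labels (`contains`, `prerequisite_of`), the function name `distant_label`, and the toy sentences are all hypothetical, chosen only to show how a sentence that mentions both knowledge points of a known pair inherits that pair's directed relation as a noisy training label.

```python
from typing import Dict, List, Tuple

# Hypothetical seed knowledge base: (head, tail) -> directed relation label.
# The order of the pair encodes the direction of the relation.
SEED_RELATIONS: Dict[Tuple[str, str], str] = {
    ("triangle", "right triangle"): "contains",
    ("addition", "multiplication"): "prerequisite_of",
}


def distant_label(sentences: List[str]) -> List[Tuple[str, str, str, str]]:
    """Return (sentence, head, tail, relation) for every sentence that
    mentions both knowledge points of a seed pair (simple substring match)."""
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for (head, tail), relation in SEED_RELATIONS.items():
            if head in text and tail in text:
                labeled.append((sentence, head, tail, relation))
    return labeled


# Toy corpus standing in for encyclopedia sentences (assumed data).
corpus = [
    "A right triangle is a triangle with one 90-degree angle.",
    "Multiplication can be introduced as repeated addition.",
    "A circle has a constant radius.",
]
examples = distant_label(corpus)
for sentence, head, tail, relation in examples:
    print(f"{head} -[{relation}]-> {tail}: {sentence}")
```

Labels produced this way are noisy, since co-occurrence does not guarantee that the sentence actually expresses the relation; this is the kind of noise that an attention mechanism over sentences is typically used to down-weight.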
