Journal of East China Normal University (Natural Science) ›› 2022, Vol. 2022 ›› Issue (6): 68-78. doi: 10.3969/j.issn.1000-5641.2022.06.008

• Computer Science •


Research on a knowledge tracking model based on the stacked gated recurrent unit residual network

Caidie HUANG1, Xinping WANG1, Liangyu CHEN1,*, Yong LIU2

  1. Software Engineering Institute, East China Normal University, Shanghai 200062, China
    2. Basic Education and Lifelong Education Development Department, East China Normal University, Shanghai 200062, China
  • Received: 2021-08-10 Online: 2022-11-25 Published: 2022-11-22
  • Contact: Liangyu CHEN E-mail: lychen@sei.ecnu.edu.cn


Abstract:

Knowledge tracking aims to follow changes in a student’s knowledge level based on the student’s historical question-answering records and other auxiliary information, and to predict whether the student will answer the next question correctly. Because existing neural network knowledge tracking models still leave room for improvement in both accuracy and efficiency, this paper proposes a deep residual network built on stacked gated recurrent units (GRUs), named the stacked-gated recurrent unit-residual (S-GRU-R) network. To address the over-fitting caused by the large number of parameters in a long short-term memory (LSTM) network, the model uses GRUs instead of LSTM to learn from the question sequence; stacking GRU layers expands the sequence-learning capacity, and residual connections reduce the difficulty of training the model. S-GRU-R was evaluated on the Statics2011 dataset, with AUC (area under the curve) and F1-score as evaluation metrics. The results show that S-GRU-R surpasses other comparable recurrent neural network models on both metrics.
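The recurrent trunk described above can be sketched as follows. This is a minimal NumPy illustration of a stacked GRU with a residual connection around each layer, not the paper's implementation; all class and variable names are hypothetical, and the readout layer that would map the final state to a probability of a correct answer is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """One GRU cell (Cho et al. formulation); fewer parameters than an LSTM cell."""
    def __init__(self, input_size, hidden_size, rng):
        s = 1.0 / np.sqrt(hidden_size)
        init = lambda *shape: rng.uniform(-s, s, shape)
        self.Wz, self.Uz = init(hidden_size, input_size), init(hidden_size, hidden_size)
        self.Wr, self.Ur = init(hidden_size, input_size), init(hidden_size, hidden_size)
        self.Wn, self.Un = init(hidden_size, input_size), init(hidden_size, hidden_size)

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h)        # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h)        # reset gate
        n = np.tanh(self.Wn @ x + self.Un @ (r * h))  # candidate state
        return (1.0 - z) * h + z * n                  # new hidden state

class StackedGRUResidual:
    """Stack of GRU layers; each layer's output is its GRU state plus its
    own input (skip connection), which eases training of deeper stacks."""
    def __init__(self, size, num_layers, seed=0):
        rng = np.random.default_rng(seed)
        self.size = size
        self.cells = [GRUCell(size, size, rng) for _ in range(num_layers)]

    def forward(self, seq):
        # seq: iterable of input vectors of length `size`
        # (e.g. embedded question/answer interaction records)
        states = [np.zeros(self.size) for _ in self.cells]
        out = None
        for x in seq:
            layer_in = x
            for i, cell in enumerate(self.cells):
                states[i] = cell.step(layer_in, states[i])
                layer_in = states[i] + layer_in  # residual connection
            out = layer_in
        return out  # top-layer output at the last time step
```

In a knowledge tracking setting, a sigmoid readout on `out` would give the predicted probability that the student answers the next question correctly; only the recurrent trunk is shown here.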

Key words: deep learning, knowledge tracking, recurrent neural network, gated recurrent unit, residual network

CLC number: