Computational Intelligence in Emerging Applications

Transfer learning based QA model of FAQ using CQA data

  • SHAO Ming-rui,
  • MA Deng-hao,
  • CHEN Yue-guo,
  • QIN Xiong-pai,
  • DU Xiao-yong
  • School of Information, Renmin University of China, Beijing 100872, China

SHAO Ming-rui, male, master's student; his research interests include natural language processing and semantic search. E-mail: dhucstsmr@163.com.

Received date: 2019-07-27

Online published: 2019-10-11

Funding

Joint Fund of the National Natural Science Foundation of China and the Guangdong Big Data Science Center (U1711261); National Natural Science Foundation of China (61432006)

Abstract

Building an intelligent customer service system based on FAQ (frequently asked questions) question answering is a technique commonly used in industry. Question answering systems built on an FAQ return stable, reliable, high-quality answers; however, constrained by the scale of the manually annotated knowledge base, their recognition ability is limited and easily hits a bottleneck. To address the limited scale of FAQ datasets, this paper offers solutions at both the data level and the model level. At the data level, we crawl relevant data from Baidu Knows and mine semantically equivalent questions, ensuring the relevance and consistency of the data. At the model level, we propose transAT, a deep neural network oriented toward transfer learning that combines the strong feature-extraction ability of the Transformer with an attention mechanism, and is well suited to computing semantic similarity between sentence pairs. Experiments show that the proposed approach significantly improves the model's performance on the FAQ question answering task and, to a certain extent, resolves the problem of the limited scale of FAQ datasets.
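The model-level idea in the abstract, a Transformer encoder combined with an attention mechanism that scores semantic similarity between sentence pairs, pretrained on mined CQA pairs and then fine-tuned on the small in-domain FAQ set, can be sketched in a few lines. The PyTorch sketch below is a hypothetical reconstruction for illustration only: the attention-pooling scheme, the [u; v; |u-v|; u*v] pair features, and all names and dimensions are assumptions, not the authors' transAT implementation.

# Minimal PyTorch sketch of a Transformer-plus-attention sentence-pair
# scorer in the spirit of the transAT model described above. All names,
# dimensions, and the pooling/feature scheme are illustrative assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn

class PairSimilarityModel(nn.Module):
    def __init__(self, vocab_size=30000, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.pool_query = nn.Parameter(torch.randn(d_model))  # attention-pooling query
        self.scorer = nn.Linear(4 * d_model, 1)  # scores [u; v; |u-v|; u*v]

    def encode(self, ids):
        pad = ids.eq(0)  # True at padding positions
        h = self.encoder(self.embed(ids), src_key_padding_mask=pad)
        att = (h @ self.pool_query).masked_fill(pad, float("-inf"))
        w = att.softmax(dim=-1).unsqueeze(-1)
        return (w * h).sum(dim=1)  # attention-pooled sentence vector

    def forward(self, q1, q2):
        u, v = self.encode(q1), self.encode(q2)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
        return self.scorer(feats).squeeze(-1)  # similarity logit

# Transfer recipe from the abstract: pretrain on large-scale question pairs
# mined from Baidu Knows (CQA), then fine-tune the same weights on the small
# in-domain FAQ dataset.
model = PairSimilarityModel()
q1 = torch.randint(1, 30000, (8, 20))  # batch of tokenized user questions
q2 = torch.randint(1, 30000, (8, 20))  # batch of candidate FAQ questions
loss = nn.BCEWithLogitsLoss()(model(q1, q2), torch.ones(8))
loss.backward()

The concatenated [u; v; |u-v|; u*v] feature is a common convention for sentence-pair matching; the paper itself may combine the two sentences differently, for example with cross-attention between token states.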

Cite this article

SHAO Ming-rui, MA Deng-hao, CHEN Yue-guo, QIN Xiong-pai, DU Xiao-yong. Transfer learning based QA model of FAQ using CQA data[J]. Journal of East China Normal University (Natural Science), 2019, 2019(5): 74-84. DOI: 10.3969/j.issn.1000-5641.2019.05.006
