Big Data Analysis

The auto-question answering system based on convolution neural network

  • JING Li-jiao,
  • FU Yun-bin,
  • DONG Qi-wen
  • School of Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received date: 2017-06-23

Online published: 2017-09-25


Question answering is an active research area in natural language processing. A question answering system returns concise, precise answers to questions posed in natural language, and so provides users with a more accurate information service. Two key problems must be solved in such a system: one is the semantic representation of natural language questions and answers, and the other is learning the semantic matching between a question and an answer. The convolutional neural network is a classic deep network structure that in recent years has shown a strong ability to express semantics in natural language processing, and it is widely used in automatic question answering. This paper reviews techniques for question answering systems based on convolutional neural networks. It focuses on knowledge-based and text-oriented Q&A techniques from the two main perspectives of semantic representation and semantic matching, and it points out the current research difficulties.
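
The two problems named in the abstract, semantic representation and semantic matching, can be illustrated with a minimal sketch. It is not a method taken from the paper: the vocabulary, the dimensions, the randomly initialised (untrained) weights, and the helper names `build_vocab`, `encode`, and `cosine` are all assumptions made for illustration. A convolution over word windows followed by max-over-time pooling gives a fixed-length sentence vector, and cosine similarity stands in for the matching function.

```python
import numpy as np

EMBED_DIM = 8      # word-vector size (illustrative assumption)
NUM_FILTERS = 16   # number of convolution filters (illustrative assumption)
WINDOW = 3         # number of tokens covered by each filter


def build_vocab(sentences):
    """Map every lower-cased, whitespace-separated token to an integer id."""
    tokens = sorted({t for s in sentences for t in s.lower().split()})
    return {t: i for i, t in enumerate(tokens)}


def encode(sentence, vocab, emb, filters, bias):
    """Semantic representation: convolution over word windows + max pooling."""
    ids = [vocab[t] for t in sentence.lower().split()]
    x = emb[ids]                                   # (n_tokens, EMBED_DIM)
    # Zero-pad both ends so that even a one-word sentence yields windows.
    pad = np.zeros((WINDOW - 1, emb.shape[1]))
    x = np.vstack([pad, x, pad])
    # Each row is one window of WINDOW consecutive word vectors, flattened.
    windows = np.stack(
        [x[i:i + WINDOW].ravel() for i in range(len(x) - WINDOW + 1)]
    )
    feature_maps = np.tanh(windows @ filters + bias)  # (positions, NUM_FILTERS)
    return feature_maps.max(axis=0)                   # max-over-time pooling


def cosine(u, v):
    """Semantic matching: cosine similarity between two sentence vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))


# Toy data and untrained parameters; a real system would learn these.
question = "who wrote hamlet"
candidates = [
    "hamlet was written by shakespeare",
    "paris is the capital of france",
]
rng = np.random.default_rng(0)
vocab = build_vocab([question] + candidates)
emb = rng.normal(size=(len(vocab), EMBED_DIM))
filters = rng.normal(size=(WINDOW * EMBED_DIM, NUM_FILTERS))
bias = np.zeros(NUM_FILTERS)

q_vec = encode(question, vocab, emb, filters, bias)
scores = [cosine(q_vec, encode(c, vocab, emb, filters, bias)) for c in candidates]
for cand, score in zip(candidates, scores):
    print(f"{score:+.3f}  {cand}")
```

Because the weights are random, the printed scores carry no meaning; the sketch only shows where a learned encoder and a learned matching function would sit in such a pipeline.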

Cite this article

JING Li-jiao, FU Yun-bin, DONG Qi-wen. The auto-question answering system based on convolution neural network[J]. Journal of East China Normal University (Natural Science), 2017, 2017(5): 66-79. DOI: 10.3969/j.issn.1000-5641.2017.05.007


