华东师范大学学报(自然科学版) ›› 2019, Vol. 2019 ›› Issue (5): 16-35. DOI: 10.3969/j.issn.1000-5641.2019.05.002

A Survey on Coreference Resolution (共指消解技术综述)

CHEN Yuan-zhe, KUANG Jun, LIU Ting-ting, GAO Ming, ZHOU Ao-ying (陈远哲, 匡俊, 刘婷婷, 高明, 周傲英)
Received: 2019-07-29
Online: 2019-09-25
Published: 2019-10-11

Corresponding author: GAO Ming (高明), male, professor and doctoral supervisor; his research interests include educational computing, knowledge graphs, knowledge engineering, user profiling, social network mining, and uncertain data management. E-mail: mgao@dase.ecnu.edu.cn

About the author: CHEN Yuan-zhe (陈远哲), male, master's degree candidate; his research interests include natural language processing and knowledge graphs. E-mail: yzchen@stu.ecnu.edu.com
Abstract: Coreference resolution aims to identify different expressions that refer to the same entity, and it has wide applications in text summarization, machine translation, question answering, knowledge graphs, and other fields. However, as a classic problem in natural language processing, it is NP-hard. This survey first introduces the basic concepts of coreference resolution, clarifies easily confused notions, and discusses the significance and difficulties of the task. It then reviews the development of the field, dividing coreference resolution techniques into several stages, presenting the representative models of each stage, and analyzing their respective strengths and weaknesses, with emphasis on rule-based, machine learning, global optimization, knowledge-base, and deep learning approaches. Next, it introduces the evaluation campaigns for coreference resolution and explains and compares the commonly used corpora and evaluation metrics. Finally, it points out the problems that current coreference resolution models have yet to solve and discusses future research directions.
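The task the abstract describes, linking different mentions of the same entity, can be made concrete with a small mention-pair sketch of the kind several of the surveyed machine-learning models build on: each mention is scored against the mentions that precede it and linked to the best-scoring compatible antecedent. The sketch below is a minimal illustration in Python, not a model from the survey; the Mention fields, the hand-set feature weights, and the example document are our own assumptions.

```python
# Minimal mention-pair coreference sketch (illustrative only): score each
# mention against every earlier mention and keep the best antecedent, if any.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    text: str    # surface form, e.g. "Marie Curie", "she"
    index: int   # position of the mention in the document
    gender: str  # "f", "m", or "n" (neuter/unknown)
    number: str  # "sg" or "pl"


PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def pair_score(antecedent: Mention, anaphor: Mention) -> float:
    """Score a candidate (antecedent, anaphor) pair with toy features."""
    score = 0.0
    if antecedent.text.lower() == anaphor.text.lower():
        score += 2.0                      # exact string match
    if anaphor.text.lower() in PRONOUNS:
        score += 0.5                      # pronouns usually need an antecedent
    if antecedent.gender != anaphor.gender or antecedent.number != anaphor.number:
        score -= 5.0                      # agreement mismatch: strong penalty
    score -= 0.1 * (anaphor.index - antecedent.index)  # prefer nearby antecedents
    return score


def resolve(mentions: List[Mention], threshold: float = 0.0) -> List[Optional[int]]:
    """Return, for each mention, the index of its chosen antecedent (or None)."""
    antecedents: List[Optional[int]] = []
    for j, anaphor in enumerate(mentions):
        best, best_score = None, threshold
        for i in range(j):
            s = pair_score(mentions[i], anaphor)
            if s > best_score:
                best, best_score = i, s
        antecedents.append(best)
    return antecedents


if __name__ == "__main__":
    # "Marie Curie won the Nobel Prize. She shared it with Pierre Curie."
    doc = [
        Mention("Marie Curie", 0, "f", "sg"),
        Mention("the Nobel Prize", 1, "n", "sg"),
        Mention("She", 2, "f", "sg"),
        Mention("it", 3, "n", "sg"),
        Mention("Pierre Curie", 4, "m", "sg"),
    ]
    print(resolve(doc))  # -> [None, None, 0, 1, None]
```

Running the script links "She" to "Marie Curie" and "it" to "the Nobel Prize", while "Pierre Curie" starts a new entity. The models discussed in the survey replace the hand-set weights above with learned classifiers, global optimization over all pairs, external knowledge, or neural span representations.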
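The abstract also refers to the commonly used evaluation metrics for coreference resolution. One standard member of that family, B-cubed (B³), averages a per-mention precision and recall computed from the overlap between the system cluster and the gold cluster containing each mention. The sketch below is a small worked example under our own assumptions: the toy clusterings, the function name b_cubed, and the treatment of mentions missing from the system output (as singletons) are simplifications, not the official scorer.

```python
# Toy B-cubed (B^3) scorer: per-mention precision/recall from cluster overlap.
from typing import Dict, Hashable, List, Tuple


def b_cubed(gold: List[set], system: List[set]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of B^3 over the mentions in the gold clusters."""
    gold_of: Dict[Hashable, set] = {m: c for c in gold for m in c}
    sys_of: Dict[Hashable, set] = {m: c for c in system for m in c}

    mentions = list(gold_of)
    precision = recall = 0.0
    for m in mentions:
        sys_cluster = sys_of.get(m, {m})          # missing mention -> singleton
        overlap = len(gold_of[m] & sys_cluster)
        precision += overlap / len(sys_cluster)
        recall += overlap / len(gold_of[m])
    precision /= len(mentions)
    recall /= len(mentions)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [{"A", "B", "C"}, {"D", "E"}]      # two gold entities
    system = [{"A", "B"}, {"C", "D", "E"}]    # system split one and merged another
    print(b_cubed(gold, system))              # -> roughly (0.733, 0.733, 0.733)
```

For these toy clusterings, precision, recall, and F1 all come out to about 0.733; MUC, CEAF, and BLANC trade off the same kinds of splitting and merging errors in different ways.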
陈远哲, 匡俊, 刘婷婷, 高明, 周傲英. 共指消解技术综述[J]. 华东师范大学学报(自然科学版), 2019, 2019(5): 16-35.
CHEN Yuan-zhe, KUANG Jun, LIU Ting-ting, GAO Ming, ZHOU Ao-ying. A survey on coreference resolution[J]. Journal of East China Normal University (Natural Science), 2019, 2019(5): 16-35.