Journal of East China Normal University (Natural Science) ›› 2019, Vol. 2019 ›› Issue (5): 16-35. doi: 10.3969/j.issn.1000-5641.2019.05.002

• Data-Driven Computational Pedagogy •

A survey on coreference resolution

CHEN Yuan-zhe, KUANG Jun, LIU Ting-ting, GAO Ming, ZHOU Ao-ying

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2019-07-29 Online: 2019-09-25 Published: 2019-10-11
  • Corresponding author: GAO Ming, male, professor and doctoral supervisor; research interests: educational computing, knowledge graphs, knowledge engineering, user profiling, social network mining, and uncertain data management. E-mail: mgao@dase.ecnu.edu.cn
  • About the author: CHEN Yuan-zhe, male, master's student; research interests: natural language processing and knowledge graphs. E-mail: yzchen@stu.ecnu.edu.com
  • Funding:
    National Key Research and Development Program of China (2016YFB1000905); National Natural Science Foundation of China (U1811264, 61877018, 61502236, 61672234); Shanghai Agricultural Science and Technology Promotion Project (T20170303)

Abstract: Coreference resolution aims to identify the different expressions in a text that refer to the same entity; it is widely used in text summarization, machine translation, question answering, and knowledge graphs. Although it is a classic problem in natural language processing, it is considered NP-hard. This paper first introduces the basic concepts of coreference resolution, clarifies several easily confused concepts, and discusses the research significance and difficulties of the task. We then trace the development of coreference resolution, divide it into stages from a technical standpoint, introduce the representative models of each stage, and discuss the advantages and disadvantages of each class of model, focusing on five: rule-based, machine learning-based, global optimization-based, knowledge base-based, and deep learning-based. Next, we introduce the evaluation conferences for coreference resolution and explain and compare their corpora and the commonly used evaluation metrics. Finally, we point out the problems that current coreference resolution models leave unsolved and discuss directions for future research.

Key words: coreference resolution, natural language processing, global optimization, knowledge base, deep learning
