



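Funding: National Key Research and Development Program of China (2016YFB1000905); National Natural Science Foundation of China (U1911203, U1811264, 61877018, 61672234, 61672384); Fundamental Research Funds for the Central Universities; Shanghai Agriculture Applied Technology Development Program (T20170303); Shanghai Key Laboratory of Pure Mathematics and Mathematical Practice (18dz2271000)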

Approaches for semantic textual similarity

  • HAN Chengcheng,
  • LI Lei,
  • LIU Tingting,
  • GAO Ming
  • School of Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received date: 2020-08-09

  Online published: 2020-09-24




HAN Chengcheng, LI Lei, LIU Tingting, GAO Ming. Approaches for semantic textual similarity[J]. Journal of East China Normal University (Natural Science), 2020, 2020(5): 95-112. DOI: 10.3969/j.issn.1000-5641.202091011


This paper surveys recent progress in semantic textual similarity computation, covering string-based, statistics-based, knowledge-based, and deep-learning-based methods. For each category, it introduces representative models and approaches and discusses their respective advantages and disadvantages. It also summarizes the public datasets and evaluation metrics commonly used in the field. Finally, it outlines several possible directions for future research on semantic textual similarity.


