Journal of East China Normal University (Natural Science) ›› 2021, Vol. 2021 ›› Issue (1): 60-66. doi: 10.3969/j.issn.1000-5641.201922018

• Physics and Electronics •

Study on sentence similarity based on quantum theory

Bingqing MENG, Lei MA*

  1. School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  • Received: 2019-11-21 Online: 2021-01-25 Published: 2021-01-28
  • Contact: Lei MA E-mail: lma@phy.ecnu.edu.cn

Abstract:

Superposition, entanglement, incompatibility, and interference make quantum theory a promising modeling framework. This paper examines the potential of quantum theory for natural language understanding. On a sentence-matching task, we explore how well quantum theory, as a formal framework, captures and models the semantics of words and sentences. We use quantum states to construct a semantic Hilbert space for sentences and calculate the fidelity of information during sentence transformation. We also combine quantum theory with word embedding, representing words or concepts in a high-dimensional, low-rank vector space, to compute the similarity of sentences. On a question-matching dataset constructed from a real business scenario, simulation results show that the proposed method outperforms classical methods. The work offers a new approach to similarity research over multiple sentences and, in line with current research trends, represents a breakthrough in interdisciplinary research between computer science and quantum theory.

Key words: quantum theory, natural language, fidelity, sentence similarity
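
The abstract does not spell out the construction, so the following is only a minimal sketch of one common way to turn word embeddings into quantum states and compare two sentences by fidelity. It assumes each sentence is a uniform mixture of pure states built from normalised word vectors, and it uses the standard Uhlmann fidelity F(ρ, σ) = (Tr √(√ρ σ √ρ))². The function names `density_matrix`, `psd_sqrt`, and `fidelity` are hypothetical and are not taken from the paper, whose actual model may differ.

```python
import numpy as np


def density_matrix(word_vectors):
    """Uniform mixture of pure states |w><w| from L2-normalised word vectors.

    Assumption for illustration: every word gets the same weight.
    """
    vecs = np.asarray(word_vectors, dtype=float)
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    # Average of outer products; the trace is 1 because each |w> has unit norm.
    return np.einsum("ni,nj->ij", vecs, vecs) / len(vecs)


def psd_sqrt(mat):
    """Square root of a symmetric positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # drop tiny negative round-off eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def fidelity(rho, sigma):
    """Uhlmann fidelity F = (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    root = psd_sqrt(rho)
    inner = root @ sigma @ root
    inner = (inner + inner.T) / 2  # re-symmetrise against round-off
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(vals)) ** 2)


# Toy 3-dimensional "embeddings" (made up for this example).
sentence_a = [[1.0, 0.2, 0.0], [0.8, 0.5, 0.1]]
sentence_b = [[0.9, 0.3, 0.0], [0.7, 0.6, 0.2]]
similarity = fidelity(density_matrix(sentence_a), density_matrix(sentence_b))
```

For two single-word sentences the score reduces to the squared cosine similarity of the two word vectors, which gives a quick sanity check on the sketch. Real embeddings would come from a trained word embedding model rather than the hand-written vectors used here.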
