An enterprise knowledge graph is a domain knowledge base for the financial field that describes business relationships between enterprises. Although a domain knowledge graph covers less ground than an open knowledge graph, its knowledge is more precise. Open knowledge graphs have advanced significantly in recent years, but vertical fields, especially business, have not yet seen in-depth practical applications, which has created strong demand for enterprise knowledge graphs. To address the limitations of extraction results, this paper proposes a classification-based method for Chinese entity relation extraction. The method uses a maximum entropy model to analyze data from selected companies' announcements and determine the optimal feature template. The results show that accuracy exceeds 85% on the enterprise bulletin data set.
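As a rough illustration of classification-based relation extraction with a maximum entropy model, the sketch below trains a small multinomial log-linear classifier on indicator features. It is not the authors' implementation: the feature template in `extract_features`, the relation labels, and the English toy sentences are all hypothetical, and training uses plain stochastic gradient ascent for brevity.

```python
import math
from collections import defaultdict


def extract_features(tokens, e1_idx, e2_idx):
    """Hypothetical feature template: entity words, words between, distance."""
    lo, hi = sorted((e1_idx, e2_idx))
    feats = ["e1=" + tokens[e1_idx], "e2=" + tokens[e2_idx]]
    feats += ["between=" + w for w in tokens[lo + 1:hi]]
    feats.append("dist=" + str(min(hi - lo, 5)))  # distance capped at 5
    return feats


class MaxEnt:
    """Maximum entropy (multinomial logistic) classifier over indicator features."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.w = defaultdict(float)  # one weight per (feature, label) pair

    def probs(self, feats):
        # p(y | x) is proportional to exp(sum of weights of active features for y)
        scores = [sum(self.w[(f, y)] for f in feats) for y in self.labels]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def train(self, data, epochs=200, lr=0.5):
        # Stochastic gradient ascent on the conditional log-likelihood:
        # gradient for (f, y) = observed indicator - predicted probability.
        for _ in range(epochs):
            for feats, gold in data:
                p = self.probs(feats)
                for y, py in zip(self.labels, p):
                    grad = (1.0 if y == gold else 0.0) - py
                    for f in feats:
                        self.w[(f, y)] += lr * grad

    def predict(self, feats):
        p = self.probs(feats)
        return self.labels[p.index(max(p))]


# Invented toy data: (tokens, index of entity 1, index of entity 2, relation)
toy_sentences = [
    (["CompanyA", "acquires", "CompanyB"], 0, 2, "acquisition"),
    (["CompanyC", "acquires", "shares", "of", "CompanyD"], 0, 4, "acquisition"),
    (["CompanyA", "supplies", "parts", "to", "CompanyC"], 0, 4, "supply"),
    (["CompanyB", "supplies", "CompanyD"], 0, 2, "supply"),
]
train_data = [(extract_features(t, i, j), rel) for t, i, j, rel in toy_sentences]

model = MaxEnt(["acquisition", "supply"])
model.train(train_data)

query = extract_features(["CompanyE", "acquires", "CompanyF"], 0, 2)
predicted = model.predict(query)
print(predicted)
```

Comparing feature templates would then amount to swapping `extract_features` for alternative definitions and measuring accuracy on held-out entity pairs.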
SUN Chen, FU Ying-nan, CHENG Wen-liang, QIAN Wei-ning. Chinese named entity relation extraction for enterprise knowledge graph construction[J]. Journal of East China Normal University (Natural Science), 2018, 2018(3): 55-66.
DOI: 10.3969/j.issn.1000-5641.2018.03.007