Journal of East China Normal University(Natural Science), 2021, Vol. 2021, Issue (5): 24-36. doi: 10.3969/j.issn.1000-5641.2021.05.003
Financial Knowledge Graph
Rui FU, Jianyu LI, Jiahui WANG, Kun YUE*, Kuang HU
Received: 2021-08-05
Online: 2021-09-25
Published: 2021-09-28
Contact: Kun YUE, E-mail: kyue@ynu.edu.cn
Rui FU, Jianyu LI, Jiahui WANG, Kun YUE, Kuang HU. Joint extraction of entities and relations for domain knowledge graph[J]. Journal of East China Normal University(Natural Science), 2021, 2021(5): 24-36.
Table 1. Statistics of entities and relations in the datasets
Dataset | Number of entities | Relation type | Number of relations |
Financial domain | 5034 | /person/company | 2064 |
 | | /company/founders | 557 |
 | | /company/place_founded | 243 |
 | | /company/advisors | 47 |
 | | /major_shareholder_of | 304 |
Ethnic minority domain | 4722 | Diet | 434 |
 | | Population | 120 |
 | | Alias | 223 |
 | | Place of residence | 557 |
 | | Religion | 231 |
 | | Language | 220 |
 | | Crafts | 171 |
 | | Song and dance | 470 |
 | | Literary works | 279 |
 | | Customs | 263 |
 | | Clothing | 440 |
 | | Architecture | 214 |
Table 3. Effectiveness of joint extraction of entities and relations on a financial dataset
(NER: named entity recognition; RE: relation extraction; P: precision; R: recall)
Model | NER P | NER R | NER F1 | RE P | RE R | RE F1 |
BERT-BiLSTM-CRF+RBert | 0.7163 | 0.7130 | 0.7146 | 0.5001 | 0.3439 | 0.4075 |
Word2vec-BiLSTM-CRF | 0.6932 | 0.6313 | 0.6608 | 0.4185 | 0.4231 | 0.4207 |
Word2vec-BiGRU-CRF | 0.6770 | 0.7083 | 0.6923 | 0.4611 | 0.3824 | 0.4182 |
Word2vec-BiGRU*-CRF | 0.6983 | 0.6874 | 0.6928 | 0.4523 | 0.4111 | 0.4307 |
BERT-BiLSTM-CRF | 0.7346 | 0.7176 | 0.7260 | 0.4806 | 0.4859 | 0.4832 |
BERT-BiGRU-CRF | 0.7600 | 0.7037 | 0.7308 | 0.5032 | 0.4593 | 0.4802 |
BERT-BiGRU*-CRF | 0.7530 | 0.7353 | 0.7440 | 0.4952 | 0.4787 | 0.4868 |
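The F1 columns in Table 3 can be cross-checked against the precision and recall columns. The sketch below assumes the paper reports the standard F1, the harmonic mean 2PR/(P+R); the page itself does not state the formula. The function name `f1_score` and the row labels are illustrative only, and the three rows are copied from Table 3.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 3.
# Assumption: the reported F1 is the standard harmonic mean.
rows = {
    "BERT-BiLSTM-CRF+RBert (relation extraction)": (0.5001, 0.3439, 0.4075),
    "BERT-BiGRU*-CRF (named entity recognition)": (0.7530, 0.7353, 0.7440),
    "BERT-BiGRU*-CRF (relation extraction)": (0.4952, 0.4787, 0.4868),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

For these three rows the recomputed value agrees with the reported one to four decimal places, which is consistent with the assumed definition.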
Table 4. Effectiveness of joint extraction of entities and relations on an ethnic minority dataset
(NER: named entity recognition; RE: relation extraction; P: precision; R: recall)
Model | NER P | NER R | NER F1 | RE P | RE R | RE F1 |
BERT-BiLSTM-CRF+RBert | 0.8579 | 0.8513 | 0.8546 | 0.7578 | 0.3715 | 0.4986 |
Word2vec-BiLSTM-CRF | 0.7549 | 0.7821 | 0.7682 | 0.5461 | 0.6132 | 0.5777 |
Word2vec-BiGRU-CRF | 0.7649 | 0.7713 | 0.7680 | 0.6401 | 0.5258 | 0.5773 |
Word2vec-BiGRU*-CRF | 0.7761 | 0.7894 | 0.7827 | 0.6261 | 0.5739 | 0.5988 |
BERT-BiLSTM-CRF | 0.8674 | 0.9173 | 0.8916 | 0.7761 | 0.6194 | 0.6889 |
BERT-BiGRU-CRF | 0.8688 | 0.9285 | 0.8976 | 0.6561 | 0.7188 | 0.6860 |
BERT-BiGRU*-CRF | 0.9121 | 0.9313 | 0.9216 | 0.7209 | 0.6983 | 0.7094 |
Table 5. Effectiveness of overlap relation extraction
(P: precision; R: recall)
Model | Financial P | Financial R | Financial F1 | Ethnic minority P | Ethnic minority R | Ethnic minority F1 |
BERT-BiLSTM-CRF+RBert | 0.3526 | 0.1528 | 0.2132 | 0.5714 | 0.3636 | 0.4444 |
Word2vec-BiLSTM-CRF | 0.4259 | 0.3126 | 0.3606 | 0.4914 | 0.5969 | 0.5390 |
Word2vec-BiGRU-CRF | 0.4116 | 0.3150 | 0.3569 | 0.4656 | 0.6250 | 0.5351 |
Word2vec-BiGRU*-CRF | 0.4234 | 0.3203 | 0.3647 | 0.4815 | 0.6500 | 0.5532 |
BERT-BiLSTM-CRF | 0.5107 | 0.3495 | 0.4150 | 0.5492 | 0.7114 | 0.6199 |
BERT-BiGRU-CRF | 0.4923 | 0.3495 | 0.4135 | 0.6111 | 0.6250 | 0.6180 |
BERT-BiGRU*-CRF | 0.5312 | 0.3564 | 0.4213 | 0.6200 | 0.6566 | 0.6378 |
Table 6. Comparison of relation extraction with different amounts of labeled data
(P: precision; R: recall)
Labeled data size (%) | Financial P | Financial R | Financial F1 | Ethnic minority P | Ethnic minority R | Ethnic minority F1 |
10 | 0.1304 | 0.1071 | 0.1176 | 0.2857 | 0.1333 | 0.1818 |
20 | 0.3230 | 0.2456 | 0.2790 | 0.4286 | 0.3333 | 0.3750 |
30 | 0.4328 | 0.3275 | 0.3728 | 0.6250 | 0.5556 | 0.5882 |
40 | 0.4514 | 0.3936 | 0.4205 | 0.6667 | 0.5714 | 0.6154 |
50 | 0.4667 | 0.4514 | 0.4589 | 0.7143 | 0.6822 | 0.6978 |
60 | 0.4936 | 0.4734 | 0.4833 | 0.7163 | 0.6875 | 0.7016 |
100 | 0.4952 | 0.4787 | 0.4868 | 0.7209 | 0.6983 | 0.7094 |
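One way to read Table 6 is to express each F1 as a fraction of the F1 obtained with 100% of the labeled data. The sketch below is only arithmetic on the table's F1 columns; the names `f1_by_size` and `relative_f1` are hypothetical and not taken from the paper.

```python
from typing import Dict

# F1 by labeled-data percentage, copied from Table 6.
f1_by_size: Dict[str, Dict[int, float]] = {
    "financial": {
        10: 0.1176, 20: 0.2790, 30: 0.3728, 40: 0.4205,
        50: 0.4589, 60: 0.4833, 100: 0.4868,
    },
    "ethnic_minority": {
        10: 0.1818, 20: 0.3750, 30: 0.5882, 40: 0.6154,
        50: 0.6978, 60: 0.7016, 100: 0.7094,
    },
}


def relative_f1(scores: Dict[int, float]) -> Dict[int, float]:
    """Divide every F1 by the F1 at 100% labeled data."""
    full = scores[100]
    return {size: f1 / full for size, f1 in scores.items()}


for domain, scores in f1_by_size.items():
    for size, ratio in relative_f1(scores).items():
        print(f"{domain}: {size}% labeled data -> {ratio:.1%} of full-data F1")
```

With the numbers in Table 6, the 60% rows reach roughly 99% of the full-data F1 in both domains.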
Table 7. Impacts of dropout on BERT-BiGRU*-CRF
(P: precision; R: recall)
Dropout | Financial P | Financial R | Financial F1 | Ethnic minority P | Ethnic minority R | Ethnic minority F1 |
0.3 | 0.4714 | 0.4796 | 0.4755 | 0.7100 | 0.6965 | 0.7032 |
0.4 | 0.4732 | 0.4936 | 0.4832 | 0.7209 | 0.6983 | 0.7094 |
0.5 | 0.4952 | 0.4787 | 0.4868 | 0.7189 | 0.6923 | 0.7055 |
0.6 | 0.4960 | 0.4678 | 0.4815 | 0.7341 | 0.6713 | 0.7013 |
0.7 | 0.5013 | 0.4597 | 0.4796 | 0.7408 | 0.6615 | 0.6989 |