Computer Science

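Target-dependent event detection from news

  • Tiantian ZHANG,
  • Man LAN
  • School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

Received date: 2021-09-28

  Online published: 2023-03-23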

Cite this article

ZHANG Tiantian, LAN Man. Target-dependent event detection from news [J]. Journal of East China Normal University (Natural Science), 2023, 2023(2): 60-72. DOI: 10.3969/j.issn.1000-5641.2023.02.008

Abstract

Large volumes of news text typically mention multiple entities and describe complex, varied events. To mine this entity and event information, previous event-centric extraction methods mostly detect events first and then extract their arguments. Because they depend on imperfect trigger-word and event detection, these methods are ill-suited to news event extraction in real industrial settings. Given that the performance of named entity recognition (NER) exceeds 90%, we shift from an event-centric to a target-centric view and propose a new task, target-dependent event detection (TDED), which aims to extract target entities and detect their corresponding events. For this task, we propose a two-stage framework that first extracts entities and then identifies the target-level event types; it is intended to support thousands of target entities and dozens of event types. The model incorporates event keywords and syntactic dependency distance features so that it can learn target-dependent context. Experimental results on a real-world Chinese financial dataset that we constructed show that the model outperforms previous methods and performs well even in complex cases where a sentence contains multiple entities or events.
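
The two-stage, target-centric idea in the abstract (extract target entities first, then detect the events attached to each target) can be illustrated with a deliberately simplified sketch. It is not the authors' model: a lexicon lookup stands in for the NER stage, and a keyword match filtered by dependency-tree distance stands in for the semantic and syntactic aware event classifier. The function names, the toy sentence, the hand-written dependency heads, the event labels and the `max_distance` threshold are all assumptions made for illustration.

```python
from collections import deque


def extract_targets(tokens, entity_lexicon):
    """Stage 1 stand-in: indices of tokens found in a lexicon (a real system would use an NER model)."""
    return [i for i, tok in enumerate(tokens) if tok in entity_lexicon]


def tree_distances(heads, source):
    """Breadth-first search over the undirected dependency tree.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    Returns a dict mapping each reachable token index to its distance from source.
    """
    neighbours = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbours[child].append(head)
            neighbours[head].append(child)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def detect_events(tokens, heads, target, keyword_to_event, max_distance=1):
    """Stage 2 stand-in: keep event keywords within max_distance of the target in the tree."""
    dist = tree_distances(heads, target)
    events = set()
    for i, tok in enumerate(tokens):
        if tok in keyword_to_event and dist.get(i, max_distance + 1) <= max_distance:
            events.add(keyword_to_event[tok])
    return sorted(events)


def tded(tokens, heads, entity_lexicon, keyword_to_event, max_distance=1):
    """Run both stages and map each target entity to its detected event types."""
    return {
        tokens[t]: detect_events(tokens, heads, t, keyword_to_event, max_distance)
        for t in extract_targets(tokens, entity_lexicon)
    }


# Toy sentence with hypothetical entities and a hand-written dependency parse.
tokens = ["AlphaCorp", "acquired", "BetaInc", "while", "GammaLtd", "was", "sued"]
heads = [1, -1, 1, 6, 6, 6, 1]  # "acquired" is the root; "sued" attaches to it
entity_lexicon = {"AlphaCorp", "BetaInc", "GammaLtd"}
keyword_to_event = {"acquired": "Acquisition", "sued": "Litigation"}

result = tded(tokens, heads, entity_lexicon, keyword_to_event)
print(result)
# {'AlphaCorp': ['Acquisition'], 'BetaInc': ['Acquisition'], 'GammaLtd': ['Litigation']}
```

Even in this toy form, the distance filter shows why a target-dependent view helps when one sentence holds several entities and events: "GammaLtd" is two edges away from "acquired" but one edge from "sued", so it receives only the litigation label.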
