This paper proposes a framework for extracting corporate bankruptcy-related events from ruling documents and organizing them as structured information. The framework uses distant supervision to generate training data from the ruling documents; applies named entity recognition to tag the sentences of these documents as label sequences; and performs event extraction with a self-defined list of event trigger words and an event dictionary to detect bankruptcy-related events and gather structured information about them. Experimental results demonstrate the effectiveness of the framework.
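The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the entity dictionary, the trigger list, the event type names, and the sample sentence are all hypothetical, and the sketch feeds the distantly supervised labels straight into event extraction, where the paper's framework would instead train a named entity recognition model on them.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical entity dictionary used as the distant-supervision source.
ENTITY_DICT: Dict[str, str] = {"Acme Steel": "ORG", "Riverside Bank": "ORG"}

# Hypothetical trigger words mapped to hypothetical event types.
TRIGGERS: Dict[str, str] = {"bankrupt": "BANKRUPTCY", "liquidation": "LIQUIDATION"}


def bio_tags(tokens: List[str], entity_dict: Dict[str, str]) -> List[str]:
    """Label tokens with BIO tags by exact matching against a dictionary."""
    tags = ["O"] * len(tokens)
    for name, label in entity_dict.items():
        name_tokens = name.split()
        n = len(name_tokens)
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if tokens[i:i + n] == name_tokens and window_free:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
    return tags


def entity_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (text, label) pairs from a BIO-tagged token sequence."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    label = ""
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((" ".join(current), label))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                spans.append((" ".join(current), label))
            current, label = [], ""
    if current:
        spans.append((" ".join(current), label))
    return spans


def extract_event(tokens: List[str], tags: List[str],
                  triggers: Dict[str, str]) -> Optional[dict]:
    """Return a structured event for the first trigger word found, else None."""
    for tok in tokens:
        event_type = triggers.get(tok.lower())
        if event_type is not None:
            return {
                "type": event_type,
                "trigger": tok,
                "entities": entity_spans(tokens, tags),
            }
    return None


sentence = "The court declared Acme Steel bankrupt on the petition of Riverside Bank"
tokens = sentence.split()
tags = bio_tags(tokens, ENTITY_DICT)
event = extract_event(tokens, tags, TRIGGERS)
```

Here `tags` plays the role of the automatically generated training labels, and `event` is the structured record assembled once a trigger word is detected in the sentence.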
YANG Jiale, WANG Junhao, QIAN Weining, LUO Yifeng. Automatic extraction of corporate bankruptcy-related events from ruling documents [J]. Journal of East China Normal University (Natural Science), 2020, 2020(4): 88-97.
DOI: 10.3969/j.issn.1000-5641.201921015