Journal of East China Normal University (Natural Science) ›› 2022, Vol. 2022 ›› Issue (6): 79-86. doi: 10.3969/j.issn.1000-5641.2022.06.009

• Computer Science •

Distant supervision relation extraction via the influence function

Ziyin HUANG, Yuanbin WU*

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2021-08-13  Online: 2022-11-25  Published: 2022-11-22
  • Contact: Yuanbin WU  E-mail: ybwu@cs.ecnu.edu.cn

Abstract:

Distant supervision reduces the burden of manual annotation in relation extraction, but it also introduces noisy instances that hinder both training and testing. To alleviate this problem, we propose a denoising method based on the influence function. The influence function measures the influence of each training point, defined as the change in test loss after that training point is removed. We observe that this property can be used to determine whether a training instance is noisy. First, we design a scoring function based on the influence function. Then, we integrate the scoring function into a bootstrapping framework that starts from a small clean set and produces the final denoised dataset. Because the method works as a preprocessing step, it can be applied to any distantly supervised dataset. Experimental results on a public dataset show that the denoised dataset achieves good performance.
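
As a rough illustration of the definition above, the following sketch estimates how the loss on a small clean set would change if a single training point were removed. It is a toy under stated assumptions, not the scoring function of the paper: the model is a one-parameter least-squares fit rather than a relation extractor, the data are made up, and all names (`fit`, `removal_influence`, `actual_change`) are hypothetical. The estimate uses the standard first-order form, (1/n) · ∇L(clean) · H⁻¹ · ∇L(z), where H is the Hessian of the mean training loss.

```python
def fit(points):
    """Least-squares slope w for the model y ≈ w * x (no intercept)."""
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)


def loss(w, point):
    """Squared-error loss of one (x, y) point."""
    x, y = point
    return 0.5 * (w * x - y) ** 2


def grad(w, point):
    """Derivative of the loss with respect to w."""
    x, y = point
    return (w * x - y) * x


def mean_loss(w, points):
    return sum(loss(w, p) for p in points) / len(points)


def removal_influence(train, clean, i):
    """First-order estimate of the change in mean clean-set loss
    if train[i] were removed from the training set."""
    n = len(train)
    w = fit(train)
    hessian = sum(x * x for x, _ in train) / n  # scalar, since w is a scalar
    g_clean = sum(grad(w, p) for p in clean) / len(clean)
    return g_clean * grad(w, train[i]) / (hessian * n)


def actual_change(train, clean, i):
    """Exact change in mean clean-set loss, obtained by refitting without train[i]."""
    w_full = fit(train)
    w_without = fit(train[:i] + train[i + 1:])
    return mean_loss(w_without, clean) - mean_loss(w_full, clean)


# Made-up data: y is roughly 2 * x, except for the last, deliberately noisy point.
train = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (2.5, -5.0)]
clean = [(1.5, 3.0), (3.5, 7.0)]

scores = [removal_influence(train, clean, i) for i in range(len(train))]
# The most negative score marks the point whose removal helps the clean set most.
noisiest = min(range(len(train)), key=scores.__getitem__)
print(noisiest, scores[noisiest], actual_change(train, clean, noisiest))
```

A negative score means that removing the point is expected to lower the clean-set loss, which is the signal one would use to flag an instance as noisy. With only five training points the first-order estimate differs in magnitude from the exact refit, but the two agree in sign here.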

Key words: distant supervision, relation extraction, influence function, bootstrapping
