Journal of East China Normal University (Natural Sc ›› 2018, Vol. 2018 ›› Issue (5): 183-194. doi: 10.3969/j.issn.1000-5641.2018.05.016

Extraction of social media data based on the knowledge graph and LDA model

MA You1, YUE Kun1, ZHANG Zi-chen1, WANG Xiao-yi2, GUO Jian-bin2   

  1. School of Information Science and Engineering, Yunnan University, Kunming 650500, China;
  2. School of Ethnology and Sociology, Yunnan University, Kunming 650500, China
  • Received: 2018-07-10  Online: 2018-09-25  Published: 2018-09-26

Abstract: Social media data extraction underpins research and applications in public opinion analysis, news dissemination, corporate brand promotion, and commercial marketing. Accurate extraction is critical to the effectiveness of any downstream data analysis. In this paper, we use the LDA (Latent Dirichlet Allocation) model to analyze the latent topics in the data, and we then extract data for specific domains by combining feature word sequences with knowledge graphs that describe entities and the relationships among them. Experiments on "Headline Today" news and Sina Weibo data show that the proposed method extracts social media data effectively.
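
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain collapsed Gibbs sampler for LDA, a knowledge graph given as (head, relation, tail) triples, and a simple hypothetical rule: a topic counts as in-domain when enough of its top words are knowledge-graph entities, and a document is extracted when its dominant topic is in-domain. The function names, the toy data, and the parameters `top_n` and `min_overlap` are all illustrative assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized docs (lists of words).

    Returns (doc_topic, topic_word, vocab), where each row of doc_topic and
    topic_word is a smoothed probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * num_topics for _ in docs]       # document-topic counts
    nkw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    nk = [0] * num_topics                        # tokens assigned to each topic
    z = []                                       # topic assignment of every token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)        # random initial assignment
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], wid[w]
                # remove the token, resample its topic, then add it back
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(ndk, docs)
    ]
    topic_word = [
        [(c + beta) / (nk[k] + V * beta) for c in nkw[k]]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


def domain_entities(triples):
    """Collect every entity that appears as head or tail of a triple."""
    return {e for h, _, t in triples for e in (h, t)}


def extract(docs, triples, num_topics=2, top_n=5, min_overlap=2, seed=0):
    """Return indices of docs whose dominant topic looks in-domain (assumed rule)."""
    doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics, seed=seed)
    entities = domain_entities(triples)
    domain_topics = set()
    for k, dist in enumerate(topic_word):
        top = sorted(range(len(vocab)), key=lambda v: -dist[v])[:top_n]
        if sum(vocab[v] in entities for v in top) >= min_overlap:
            domain_topics.add(k)
    return [
        d for d, theta in enumerate(doc_topic)
        if max(range(num_topics), key=theta.__getitem__) in domain_topics
    ]


# Toy, made-up data for illustration only.
docs = [
    ["festival", "village", "dance", "costume", "festival"],
    ["village", "costume", "dance", "tradition"],
    ["stock", "market", "price", "stock", "trade"],
    ["market", "trade", "price", "profit"],
]
triples = [("festival", "held_in", "village"), ("dance", "part_of", "festival")]
print(extract(docs, triples))
```

Because Gibbs sampling is stochastic, which documents are selected depends on the seed, the corpus, and the thresholds; on real data the topic count and overlap rule would need tuning against labeled examples.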

Key words: social media, data extraction, LDA (Latent Dirichlet Allocation), knowledge graph
