Journal of East China Normal University(Natural Sc ›› 2013, Vol. 2013 ›› Issue (3): 118-130.

• Article

Efficient implementation for LDA in Mahout

XU Bo-xi1, HU Ning2, CHEN Wen-bin1, GAO Wei-guo1, CHENG Jin1   

  1. School of Mathematical Sciences, Fudan University, Shanghai 200433, China; 2. Shanghai MediaV Advertising Corporation, Ltd., Shanghai 200070, China
  • Received: 2013-03-01 Revised: 2013-04-01 Online: 2013-05-25 Published: 2013-07-10

Abstract: Based on a careful study of Latent Dirichlet Allocation (LDA)
with Gibbs sampling and of the MapReduce framework, an efficient
implementation of LDA in Mahout was developed. Experiments
demonstrated the high performance of this distributed parallel LDA
program, and several issues concerning further performance
improvements are discussed.

Key words: Latent Dirichlet Allocation, Gibbs sampling, Mahout, distributed parallel computing, MapReduce framework
