XU Bo-xi, HU Ning, CHEN Wen-bin, GAO Wei-guo, CHENG Jin. Efficient implementation for LDA in Mahout[J]. Journal of East China Normal University (Natural Science), 2013, 2013(3): 118-130.
{1} BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003(3): 993-1022.
{2} GRIFFITHS T L, STEYVERS M. Finding scientific topics[J]. Proceedings of the National Academy of Sciences, 2004(101): 5228-5235.
{3} VENNER J. Pro Hadoop[M]. New York: Apress, 2009.
{4} OWEN S, ANIL R, DUNNING T, FRIEDMAN E. Mahout in Action[M]. New York: Manning Publications, 2010.
{5} STEYVERS M, GRIFFITHS T. Probabilistic topic models[M]//LANDAUER T, MCNAMARA D, DENNIS S, et al. Latent Semantic Analysis: A Road to Meaning. [s.l.]: Routledge, 2007.
{6} HEINRICH G. Parameter estimation for text analysis[R]. Darmstadt: Fraunhofer IGD, 2004.
{7} NEWMAN D, ASUNCION A, SMYTH P, WELLING M. Distributed inference for latent Dirichlet allocation[J]. Proc Neural Information Processing Systems, 2007(20): 1081-1088.
{8} WANG Y, BAI H J, STANTON M, et al. PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications[M]. Lecture Notes in Computer Science 5564. Berlin: Springer, 2009: 301-314.
{9} GRIFFITHS T L, STEYVERS M. A probabilistic approach to semantic representation[C]//Proceedings of the Twenty-Fourth Annual Conference of Cognitive Science Society, 2002.
{10} LIU Z Y, ZHANG Y Z, CHANG E Y. PLDA+: parallel latent Dirichlet allocation with data placement and pipeline processing[J]. ACM Transactions on Intelligent Systems and Technology, 2011(2): 26.
{11} SMOLA A, NARAYANAMURTHY S. An architecture for parallel topic models[J]. Proceedings of the VLDB Endowment, 2010(3): 703-710.
{12} EKANAYAKE J, LI H, ZHANG B J, et al. Twister: a runtime for iterative MapReduce[J]. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010(1): 810-818.
{13} BU Y Y, HOWE B, BALAZINSKA M, et al. HaLoop: efficient iterative data processing on large clusters[J]. Proceedings of the VLDB Endowment, 2010(3): 285-296.