Computer Science

Rating prediction and recommendation based on review analysis

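  • GAO Yi-Fan (高祎璠),
  • YU Wen-Zhe (余文喆),
  • CHAO Ping-Fu (晁平复),
  • ZHENG Zhi-Ling (郑芷凌),
  • ZHANG Rong (张蓉)
  • Institute for Data Science and Engineering, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China
GAO Yi-Fan, female, master's student. E-mail: yfgao@ecnu.edu.cn.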

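Received date: 2014-12-15

  Online published: 2015-05-28

Funding

National Natural Science Foundation of China (61103039, 61402177); Key Program of the National Natural Science Foundation of China (61232002)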

Analyzing reviews for rating prediction and item recommendation

  • GAO Yi-Fan ,
  • YU Wen-Zhe ,
  • CHAO Ping-Fu ,
  • ZHENG Zhi-Ling ,
  • ZHANG Rong

Received date: 2014-12-15

  Online published: 2015-05-28

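Abstract (translated from the Chinese)

Recommender systems are widely used on online platforms. A recommendation model must predict users' preferences and help them find suitable movies, books, music, and other items. Analyzing users' ratings and reviews reveals the item features that users care about, and those features can then be used to infer how much a user will like an item. This paper combines the latent semantic content of reviews with ratings to design and implement a novel item recommendation model. A topic model first mines the latent topic distributions in review text, and these distributions characterize user preferences and item profiles. The relationship between topics and ratings is then trained with a logistic regression model, so that the final rating can be viewed as a quantified measure of the similarity between a user's preferences and an item's profile. Finally, extensive comparative experiments on real data show that the model outperforms the baseline systems and is stable.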

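Cite this article

GAO Yi-Fan, YU Wen-Zhe, CHAO Ping-Fu, ZHENG Zhi-Ling, ZHANG Rong. Analyzing reviews for rating prediction and item recommendation[J]. Journal of East China Normal University (Natural Science), 2015, 2015(3): 80-90. DOI: 10.3969/j.issn.1000-5641.2015.03.010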

Abstract

Recommender systems are widely deployed in Web applications that need to predict users' preferences for items. They are popular for helping users find movies, books, music, and products in general. In this work, we design a method for item recommendation based on a novel model that captures correlations between hidden aspects in reviews and numeric ratings. It is motivated by the observation that a user's preference for an item is affected by the different aspects discussed in reviews. Our method first uses topic modeling to discover hidden aspects in review text. Profiles are then created for users and items separately, based on the aspects discovered in their reviews. Finally, we use logistic regression to model the user-item relationship, and the rating is modeled as the similarity between the user and item profiles. Experiments on real-world reviews demonstrate the advantage of our proposal over state-of-the-art solutions.
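The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how features are built, so several things here are assumptions: the per-review topic distributions are taken as given (in practice they would come from a topic model such as LDA), a profile is the plain average of those distributions, the user-item feature vector is the element-wise product of the two profiles, and ratings on a 1 to 5 scale are mapped to [0, 1] for the logistic model. All names (`average_profile`, `train_logistic`, `predict_rating`, the toy users and items) are hypothetical.

```python
import math
from collections import defaultdict

def average_profile(topic_vectors):
    """Average per-review topic distributions into a single profile (assumed scheme)."""
    n = len(topic_vectors)
    return [sum(v[i] for v in topic_vectors) / n for i in range(len(topic_vectors[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pair_features(user_profile, item_profile):
    """Per-topic agreement between a user and an item (assumed feature map)."""
    return [u * i for u, i in zip(user_profile, item_profile)]

def train_logistic(samples, targets, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the cross-entropy loss."""
    k, n = len(samples[0]), len(samples)
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, t in zip(samples, targets):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_rating(w, b, user_profile, item_profile, lo=1.0, hi=5.0):
    """Map the logistic output in (0, 1) back onto the rating scale [lo, hi]."""
    x = pair_features(user_profile, item_profile)
    return lo + (hi - lo) * sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy data: (user, item, topic distribution of the review, rating). Values are made up.
reviews = [
    ("alice", "camera", [0.9, 0.1], 5),
    ("alice", "novel",  [0.6, 0.4], 1),
    ("bob",   "novel",  [0.1, 0.9], 5),
    ("bob",   "camera", [0.4, 0.6], 1),
]

# Build user and item profiles separately from the reviews they appear in.
by_user, by_item = defaultdict(list), defaultdict(list)
for user, item, topics, _ in reviews:
    by_user[user].append(topics)
    by_item[item].append(topics)
user_profiles = {u: average_profile(v) for u, v in by_user.items()}
item_profiles = {i: average_profile(v) for i, v in by_item.items()}

# Train on profile-pair features, with ratings rescaled from [1, 5] to [0, 1].
samples = [pair_features(user_profiles[u], item_profiles[i]) for u, i, _, _ in reviews]
targets = [(r - 1) / 4 for _, _, _, r in reviews]
w, b = train_logistic(samples, targets)

matched = predict_rating(w, b, user_profiles["alice"], item_profiles["camera"])
mismatched = predict_rating(w, b, user_profiles["alice"], item_profiles["novel"])
print(f"alice/camera: {matched:.2f}, alice/novel: {mismatched:.2f}")
```

On this toy data the pair whose profiles agree more strongly receives the higher predicted rating, which is the behaviour the abstract describes when it treats the rating as a similarity between user and item profiles.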
