




Algorithm for video click-through rate prediction

  • KUANG Jun,
  • TANG Wei-hong,
  • CHEN Lei-hui,
  • CHEN Hui,
  • ZENG Wei,
  • DONG Qi-min,
  • GAO Ming
  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China;
    2. Shanghai Agricultural Technology Extension and Service Center, Shanghai 201103, China;
    3. Shenzhen Tencent Computer System Co. Ltd., Beijing 100080, China;
    4. Vocational and Technical Education Center of Linxi County, Linxi, Inner Mongolia 025250, China

Received date: 2017-05-19

  Published online: 2018-05-29




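KUANG Jun, TANG Wei-hong, CHEN Lei-hui, CHEN Hui, ZENG Wei, DONG Qi-min, GAO Ming. Algorithm for video click-through rate prediction[J]. Journal of East China Normal University (Natural Science), 2018, 2018(3): 77-87. DOI: 10.3969/j.issn.1000-5641.2018.03.009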


Click-through rate prediction plays an important role in video recommendation systems. A video recommendation system can use predicted click-through rates to decide which videos to suggest, so that users are more likely to click the videos the platform recommends. However, given the volume and imbalance of the data in some applications, the accuracy of click-through rate prediction can be very low. To improve performance, this paper proposes an integrated approach that combines feature engineering with machine learning techniques. In the first stage, the algorithm uses feature engineering to extract user features, video features, and combined user-video features from the original dataset. In the second stage, the algorithm predicts the click-through rate with three supervised models: logistic regression, a factorization machine, and a gradient boosting decision tree combined with logistic regression. The experimental results show that the factorization machine and the gradient boosting decision tree combined with logistic regression both predict more accurately than logistic regression alone. Moreover, cross combinations of user and video features improve the accuracy of click-through rate prediction.
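
As a rough illustration of why crossing user and video features can help, the sketch below builds one-hot user, video, and user-video cross features and fits a plain logistic regression with stochastic gradient descent. It is a minimal sketch, not the paper's implementation: the function names (`build_features`, `predict`, `train`), the fields `age` and `category`, the hyperparameters, and the four-row toy dataset are all assumptions made up for this example, and only the logistic regression model is shown (not the factorization machine or the gradient boosting decision tree stage).

```python
import math
from collections import defaultdict


def build_features(user, video):
    """Return a sparse one-hot feature dict with user, video, and cross features.

    `user` and `video` are dicts of categorical fields (hypothetical schema).
    """
    feats = {"bias": 1.0}
    for key, value in user.items():
        feats[f"u:{key}={value}"] = 1.0
    for key, value in video.items():
        feats[f"v:{key}={value}"] = 1.0
    # Cross combination: pair every user field with every video field.
    for u_key, u_val in user.items():
        for v_key, v_val in video.items():
            feats[f"x:{u_key}={u_val}&{v_key}={v_val}"] = 1.0
    return feats


def predict(weights, feats):
    """Logistic regression: sigmoid of the weighted sum of active features."""
    z = sum(weights[name] * value for name, value in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=200, lr=0.5):
    """Fit weights by per-sample gradient descent on the log loss."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in samples:
            error = predict(weights, feats) - label
            for name, value in feats.items():
                weights[name] -= lr * error * value
    return weights


# Toy data (invented): the click depends on the user-video *pair*,
# which user-only or video-only features cannot express on their own.
raw = [
    ({"age": "young"}, {"category": "cartoon"}, 1),
    ({"age": "young"}, {"category": "news"}, 0),
    ({"age": "old"}, {"category": "cartoon"}, 0),
    ({"age": "old"}, {"category": "news"}, 1),
]
samples = [(build_features(u, v), y) for u, v, y in raw]
weights = train(samples)

for user, video, label in raw:
    p = predict(weights, build_features(user, video))
    print(user, video, "clicked" if label else "skipped", f"p(click)={p:.2f}")
```

In this toy dataset the label follows an XOR-like pattern, so a linear model over user and video features alone cannot separate the classes; the `x:` cross features give the model one weight per user-video pair, which is what lets it fit.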


