Journal of East China Normal University(Natural Sc ›› 2018, Vol. 2018 ›› Issue (3): 77-87. doi: 10.3969/j.issn.1000-5641.2018.03.009

Algorithm for video click-through rate prediction

KUANG Jun1, TANG Wei-hong2, CHEN Lei-hui1, CHEN Hui3, ZENG Wei3, DONG Qi-min4, GAO Ming1   

    1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China;
    2. Shanghai Agricultural Technology Extension and Service Center, Shanghai 201103, China;
    3. Shenzhen Tencent Computer System Co. Ltd., Beijing 100080, China;
    4. Vocational and Technical Education Center of Linxi County, Linxi Inner Mongolia 025250, China
  • Received: 2017-05-19 Online: 2018-05-25 Published: 2018-05-29

Abstract: Click-through rate prediction plays an important role in video recommendation systems. A video recommendation system can suggest videos to users based on their predicted click-through rates, so that users are more likely to click the videos the platform recommends. However, given the volume and imbalance of the data in some applications, the accuracy of click-through rate prediction can be very low. To improve performance, this paper proposes an integrated approach that combines feature engineering with machine learning techniques. In the first stage, the algorithm uses feature engineering to extract user, video, and combinational features from the original dataset. In the second stage, it predicts the click-through rate with three supervised models: logistic regression, a factorization machine, and a gradient boosting decision tree combined with logistic regression. The experimental results show that the factorization machine model and the gradient boosting decision tree combined with logistic regression model both achieve higher prediction accuracy than the logistic regression model. Moreover, cross-combining user and video features improves the accuracy of click-through rate prediction.
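
To make the role of user-video cross features concrete, the following is a minimal sketch of how a generic second-order factorization machine scores one sample. It is not the paper's implementation: the function names `fm_score` and `predict_ctr`, the one-hot feature layout, and all parameter values are assumptions chosen purely for illustration.

```python
import math

def fm_score(x, w0, w, V):
    """Raw second-order factorization machine score for one feature vector x.

    w0 is the global bias, w[i] the linear weight of feature i, and V[i] the
    latent vector of feature i. All parameters here are hypothetical.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    # Pairwise term sum_{i<j} <V[i], V[j]> x[i] x[j], computed in O(k * n)
    # with the standard identity 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

def predict_ctr(x, w0, w, V):
    """Map the raw score to a click probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, V)))

# Illustrative input: features 0-1 one-hot encode the user, 2-3 the video.
x = [1, 0, 1, 0]                      # user 0 is shown video 0
w0 = -2.0
w = [0.1, 0.2, 0.3, 0.4]
V = [[0.5, 0.1], [0.2, 0.3], [0.4, -0.2], [0.1, 0.1]]
print(predict_ctr(x, w0, w, V))
```

With one-hot user and video features, the only non-zero pairwise term is the inner product of the active user's and the active video's latent vectors, which is how the model captures a user-video cross effect without storing a separate weight for every pair.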

Key words: click-through rate prediction, feature engineering, factorization machine, gradient boosting decision tree
