Big Data Analysis

Research on click-through rate prediction in online advertising

  • XIAO Yao,
  • BI Jun-fang,
  • HAN Yi,
  • DONG Qi-wen
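  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062;
    2. Yangtze River Estuary Survey Bureau of Hydrology and Water Resources, Shanghai 200136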
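XIAO Yao, male, master's student; research interest: advertisement click-through rate prediction

Received: 2017-05-01

  Published online: 2017-09-25

Funding

National Key Research and Development Program of China (2016YFB1000905); Joint Key Project of the National Natural Science Foundation of China and Guangdong Province (U1401256); National Natural Science Foundation of China (61672234, 61402177); East China Normal University Informatization Soft Science Project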

Study of click through rate prediction in online advertisement

  • XIAO Yao,
  • BI Jun-fang,
  • HAN Yi,
  • DONG Qi-wen
  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China;
    2. Yangtze River Estuary Survey Bureau of Hydrology and Water Resource, CWRC, Ministry of Water Resources, Shanghai 200136, China

Received: 2017-05-01

  Published online: 2017-09-25

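Abstract

With the development of the Internet and the growth of its user base, the advertising industry has been shifting from the traditional offline model to an online model. At the same time, thanks to big data analysis technology, online advertising has shown great advantages over traditional advertising. Advertisers compete with one another and, through bidding, place their advertisements in the ad slots of media operators. Predicting the probability that a user will click an advertisement (the click-through rate, CTR) before it is placed is therefore very important for advertisers who want to reduce costs and increase potential revenue. Building on a survey of the CTR prediction models in common use today, this paper selects advertiser, advertisement, and media platform information as features of the prediction model, and uses a real data set to demonstrate the strengths and weaknesses of the various models and the influence of different features on CTR prediction results.

Keywords: computational advertising; CTR; machine learning

Cite this article

XIAO Yao, BI Jun-fang, HAN Yi, DONG Qi-wen. Study of click through rate prediction in online advertisement[J]. Journal of East China Normal University (Natural Science), 2017, 2017(5): 80-86,100. DOI: 10.3969/j.issn.1000-5641.2017.05.008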

Abstract

With the development of the Internet and the growth of its user base, the advertising industry is gradually shifting from the traditional offline model to an online model. At the same time, thanks to big data analysis technology, online advertising shows great advantages over traditional advertising. Advertisers compete with one another, bidding in auctions to place their advertisements in specific slots on media platforms. It is therefore important to predict the click-through rate (CTR) of a given advertisement before the auction, so that advertisers can reduce costs and increase their potential revenue. This paper surveys commonly used CTR prediction models, uses information about advertisers, advertisements, and media platforms as machine learning features, and uses real data sets to compare the strengths and weaknesses of the models and to show how different features affect the predicted CTR.
