Data Middle Platform Applications

Merchant churn prediction based on transaction data of aggregate payment platform

  • XU Yiwen,
  • LI Xiaoyang,
  • DONG Qiwen,
  • QIAN Weining,
  • ZHOU Fang
  • School of Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received date: 2020-08-16

Online published: 2020-09-24

Funding

National Natural Science Foundation of China (61902127); Natural Science Foundation of Shanghai (19ZR1415700)

Cite this article

XU Yiwen, LI Xiaoyang, DONG Qiwen, QIAN Weining, ZHOU Fang. Merchant churn prediction based on transaction data of aggregate payment platform[J]. Journal of East China Normal University (Natural Science), 2020, 2020(5): 167-178. DOI: 10.3969/j.issn.1000-5641.202091016

Abstract

For an aggregate payment platform, keeping the merchant churn rate low is key to reducing operating costs and raising profit margins. This paper studies merchant churn prediction for aggregate payment platforms, with the goal of helping the platform win back, in time, merchants who are likely to churn. Based on transaction records and basic merchant information, the paper proposes a set of features closely related to merchant churn and applies several traditional machine learning models to predict churn. Because merchants' transaction records are sequential, the paper additionally builds several LSTM-based time series models. Experiments on a real dataset show that the hand-crafted features have some predictive power and yield interpretable results, and that the time series models learn the temporal characteristics of the data well, which further improves prediction.
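As a rough illustration of the two kinds of input the abstract mentions, the sketch below turns a raw transaction log into per-merchant daily sequences (the sort of input a sequence model could consume) and then derives a few hand-crafted features from each sequence. Everything specific here is an assumption made for illustration: the `(merchant_id, day_index, amount)` record layout, the toy values, and the feature names are hypothetical and are not the features proposed in the paper.

```python
from collections import defaultdict

# Synthetic toy log, for illustration only: (merchant_id, day_index, amount).
TRANSACTIONS = [
    ("m1", 0, 120.0), ("m1", 1, 80.0), ("m1", 1, 40.0), ("m1", 5, 60.0),
    ("m2", 0, 300.0), ("m2", 1, 250.0), ("m2", 2, 10.0),
]


def daily_sequences(transactions, n_days):
    """Aggregate raw transactions into per-merchant daily [count, amount] rows."""
    seqs = defaultdict(lambda: [[0, 0.0] for _ in range(n_days)])
    for merchant_id, day, amount in transactions:
        if 0 <= day < n_days:  # ignore records outside the observation window
            seqs[merchant_id][day][0] += 1
            seqs[merchant_id][day][1] += amount
    return dict(seqs)


def handcrafted_features(seq):
    """Derive a few hypothetical churn-related features from one daily sequence."""
    n_days = len(seq)
    counts = [count for count, _ in seq]
    amounts = [amount for _, amount in seq]
    active = [day for day, count in enumerate(counts) if count > 0]
    half = n_days // 2
    return {
        "total_count": sum(counts),
        "total_amount": sum(amounts),
        "active_days": len(active),
        # Days between the last transaction and the end of the window.
        "days_since_last_txn": n_days - 1 - active[-1] if active else n_days,
        # Negative values mean activity dropped in the second half of the window.
        "activity_trend": sum(counts[half:]) - sum(counts[:half]),
    }


if __name__ == "__main__":
    for merchant_id, seq in daily_sequences(TRANSACTIONS, n_days=6).items():
        print(merchant_id, handcrafted_features(seq))
```

Under these assumptions, the feature dictionaries would go to a traditional classifier, while the daily sequences themselves would be fed to a sequence model such as an LSTM, so that the model can learn temporal patterns directly instead of relying on summary statistics.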
