Journal of East China Normal University (Natural Science) ›› 2017, Vol. ›› Issue (4): 97-113,138. doi: 10.3969/j.issn.1000-5641.2017.04.009

• Computer Science •

Business circle population mobility statistics based on mobile trajectory data

LIU Zhi, LIU Hui-ping, ZHAO Da-peng, WANG Xiao-ling

  1. Institute for Data Science and Engineering, School of Computer Science and Software Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2016-07-20 Online: 2017-07-25 Published: 2017-07-20
  • About the author: LIU Zhi, male, master's student; research interest: data mining. E-mail: lyz8538350@126.com.

Abstract: With the advance of urbanization and the continuous development of big data technology, smart business circles have become an important part of smart city construction. The popularity of a business circle, the size of its consumer population, and its consumption level have become focal points in building smart business circles. However, consumer populations are still measured with traditional questionnaires or sampling, which are both costly and inefficient. Advances in data mining now make it possible to estimate the consumer scale of a business circle by analyzing user trajectories. In this paper, we propose a method for analyzing the consumer scale of a business circle based on trajectory data. The main contributions are as follows: ① Determining the boundary of a business circle in trajectory data is the first problem to solve, since only then can we decide whether a consumer is active inside or outside the circle. We propose a method that delineates the extent of a business circle from the location distribution of the base stations within it, using the k-Nearest Neighbor (kNN) classification algorithm. ② Because trajectory data are uncertain, determining the relationship between a user and a business circle is also difficult. We compute the weight of each base station using an irregular-polygon area calculation and combine it with a time threshold to analyze the daily consumer scale in the region. ③ Finally, given the massive volume of trajectory data, we propose a big data computing framework, BPDA (Business-Circle Parallel Distributed Algorithm), built on the Hadoop big data platform and the Kafka distributed messaging system, and use it to implement a business-circle consumer scale analysis system. Using base station data from the Zhongshan Park business circle, we demonstrate the feasibility of the proposed method.

Key words: mobile trajectory data, consumer scale analysis, kNN classification algorithm, business circle outline
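
The three building blocks named in the abstract (kNN classification, irregular-polygon area, and a time threshold) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how areas are turned into base station weights or how BPDA distributes the work, so only the generic techniques are shown. The function names, the planar (x, y) coordinates, the "inside"/"outside" labels, and the sample values are all assumptions made for illustration; real latitude/longitude data would first need to be projected.

```python
from collections import Counter
from math import hypot


def knn_label(point, labeled_points, k=3):
    """Label a point by majority vote among its k nearest labeled points.

    labeled_points is a list of ((x, y), label) pairs; distance is plain
    Euclidean distance in the plane (an assumption for this sketch).
    """
    nearest = sorted(
        labeled_points,
        key=lambda item: hypot(item[0][0] - point[0], item[0][1] - point[1]),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices must be listed in order around the boundary
    (clockwise or counter-clockwise).
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def count_consumers(dwell_minutes, threshold_minutes):
    """Count users whose dwell time in the circle reaches the threshold."""
    return sum(
        1 for minutes in dwell_minutes.values() if minutes >= threshold_minutes
    )


# Hypothetical base stations already labeled as inside/outside a business circle.
stations = [
    ((0.0, 0.0), "inside"), ((1.0, 0.0), "inside"), ((0.0, 1.0), "inside"),
    ((5.0, 5.0), "outside"), ((6.0, 5.0), "outside"), ((5.0, 6.0), "outside"),
]
new_station_label = knn_label((0.5, 0.5), stations, k=3)

# Hypothetical L-shaped (irregular) coverage polygon.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
l_shape_area = polygon_area(l_shape)

# Hypothetical per-user dwell times (minutes) for one day.
daily_count = count_consumers({"u1": 45, "u2": 5, "u3": 30}, threshold_minutes=30)
```

In this sketch the time threshold is a simple filter on total dwell time; a weighted variant would multiply each user's contribution by a base station weight, which the abstract mentions but does not define.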
