Storage and load balancing for large-scale comment data on heterogeneous Redis cluster

ZHANG Jing-wei; DING Zhi-jun; YANG Qing; ZHANG Hui-bing; ZHANG Hai-tao; ZHOU Ya

doi:10.3969/j.issn.1000-5641.2017.05.003

Journal of East China Normal University (Natural Science)

2017, Vol. 2017, Issue 5: 20-29

Data Management

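1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin Guangxi 541004, China;
2. Guangxi Key Laboratory of Automatic Detection Technology and Instrument, Guilin University of Electronic Technology, Guilin Guangxi 541004, China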

ZHANG Jing-wei, male, Ph.D., associate professor; research interest: massive data management. E-mail: gtzjw@hotmail.com

Received date: 2017-06-30

Online published: 2017-09-25

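Funding

National Natural Science Foundation of China (61363005, 61462017, U1501252); Guangxi Natural Science Foundation (2014GXNSFAA118353, 2014GXNSFAA118390); Foundation of the Guangxi Key Laboratory of Automatic Detection Technology and Instrument (YQ15110); Project for Improving the Basic Ability of Young and Middle-aged Teachers in Guangxi Universities (ky2016YB156)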

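Cite this article

ZHANG Jing-wei, DING Zhi-jun, YANG Qing, ZHANG Hui-bing, ZHANG Hai-tao, ZHOU Ya. Storage and load balancing for large-scale comment data on heterogeneous Redis cluster[J]. Journal of East China Normal University (Natural Science), 2017, 2017(5): 20-29. DOI: 10.3969/j.issn.1000-5641.2017.05.003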

Abstract

The storage and query performance of large-scale comment data strongly affects the response time of the applications built on that data. In a heterogeneous computing environment, nodes differ in storage and computing performance, so a key challenge is to exploit the capacity of every node while optimizing storage and query performance for large-scale comment data. Building on the data management strengths of Redis Cluster, we first propose a storage model for large-scale comment data in a homogeneous environment that balances storage across Redis slots. We then analyze the relationship between the number of slots held by a node and that node's query efficiency, and allocate slots under the principle of "balancing load against access performance". On this basis, we design a load calculation and storage allocation method for the nodes of a heterogeneous Redis cluster, which makes full use of the performance of each node. Experimental results show that the proposed storage model balances storage well and improves the overall query efficiency of the cluster.

Key words: large-scale comment data; storage and load balancing; query optimization
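To make the slot-based idea concrete, the following is a minimal sketch and not the authors' method. The key-to-slot mapping follows the publicly documented Redis Cluster rule (CRC16 of the key, or of its hash tag, modulo 16384). The proportional slot allocation is an illustrative assumption: the node names, the per-node `weights` (imagined here as some measured capacity score), and the helper `allocate_slots` are all hypothetical, since the abstract does not give the paper's load formula.

```python
from typing import Dict

TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to its hash slot, honoring a non-empty {hash tag}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]  # hash only the tag content
    return crc16_xmodem(data) % TOTAL_SLOTS


def allocate_slots(weights: Dict[str, float],
                   total: int = TOTAL_SLOTS) -> Dict[str, range]:
    """Hypothetical helper: split slots among nodes in proportion to
    their weights, using the largest-remainder rule so that every
    slot is assigned exactly once."""
    total_weight = sum(weights.values())
    quotas = {n: total * w / total_weight for n, w in weights.items()}
    counts = {n: int(q) for n, q in quotas.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - counts[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    plan, start = {}, 0
    for name, count in counts.items():
        plan[name] = range(start, start + count)  # contiguous slot range
        start += count
    return plan


def node_for_key(key: str, plan: Dict[str, range]) -> str:
    """Find the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    return next(name for name, slots in plan.items() if slot in slots)


if __name__ == "__main__":
    # Assumed capacity scores: node-b is taken to be twice as capable.
    plan = allocate_slots({"node-a": 1.0, "node-b": 2.0, "node-c": 1.0})
    for name, slots in plan.items():
        print(name, slots.start, slots.stop - 1, len(slots))
    print(node_for_key("comment:42", plan))
```

In a homogeneous cluster the weights would all be equal, which reduces to an even split of slots; in a heterogeneous cluster, a node with a larger weight receives proportionally more slots and therefore more of the comment data.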
