Data Management

Storage and load balancing for large-scale comment data on heterogeneous Redis cluster

  • ZHANG Jing-wei
  • DING Zhi-jun
  • YANG Qing
  • ZHANG Hui-bing
  • ZHANG Hai-tao
  • ZHOU Ya
  • 1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin Guangxi 541004, China;
    2. Guangxi Key Laboratory of Automatic Detection Technology and Instrument, Guilin University of Electronic Technology, Guilin Guangxi 541004, China

Received date: 2017-06-30

Published online: 2017-09-25

Abstract

Storage and query performance for large-scale comment data strongly affect the applications built on that data. In a heterogeneous computing environment, nodes differ in storage and computing capacity, so a key challenge is to optimize storage and query performance for large-scale comment data by taking full advantage of each node's capabilities. Building on the capabilities of Redis Cluster, we first design a storage model for large-scale comment data in a homogeneous Redis cluster that balances storage across Redis slots. We then analyze the relationship between the number of Redis slots and query efficiency, and use it to design a method for heterogeneous Redis clusters that allocates storage according to the actual load of each computing node. The method makes full use of each node's performance and guides the allocation of slots to nodes by balancing query performance against storage load. Experimental results show that the proposed model is effective for storage load balancing and improves the query efficiency of heterogeneous Redis clusters.
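
To illustrate the slot mechanism the abstract relies on, the following is a minimal Python sketch, not the authors' method. It uses two documented facts about Redis Cluster: there are 16384 hash slots, and a key's slot is CRC16 (XMODEM variant) of the key modulo 16384. The proportional split in `allocate_slots`, the node names, and the capacity weights are hypothetical assumptions for illustration only; the abstract does not give the paper's load-based allocation rule, so it is not reproduced here. Hash tags (`{...}`) are ignored for brevity.

```python
from typing import Dict

HASH_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key (hash-tag handling omitted in this sketch)."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


def allocate_slots(weights: Dict[str, float]) -> Dict[str, range]:
    """Split slots 0..16383 into contiguous ranges proportional to weights.

    ASSUMPTION: a plain proportional split, standing in for whatever
    load-based rule the paper actually uses.
    """
    total = sum(weights.values())
    names = list(weights)
    ranges: Dict[str, range] = {}
    start = 0
    cumulative = 0.0
    for i, name in enumerate(names):
        cumulative += weights[name]
        # The last node takes every remaining slot so none is left unassigned.
        if i == len(names) - 1:
            end = HASH_SLOTS
        else:
            end = round(HASH_SLOTS * cumulative / total)
        ranges[name] = range(start, end)
        start = end
    return ranges


def node_for_key(key: str, ranges: Dict[str, range]) -> str:
    """Find the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for name, slots in ranges.items():
        if slot in slots:
            return name
    raise ValueError(f"slot {slot} is not assigned to any node")


# Hypothetical capacity weights for three nodes of unequal strength.
slot_ranges = allocate_slots({"node-a": 4, "node-b": 2, "node-c": 2})
owner = node_for_key("comment:1001", slot_ranges)
```

With these weights, `node-a` receives half of the slots and the other two nodes a quarter each. In a real cluster the resulting ranges would still have to be applied through slot migration, which this sketch does not attempt.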

Cite this article

ZHANG Jing-wei, DING Zhi-jun, YANG Qing, ZHANG Hui-bing, ZHANG Hai-tao, ZHOU Ya. Storage and load balancing for large-scale comment data on heterogeneous Redis cluster[J]. Journal of East China Normal University (Natural Science), 2017, 2017(5): 20-29. DOI: 10.3969/j.issn.1000-5641.2017.05.003
