华东师范大学学报(自然科学版) ›› 2014, Vol. 2014 ›› Issue (5): 228-239.doi: 10.3969/j.issn.10005641.2014.05.020

• 计算机科学与技术 • 上一篇    下一篇

LCDJ:面向内存集群计算的局部感知连接算法

张磊, 周敏奇, 王立   

  1. 华东师范大学 软件学院,上海 200062
  • 出版日期:2014-09-25 发布日期:2014-11-27
  • 通讯作者: 周敏奇,男,博士,副教授,硕士生导师,研究方向为内存数据库系统. E-mail:mqzhou@sei.ecnu.edu.cn
  • 作者简介:张磊,男,硕士研究生,研究方向为内存数据库系统. E-mail: zhangleicasa@gmail.com.
  • 基金资助:

    国家自然科学基金(61332006)

LCDJ: Locality conscious join algorithm for in-memory cluster computing

 ZHANG  Lei, ZHOU  Min-Qi, WANG  Li   

  1. Data Science and Engineering Institute, Software Engineering Institute, East China Normal University, Shanghai 200062, China
  • Online:2014-09-25 Published:2014-11-27

摘要: 等值连接是数据库系统中最为重要的操作之一,哈希连接在处理等值连接时,表现出较高的性能.在分布式内存数据库系统中,数据即已分布式地存储于多个节点上,哈希连接通常情况需要将参与连接的两个关系表在连接属性上按照相同的哈希函数进行数据重分区,从而保证连接属性值相同的元组被传输到同一个节点上进行本地连接操作.由于内存数据处理速率远远高于网络的数据传输速率,因此数据重分区占据了连接算法的绝大部分时间,成为分布式内存数据库系统中等值连接操作的性能瓶颈.本文提出了一种新颖的分布式内存数据库环境下的等值连接算法LCDJ(Locality Conscious Distributed Join),在充分利用高效的内存计算的同时尽量减少网络数据传输量.算法首先对每个表连接属性的数据分布进行精确的统计,并结合并行度和计算负载均衡因素,进而建立代价模型来衡量不同调度策略下的时间开销,并求出最优的调度策略.LCDJ实现于基于内存的分布式原型系统Claims中.实验结果表明,本文所提算法有效地降低了网络传输代价,大幅度减少了响应时间,比起当前流行的Hive和Shark等系统有明显的性能提升.

关键词: 分布式哈希连接, 内存数据库, 网络传输优化, 负载均衡, 分布式系统

Abstract: Equal join is one of the most important operators in database systems. Hash join is an efficient algorithm for equal join. In distributed inmemory database system, data tables are partitioned across multiple nodes. Hash join needs two input tables to be repartitioned on the joined attributes under the same hash function before local join, to make sure that tuples from the two tables with the same join values are transferred to the same node. Since the speed of data processing in memory is much faster than the speed of network, data repartition occupies a large amount of time and becomes the bottleneck of equal join in distributed in-memory database. This paper proposes a novel equal join algorithm, which takes full advantages of in-memory computing and reduces the volume of data to be transferred. The algorithm first accumulates accurate statistics on the joined attributes of two tables, and then builds a cost model to measure the cost of different schedule strategies, and generates the optimized schedule strategy. Furthermore, the degree of parallelism and computing load balance are taken into consideration in our cost model. The proposed algorithm is implemented on our in-memory distributed prototype system Claims. Extensive experiment confirms that the algorithm effectively reduces the cost of network communication, improves the query response time by a huge margin, and gets a higher performance than Hive and Shark.

Key words: distributed hash join, in-memory database, network communication optimization, computing load balance, distributed system

中图分类号: