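Funding
National Natural Science Foundation of China (61332006); National High Technology Research and Development Program of China (863 Program) (2015AA015307)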
Implementation of Semi-Join algorithm in a distributed system
Received date: 2016-07-07
Online published: 2016-11-29
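QIAN Zhaoming, WANG Lei, YU Shengjun, GONG Xueqing. Implementation of Semi-Join algorithm in a distributed system[J]. Journal of East China Normal University (Natural Science), 2016, 2016(5): 75-80. DOI: 10.3969/j.issn.1000-5641.2016.05.009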
As new distributed systems are used ever more widely, applications are no longer satisfied with reading data through primary-key access alone, and how to implement complex operations such as Join efficiently in these systems has become an active research topic. This paper describes how to implement the Join operation in a distributed system based on the Semi-Join algorithm, proposes two methods for obtaining the right-table data, and analyzes the performance of the algorithm through experiments.
Key words: distributed database; Join operation; Semi-Join algorithm
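
The general idea of a Semi-Join can be illustrated with a minimal Python sketch. It is not the implementation described in the paper: the abstract does not specify the two methods for obtaining right-table data, so the sketch simply filters the right table by the set of left-table join keys. The function name `semi_join`, its parameters, and the sample tables are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def semi_join(left_rows, right_rows, left_key, right_key):
    """Join two lists of dict rows, transferring only matching right rows.

    Returns the joined rows and the number of right rows that would have
    been shipped across the network in a distributed setting.
    """
    # Step 1: collect the distinct join-key values of the left table.
    keys = {row[left_key] for row in left_rows}

    # Step 2: reduce the right table to rows whose key is in that set.
    # Assumption: in a distributed system this filter would run on the
    # node that stores the right table, so only these rows are shipped.
    reduced_right = [row for row in right_rows if row[right_key] in keys]

    # Step 3: join locally with a hash index over the reduced right rows.
    index = defaultdict(list)
    for row in reduced_right:
        index[row[right_key]].append(row)

    joined = [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[left_key], [])
    ]
    return joined, len(reduced_right)


# Hypothetical sample data: two orders, four customers.
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 20},
]
customers = [
    {"cid": 10, "name": "A"},
    {"cid": 30, "name": "C"},
    {"cid": 20, "name": "B"},
    {"cid": 40, "name": "D"},
]

joined, shipped = semi_join(orders, customers, "cust_id", "cid")
```

In this sketch only two of the four customer rows are transferred, which is the saving a Semi-Join aims for when the left table selects a small fraction of the right table.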