Implementation of Semi-Join algorithm in a distributed system

  • QIAN Zhao-ming
  • WANG Lei
  • YU Sheng-jun
  • GONG Xue-qing
  • Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received date: 2016-07-07

Online published: 2016-11-29

Abstract

As new distributed systems are applied more widely, applications are no longer satisfied with reading data by primary key alone, and how to efficiently implement complex operations such as Join in these systems has become an active research topic. This paper describes how to implement the Join operation in a distributed system based on the Semi-Join algorithm, proposes two ways to fetch the data of the right table, and analyzes the performance of the algorithm through experiments.
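
To make the idea concrete, the following is a minimal, single-process sketch of the generic semi-join technique, not the paper's implementation. The function names `semi_join_reduce` and `semi_join`, the row-as-dictionary layout, and the sample tables are all assumptions chosen for illustration; the two right-table access methods the paper proposes are not described in the abstract and are not modeled here. The sketch shows the core idea: only the distinct join keys of the left table are shipped to the right table's site, the right table is filtered there, and only the matching rows travel back to be joined.

```python
from collections import defaultdict


def semi_join_reduce(left_rows, right_rows, key):
    """Keep only the right-table rows whose join key occurs in the left table.

    In a distributed setting, `left_keys` is the only data sent to the
    site that stores the right table (hypothetical single-process stand-in).
    """
    left_keys = {row[key] for row in left_rows}  # project distinct join keys
    return [row for row in right_rows if row[key] in left_keys]


def semi_join(left_rows, right_rows, key):
    """Join the left table with the semi-join-reduced right table."""
    reduced_right = semi_join_reduce(left_rows, right_rows, key)

    # Build a hash index over the (smaller) reduced right table.
    index = defaultdict(list)
    for r in reduced_right:
        index[r[key]].append(r)

    # Probe the index with each left row and merge the matching rows.
    joined = []
    for l in left_rows:
        for r in index.get(l[key], []):
            joined.append({**l, **r})
    return joined


# Illustrative data (assumed, not from the paper).
left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [
    {"id": 2, "amount": 10},
    {"id": 3, "amount": 20},
    {"id": 2, "amount": 30},
]
result = semi_join(left, right, "id")
```

In this example only the key set `{1, 2}` would cross the network to the right table's site, and the row with `id` 3 is never sent back, which is where the technique saves transfer cost when few right-table rows match.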

Cite this article

QIAN Zhao-ming, WANG Lei, YU Sheng-jun, GONG Xue-qing. Implementation of Semi-Join algorithm in a distributed system[J]. Journal of East China Normal University (Natural Science), 2016, 2016(5): 75-80. DOI: 10.3969/j.issn.1000-5641.2016.05.009
