Received date: 2016-06-24
Online published: 2016-11-29
Funding
Supported by the Fundamental Research Funds for the Central Universities (3102015JSJ0004)
Optimization strategies of correlated subquery for distributed database
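MAO Siyu, ZHANG Lijun, ZHANG Xiaofang, GAO Jintao, LI Zhanhuai. Optimization strategies of correlated subquery for distributed database[J]. Journal of East China Normal University (Natural Science), 2016, 2016(5): 56-66. DOI: 10.3969/j.issn.1000-5641.2016.05.007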
A query that appears inside another query as a filter is called a subquery; if the subquery's filtering condition depends on its parent query, it is called a correlated subquery. A query with a correlated subquery is generally expensive to execute because the subquery is executed many times, which causes repeated disk accesses and, in a distributed system, extra communication. Building on an investigation of the classical optimization strategies for correlated subqueries and on the characteristics of distributed systems, we adopt three strategies to optimize correlated subqueries in a distributed database system: pulling up the subquery, removing useless trees, and eliminating aggregation functions. We implement these strategies in the distributed relational database OceanBase for the correlated subquery predicate EXISTS. Experimental results show that these strategies can significantly improve the performance of correlated subqueries.
Key words: distributed database; correlated subquery; subquery optimization
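To make the idea of pulling up a subquery concrete, the following is a minimal sketch of a correlated EXISTS subquery and an equivalent join form. It is not taken from the paper and does not reflect OceanBase's implementation: it uses Python's built-in sqlite3 module, and the tables `customers` and `orders` and their contents are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the rewrite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 20.0), (12, 3, 75.0);
""")

# Correlated form: the inner query refers to c.id from the parent query,
# so a naive executor evaluates it once per customer row.
correlated = """
    SELECT c.name FROM customers AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
    ORDER BY c.name
"""

# Pulled-up form: the subquery becomes a join against the distinct
# customer ids in orders, so orders is read once. DISTINCT keeps a
# customer with several orders from appearing more than once.
pulled_up = """
    SELECT c.name FROM customers AS c
    JOIN (SELECT DISTINCT customer_id FROM orders) AS o
      ON o.customer_id = c.id
    ORDER BY c.name
"""

rows_correlated = [row[0] for row in conn.execute(correlated)]
rows_pulled_up = [row[0] for row in conn.execute(pulled_up)]
print(rows_correlated, rows_pulled_up)
```

Both queries return the customers that have at least one order. In a distributed setting the join form is attractive because the inner table can be fetched once rather than once per outer row, although whether an optimizer applies such a rewrite depends on the system.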