收稿日期: 2021-08-04
网络出版日期: 2021-09-28
基金资助
国家自然科学基金(62072179); 2020年关键软件适配验证中心项目
A fuzzer for query processing functionality of OLAP databases
Received date: 2021-08-04
Online published: 2021-09-28
查询处理是现代关系型数据库管理系统(DBMS)中最重要的功能之一, 主要包括查询优化和查询执行. 然而查询处理的复杂性导致了测试的高成本, 阻碍了开发过程中的快速迭代, 并可能在生产环境中导致严重错误. 为了更好地服务于DBMS查询处理功能的评测, 采用模糊测试的方法生成基于主键约束的随机数据和完全有效的复杂分析型查询; 构建约束优化, 对查询中算子的精确基数进行高效计算, 从而获得查询的正确结果; 最后实现了完整的工具. 通过对TiDB的不同版本进行了小规模的测试, 结果表明可以有效地检测出TiDB不同版本的一些Bug.
项兆坤 , 陈婷 , 苏仟 , 张蓉 . 面向OLAP数据库查询处理功能的模糊测试工具[J]. 华东师范大学学报(自然科学版), 2021 , 2021(5) : 74 -83 . DOI: 10.3969/j.issn.1000-5641.2021.05.007
Query processing, including optimization and execution, is one of the most critical functionalities of modern relational database management systems (DBMS). The complexity of query processing functionalities, however, leads to high testing costs. It hinders rapid iterations during the development process and can lead to severe errors when deployed in production environments. In this paper, we propose a tool to better serve the testing and evaluation of DBMS query processing functionalities; the tool uses a fuzzing approach to generate random data that is highly associated with primary keys and generates valid complex analytical queries. The tool constructs constrained optimization problems to efficiently compute the exact cardinalities of operators in queries and furnish the results. We launched small-scale testing of our method on different versions of TiDB and demonstrated that the tool can effectively detect bugs in different versions of TiDB.
Key words: OLAP database; query processing; query execution; exact cardinality; fuzzing
1 | SILBERSCHATZ A, KORTH H F, SUDARSHAN S. Database Systems Concepts [M]. 5th ed. New York: McGraw-Hill Book Company, 2005. |
2 | CHAUDHURI S. An overview of query optimization in relational systems [C]//Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems - PODS ’98. Seattle: ACM Press, 1998: 34–43. |
3 | LAN H, BAO Z, PENG Y. A survey on advancing the DBMS query optimizer: Cardinality estimation, cost model, and plan enumeration. Data Science and Engineering, 2021, 6 (1): 86- 101. |
4 | LEIS V, GUBICHEV A, MIRCHEV A, et al. How good are query optimizers, really?. Proceedings of the VLDB Endowment, 2015, 9 (3): 204- 215. |
5 | LI J, ZHAO B, ZHANG C. Fuzzing: A survey. Cybersecurity, 2018, 1 (1): 6. |
6 | MANES V J M, HAN H, HAN C, et al. The art, science, and engineering of fuzzing: A survey [J]. IEEE Transactions on Software Engineering, 2019. DOI: 10.1109/TSE.2019.2946563. |
7 | SLUTZ D R. Massive stochastic testing of SQL [C]//Very Large Data Base. 1998: 618-622. |
8 | SELTENREICH A, TANG B, MULLENDER S. SQLSmith [CP/OL]. [2021-08-01]. https://github.com/anse1/sqlsmith. |
9 | RIGGER M, SU Z. Testing database engines via pivoted query synthesis [C]//14th Symposium on Operating Systems Design and Implementation. 2020: 667-682. |
10 | CHEN X, WANG C, CHEUNG A. Testing query execution engines with mutations [C]//Proceedings of the Workshop on Testing Database Systems. Portland Oregon: ACM, 2020: 1–5. |
11 | GHIT B, POGGI N, ROSEN J, et al. SparkFuzz: Searching correctness regressions in modern query engines [C]//Proceedings of the Workshop on Testing Database Systems. Portland Oregon: ACM, 2020: 1–6. |
12 | ZHONG R, CHEN Y, HU H, et al. SQUIRREL: Testing database management systems with language validity and coverage feedback [C]//Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. Virtual Event USA: ACM, 2020: 955–970. |
13 | JUNG J, HU H, ARULRAJ J, et al. APOLLO: Automatic detection and diagnosis of performance regressions in database systems. Proceedings of the VLDB Endowment, 2019, 13 (1): 57- 70. |
14 | BLAZYTKO T, BISHOP M, ASCHERMANN C, et al. GRIMOIRE: Synthesizing structure while fuzzing [C]//28th Security Symposium (Security 19). 2019: 1985-2002. |
15 | ZALEWSKI M. American fuzzy lop (2.52b)[CP/OL]. [2021-08-01]. http://lcamtuf.coredump.cx/afl. |
16 | BATI H, GIAKOUMAKIS L, HERBERT S, et al. A genetic approach for random testing of database systems [C]//Proceedings of the 33rd International Conference on Very Large Data Bases. 2007: 1243-1251. |
17 | ASCHERMANN C, FRASSETTO T, HOLZ T, et al. NAUTILUS: Fishing for deep bugs with grammars [C]//NDSS. 2019. |
18 | PADHYE R, LEMIEUX C, SEN K, et al. Semantic fuzzing with zest [C]//Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2019: 329-340. |
19 | STILLGER M, FREYTAG J C. Testing the quality of a query optimizer. IEEE Data Engineering Bulletin, 1995, 18 (3): 41- 48. |
20 | GU Z, SOLIMAN M A, WAAS F M. Testing the accuracy of query optimizers [C]//Proceedings of the Fifth International Workshop on Testing Database Systems. 2012: 1-6. |
21 | CHAUDHURI S, NARASAYYA V, RAMAMURTHY R. Exact cardinality query optimization for optimizer testing. Proceedings of the VLDB Endowment, 2009, 2 (1): 994- 1005. |
22 | TRUMMER I. Exact cardinality query optimization with bounded execution cost [C]//Proceedings of the 2019 International Conference on Management of Data. Amsterdam Netherlands: ACM, 2019: 2–17. |
23 | SHIN J H, RUSU F, SUHAN A. Exact selectivity computation for modern in-memory database query optimization [EB/OL]. (2019-01-06)[2021-09-18]. https://arxiv.org/pdf/1901.01488.pdf. |
24 | WOLFRAM. Mathematica 12 [CP/OL]. [2021-08-01]. https://www.wolfram.com/mathematica/. |
/
〈 |
|
〉 |