Key Technologies for System

A fuzzer for query processing functionality of OLAP databases

  • Zhaokun XIANG ,
  • Ting CHEN ,
  • Qian SU ,
  • Rong ZHANG
Expand
  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
    2. China Industrial Control Systems Cyber Emergency Response Team, Beijing 100040, China

Received date: 2021-08-04

  Online published: 2021-09-28

Abstract

Query processing, including optimization and execution, is one of the most critical functionalities of modern relational database management systems (DBMS). The complexity of query processing functionalities, however, leads to high testing costs. It hinders rapid iterations during the development process and can lead to severe errors when deployed in production environments. In this paper, we propose a tool to better serve the testing and evaluation of DBMS query processing functionalities; the tool uses a fuzzing approach to generate random data that is highly associated with primary keys and generates valid complex analytical queries. The tool constructs constrained optimization problems to efficiently compute the exact cardinalities of operators in queries and furnish the results. We launched small-scale testing of our method on different versions of TiDB and demonstrated that the tool can effectively detect bugs in different versions of TiDB.

Cite this article

Zhaokun XIANG , Ting CHEN , Qian SU , Rong ZHANG . A fuzzer for query processing functionality of OLAP databases[J]. Journal of East China Normal University(Natural Science), 2021 , 2021(5) : 74 -83 . DOI: 10.3969/j.issn.1000-5641.2021.05.007

References

1 SILBERSCHATZ A, KORTH H F, SUDARSHAN S. Database Systems Concepts [M]. 5th ed. New York: McGraw-Hill Book Company, 2005.
2 CHAUDHURI S. An overview of query optimization in relational systems [C]//Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems - PODS ’98. Seattle: ACM Press, 1998: 34–43.
3 LAN H, BAO Z, PENG Y. A survey on advancing the DBMS query optimizer: Cardinality estimation, cost model, and plan enumeration. Data Science and Engineering, 2021, 6 (1): 86- 101.
4 LEIS V, GUBICHEV A, MIRCHEV A, et al. How good are query optimizers, really?. Proceedings of the VLDB Endowment, 2015, 9 (3): 204- 215.
5 LI J, ZHAO B, ZHANG C. Fuzzing: A survey. Cybersecurity, 2018, 1 (1): 6.
6 MANES V J M, HAN H, HAN C, et al. The art, science, and engineering of fuzzing: A survey [J]. IEEE Transactions on Software Engineering, 2019. DOI: 10.1109/TSE.2019.2946563.
7 SLUTZ D R. Massive stochastic testing of SQL [C]//Very Large Data Base. 1998: 618-622.
8 SELTENREICH A, TANG B, MULLENDER S. SQLSmith [CP/OL]. [2021-08-01]. https://github.com/anse1/sqlsmith.
9 RIGGER M, SU Z. Testing database engines via pivoted query synthesis [C]//14th Symposium on Operating Systems Design and Implementation. 2020: 667-682.
10 CHEN X, WANG C, CHEUNG A. Testing query execution engines with mutations [C]//Proceedings of the Workshop on Testing Database Systems. Portland Oregon: ACM, 2020: 1–5.
11 GHIT B, POGGI N, ROSEN J, et al. SparkFuzz: Searching correctness regressions in modern query engines [C]//Proceedings of the Workshop on Testing Database Systems. Portland Oregon: ACM, 2020: 1–6.
12 ZHONG R, CHEN Y, HU H, et al. SQUIRREL: Testing database management systems with language validity and coverage feedback [C]//Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. Virtual Event USA: ACM, 2020: 955–970.
13 JUNG J, HU H, ARULRAJ J, et al. APOLLO: Automatic detection and diagnosis of performance regressions in database systems. Proceedings of the VLDB Endowment, 2019, 13 (1): 57- 70.
14 BLAZYTKO T, BISHOP M, ASCHERMANN C, et al. GRIMOIRE: Synthesizing structure while fuzzing [C]//28th Security Symposium (Security 19). 2019: 1985-2002.
15 ZALEWSKI M. American fuzzy lop (2.52b)[CP/OL]. [2021-08-01]. http://lcamtuf.coredump.cx/afl.
16 BATI H, GIAKOUMAKIS L, HERBERT S, et al. A genetic approach for random testing of database systems [C]//Proceedings of the 33rd International Conference on Very Large Data Bases. 2007: 1243-1251.
17 ASCHERMANN C, FRASSETTO T, HOLZ T, et al. NAUTILUS: Fishing for deep bugs with grammars [C]//NDSS. 2019.
18 PADHYE R, LEMIEUX C, SEN K, et al. Semantic fuzzing with zest [C]//Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2019: 329-340.
19 STILLGER M, FREYTAG J C. Testing the quality of a query optimizer. IEEE Data Engineering Bulletin, 1995, 18 (3): 41- 48.
20 GU Z, SOLIMAN M A, WAAS F M. Testing the accuracy of query optimizers [C]//Proceedings of the Fifth International Workshop on Testing Database Systems. 2012: 1-6.
21 CHAUDHURI S, NARASAYYA V, RAMAMURTHY R. Exact cardinality query optimization for optimizer testing. Proceedings of the VLDB Endowment, 2009, 2 (1): 994- 1005.
22 TRUMMER I. Exact cardinality query optimization with bounded execution cost [C]//Proceedings of the 2019 International Conference on Management of Data. Amsterdam Netherlands: ACM, 2019: 2–17.
23 SHIN J H, RUSU F, SUHAN A. Exact selectivity computation for modern in-memory database query optimization [EB/OL]. (2019-01-06)[2021-09-18]. https://arxiv.org/pdf/1901.01488.pdf.
24 WOLFRAM. Mathematica 12 [CP/OL]. [2021-08-01]. https://www.wolfram.com/mathematica/.
Outlines

/