Received date: 2023-07-01
Online published: 2023-09-20
Funding
National Natural Science Foundation of China (61972149, U22B2020)
Hybrid granular buffer management scheme for storage and computing separation architecture
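Storage-compute separation has become a system architecture for improving the performance and efficiency of large-scale data processing, but its storage layer suffers from low access efficiency, high network overhead, and poor handling of small files, which creates a severe performance bottleneck. ClickHouse, a MergeTree-based database, generates many small files when storing data. In the ClickHouse and S3 storage-compute separation scheme, the SSD (solid-state drive) cache uses a fixed file granularity, which not only mismatches the data held in memory but also wastes cache space. This paper proposes HG-Buffer (hybrid granularity buffer), a cache management scheme for storage-compute separation architectures. It aims to optimize the ClickHouse and S3 scheme and to address the small-file problem of object storage, improving cache space utilization and, in turn, system access efficiency. HG-Buffer uses the SSD as a cache layer between the compute layer and the storage layer and organizes the SSD buffer into two granularities: an object buffer and a block buffer. The object cache granularity is the data granularity of object storage, whereas the block cache granularity is the granularity at which the system accesses data; a block is a subset of an object. By collecting data hotness statistics, HG-Buffer adaptively chooses where to store data, which improves SSD space utilization and thus system performance. Experimental evaluation on ClickHouse and S3 demonstrates the effectiveness and robustness of HG-Buffer.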
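MEI Wenjuan, CAI Peng. Hybrid granular buffer management scheme for storage and computing separation architecture [J]. Journal of East China Normal University (Natural Science), 2023, 2023(5): 26-39. DOI: 10.3969/j.issn.1000-5641.2023.05.003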
Storage-compute separation has emerged as an architecture for improving the performance and efficiency of large-scale data processing. However, it has notable performance bottlenecks, primarily the low access efficiency of object storage and the significant network overhead. Object storage also handles small files inefficiently, and ClickHouse, a MergeTree-based database, generates a large number of small files when storing data. To address these challenges, HG-Buffer (hybrid granularity buffer) is introduced as an SSD (solid-state drive)-based cache management scheme that optimizes storage-compute separation in ClickHouse and S3 while also tackling the small-file problem in object storage. The primary objective of HG-Buffer is to reduce network transmission overhead and improve system access efficiency. It does so by placing an SSD cache layer between the compute and storage layers and organizing the SSD buffer into two granularities: an object buffer and a block buffer. The object buffer granularity corresponds to the data granularity of object storage, while the block buffer granularity is the granularity at which the system accesses data; each block is a subset of an object. By collecting data hotness statistics, HG-Buffer adaptively selects where data is stored, improving SSD space utilization and system performance. Experimental evaluations on ClickHouse and S3 demonstrate the effectiveness and robustness of HG-Buffer.
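The two-granularity idea can be illustrated with a small sketch. The Python toy below is not the authors' implementation: the class name, the promotion rule (the fraction of distinct blocks of an object that have been touched), the `promote_ratio` threshold, and the shared LRU eviction are all assumptions made for illustration. The abstract states only that HG-Buffer uses hotness statistics to choose between an object buffer and a block buffer.

```python
from collections import OrderedDict, defaultdict


class HybridGranularityBuffer:
    """Toy two-granularity cache: entries are whole objects or single blocks."""

    def __init__(self, capacity, blocks_per_object, block_size, promote_ratio=0.5):
        self.capacity = capacity                  # SSD budget in bytes
        self.blocks_per_object = blocks_per_object
        self.block_size = block_size
        self.promote_ratio = promote_ratio        # hypothetical hotness threshold
        self.used = 0
        # One LRU order shared by both granularities; key -> size in bytes.
        # Keys are ("obj", object_id) or ("blk", object_id, block_no).
        self.entries = OrderedDict()
        # Distinct blocks touched per object: the (assumed) hotness statistic.
        self.touched = defaultdict(set)

    def _insert(self, key, size):
        if size > self.capacity:
            return  # can never fit; skip caching
        # Evict least recently used entries until the new one fits.
        while self.used + size > self.capacity and self.entries:
            _, freed = self.entries.popitem(last=False)
            self.used -= freed
        self.entries[key] = size
        self.used += size

    def _drop_blocks_of(self, object_id):
        stale = [k for k in self.entries if k[0] == "blk" and k[1] == object_id]
        for key in stale:
            self.used -= self.entries.pop(key)

    def access(self, object_id, block_no):
        """Return 'object-hit', 'block-hit', or 'miss', updating the cache."""
        self.touched[object_id].add(block_no)
        obj_key = ("obj", object_id)
        blk_key = ("blk", object_id, block_no)
        if obj_key in self.entries:
            self.entries.move_to_end(obj_key)
            return "object-hit"
        hit = blk_key in self.entries
        hot_ratio = len(self.touched[object_id]) / self.blocks_per_object
        if hot_ratio >= self.promote_ratio:
            # Most of the object is in use: cache it whole and drop its blocks.
            self._drop_blocks_of(object_id)
            self._insert(obj_key, self.blocks_per_object * self.block_size)
        elif hit:
            self.entries.move_to_end(blk_key)
        else:
            # Cold object: cache only the block that was actually read.
            self._insert(blk_key, self.block_size)
        return "block-hit" if hit else "miss"


buf = HybridGranularityBuffer(capacity=8 * 4096, blocks_per_object=4, block_size=4096)
buf.access("part_a", 0)  # 1 of 4 blocks touched: cached at block granularity
buf.access("part_a", 1)  # 2 of 4 touched: promoted to object granularity
buf.access("part_a", 3)  # served from the object buffer
```

In this sketch a sparsely read object costs only `block_size` bytes of SSD space per touched block, while a heavily read object is held as one unit. That trade-off is the kind of space saving the abstract attributes to mixing granularities.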