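Received: 2023-06-29
Published online: 2023-09-20
Funding
National Natural Science Foundation of China (62072179); Open Project of the Key Laboratory of Performance and Reliability Evaluation for Basic Software and Hardware, Ministry of Industry and Information Technology; OceanBase Joint Laboratory Project
An HTAP database prototype with an adaptive data synchronization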
YU Rong, YANG Panfei, WANG Qingshuai, ZHANG Rong. An HTAP database prototype with an adaptive data synchronization[J]. Journal of East China Normal University (Natural Science), 2023, 2023(5): 11-25. DOI: 10.3969/j.issn.1000-5641.2023.05.002
In HTAP (hybrid transactional and analytical processing) databases, resource isolation and data sharing are hard to reconcile. Although vendors achieve resource isolation through different architectures, the freshness that users care about, that is, the gap between the version written by online transaction processing (OLTP) and the version read by online analytical processing (OLAP), is determined by the consistency model of data sharing. However, to save cost and simplify implementation, existing HTAP databases apply only a single consistency synchronization model. This conflicts with the varied consistency requirements of user applications: serving every query at the strictest consistency level that any user needs lowers overall system performance. This paper constructs a cost model of the tradeoff between freshness and performance, proposes a consistency switching algorithm together with a strategy for handling synchronized data before and after a switch, and implements an HTAP database prototype that switches adaptively between sequential consistency synchronization and linear consistency (linearizable) synchronization. The prototype supports query workloads with different consistency (freshness) requirements and maximizes system performance without changes to the HTAP architecture. Experiments verify the effectiveness of adaptive switching.
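The idea of switching between synchronization modes according to a freshness/performance cost model can be illustrated with a minimal sketch. This is not the paper's algorithm: the names `SyncMode`, `Query`, `ModeCost`, and `choose_mode`, as well as the per-mode lag and latency figures, are hypothetical and serve only to show the shape of such a decision.

```python
from dataclasses import dataclass
from enum import Enum


class SyncMode(Enum):
    SEQUENTIAL = "sequential"      # OLAP may read an ordered but slightly stale prefix
    LINEARIZABLE = "linearizable"  # OLAP reads must observe the latest committed write


@dataclass
class Query:
    max_staleness_ms: float  # hypothetical per-query freshness requirement


@dataclass
class ModeCost:
    replication_lag_ms: float  # hypothetical staleness observed under this mode
    latency_ms: float          # hypothetical per-query latency under this mode


def choose_mode(queries, costs):
    """Pick the lowest-latency mode whose lag satisfies the strictest freshness bound."""
    # With no queries there is no freshness constraint at all.
    strictest = min((q.max_staleness_ms for q in queries), default=float("inf"))
    feasible = [m for m, c in costs.items() if c.replication_lag_ms <= strictest]
    if not feasible:
        # No mode meets the bound: fall back to the strongest guarantee.
        return SyncMode.LINEARIZABLE
    return min(feasible, key=lambda m: costs[m].latency_ms)


# Assumed, illustrative measurements for each mode.
COSTS = {
    SyncMode.SEQUENTIAL: ModeCost(replication_lag_ms=50.0, latency_ms=10.0),
    SyncMode.LINEARIZABLE: ModeCost(replication_lag_ms=0.0, latency_ms=30.0),
}

relaxed = choose_mode([Query(100.0), Query(200.0)], COSTS)
strict = choose_mode([Query(100.0), Query(0.0)], COSTS)
```

Under these assumed numbers, a workload that tolerates 100 ms of staleness runs under the cheaper sequential mode, while a single query that requires zero staleness moves the system to linearizable synchronization. A real system would also need to handle data that is in flight at the moment of the switch, which this sketch leaves out.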