收稿日期: 2021-09-09
网络出版日期: 2023-03-23
基金资助
国家自然科学基金(U1711263); 广西自然科学基金 (2018GXNSFAA281199, 2020GXNSFAA159117); 广西高校中青年教师基础能力提升项目(2018KY0651)
A memory allocation strategy for learned index based on huge pages
Received date: 2021-09-09
Online published: 2023-03-23
大数据时代, 数据信息的不断膨胀给数据的快速存取带来了巨大挑战. 因此, 设计一种高效的索引结构具有重要意义. ALEX (updatable adaptive learned index)是一种利用机器学习模型代替传统B-树索引结构的学习索引, 具有较好的时间、空间性能, 但存在频繁的缺页中断问题. 为解决此问题, 进一步提升ALEX性能, 在ALEX基础上提出了一种基于大页内存的内存预分配策略, 较好地降低了内存缺页中断率, 提升了ALEX性能. 在内存分配阶段, 采用预分配策略; 在内存回收阶段, 则采用延迟释放策略. 在Longitudes数据集上的实验表明, 该策略具有良好的效果.
官嘉林 , 朱艳 , 吴庭亮 , 陈艳 , 张敬伟 . 基于大页内存的学习索引内存分配策略[J]. 华东师范大学学报(自然科学版), 2023 , 2023(2) : 73 -81 . DOI: 10.3969/j.issn.1000-5641.2023.02.009
In the era of big data and with the continuous expansion of data, there are significant challenges with efficient access to data. Hence, designing an efficient index structure is of great significance. ALEX (updatable adaptive learned index) is a learned index that uses a machine learning model to replace the traditional B-tree index structure. Although it offers good time and space performance, it suffers from frequent page faults. In order to solve this problem and further improve the performance of ALEX, a memory pre-allocation strategy based on huge pages is proposed, on the basis of ALEX, that can help reduce the rate of memory page faults and improve the overall performance of ALEX. In the memory allocation phase, the pre-allocation strategy is adopted, and the memory free phase adopts a delayed release strategy. Experiments on the Longitudes dataset show that this strategy offers good performance.
Key words: learned index; huge pages; data access
1 | GARCIA-MOLINA H, ULLMAN J D, WIDOM J. Database System Implementation [M]. Upper Saddle River, NJ, USA: Prentice Hall, 2000: 47-95. |
2 | ALEXIOU K, KOSSMANN D, LARSON P ?. Adaptive range filters for cold data: Avoiding trips to siberia. Proceedings of the VLDB Endowment, 2013, 6 (14): 1714- 1725. |
3 | KIM C, CHHUGANI J, SATISH N, et al. FAST: Fast architecture sensitive tree search on modern CPUs and GPUs [C]// Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. ACM, 2010: 339-350. |
4 | FAN B, ANDERSEN D G, KAMINSKY M, et al. Cuckoo filter: Practically better than bloom [C]// Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies. ACM, 2014: 75-88. |
5 | GRAEFE G, LARSON P A. B-tree indexes and CPU caches [C]// Proceedings 17th International Conference on Data Engineering. IEEE, 2001: 349-358. |
6 | RICHTER S, ALVAREZ V, DITTRICH J. A seven-dimensional analysis of hashing methods and its implications on query processing. Proceedings of the VLDB Endowment, 2015, 9 (3): 96- 107. |
7 | ZHANG H, ANDERSEN D G, PAVLO A, et al. Reducing the storage overhead of main-memory OLTP databases with hybrid indexes [C]// Proceedings of the 2016 International Conference on Management of Data. ACM, 2016: 1567-1581. |
8 | 李国良, 周煊赫, 孙佶, 等. 基于机器学习的数据库技术综述. 计算机学报, 2020, 43 (11): 2019- 2049. |
9 | KRASKA T, BEUTEL A, CHI E H, et al. The case for learned index structures [C]// Proceedings of the 2018 International Conference on Management of Data. ACM, 2018: 489-504. |
10 | DING J L, MINHAS U F, YU J, et al. ALEX: An updatable adaptive learned index [C]// Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. ACM, 2020: 969-984. DOI: 10.1145/3318464.3389711. |
11 | LI X, LI J, WANG X. ASLM: Adaptive single layer model for learned index [C]// International Conference on Database Systems for Advanced Applications. Cham: Springer, 2019: 80-95. |
12 | WU J C, ZHANG Y, CHEN S M, et al. Updatable learned index with precise positions. Proceedings of the VLDB Endowment, 2021, 14 (8): 1276- 1288. |
/
〈 |
|
〉 |