Journal of East China Normal University(Natural Science) ›› 2021, Vol. 2021 ›› Issue (5): 94-103.doi: 10.3969/j.issn.1000-5641.2021.05.009
• Key Technologies for System • Previous Articles Next Articles
Received:
2021-08-10
Online:
2021-09-25
Published:
2021-09-28
Contact:
Peng CAI
E-mail:pcai@dase.ecnu.edu.cn
CLC Number:
Jiabo SUN, Peng CAI. Query optimization technology based on an LSM-tree[J]. Journal of East China Normal University(Natural Science), 2021, 2021(5): 94-103.
Table 1
Overview of optimization methods for cache invalidation"
方法 | 技术 | 缺点 |
Dedicated Compaction Servers | 专用合并服务器处理合并操作, 预热算法替换缓存数据 | 对于访问较分散的负载并不适用 |
SM-tree | 降低合并操作频率 | 范围查询差, 存在大量重复数据 |
Leaper | 机器学习预取数据 | 不适合变化较大的负载, 离线模型训练增大模型复杂度 |
LSbM-tree | 在磁盘上建立合并缓存 | 长范围查询较差, 合并缓存占用磁盘空间, 严重写放大 |
AC-Key | 3 种缓存形式: key-value, key-pointer 和 block | 缓存相同数据浪费内存资源 |
Dual Grained Caches | fine-grained cache, coarse-grained cache | 细粒度缓存同步更新影响写性能 |
1 | ARMSTRONG T G, PONNEKANTI V, BORTHAKUR D, et al. Linkbench: A database benchmark based on the facebook social graph [C]//Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 2013: 1185-1196. |
2 | ATIKOGLU B, XU Y, FRACHTENBERG E, et al. Workload analysis of a large-scale key-value store [C]//Proceedings of the 12th ACM Sigmetrics/Performance Joint International Conference on Measurement and Modeling of Computer Systems. 2012: 53-64. |
3 | CAO Z, DONG S, VEMURI S, et al. Characterizing, modeling, and benchmarking rocksDB key-value workloads at facebook [C]//18th USENIX Conference on File and Storage Technologies (FAST’ 20). 2020: 209-223. |
4 | BU Y, BORKAR V, JIA J, et al. Pregelix: Big(ger) graph analytics on a dataflow engine[J]. Proceedings of the VLDB Endowment, 2015, 8(2): 161–172. |
5 |
DECANDIA G, HASTORUN D, JAMPANI M, et al. Dynamo: Amazon's highly available key-value store. ACM SIGOPS Operating Systems Review, 2007, 41 (6): 205- 220.
doi: 10.1145/1323293.1294281 |
6 | RAJU P, PONNAPALLI S, KAMINSKY E, et al. Mlsm: Making authenticated storage faster in ethereum [C]//10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage’ 18). 2018. |
7 | CAO Z, CHEN S, LI F, et al. LogKV: Exploiting key-value stores for event log processing [C/OL]// 6th Biennial Conference on Innovative Data Systems Research (CIDR’13). 2013. http://cidrdb.org/cidr2013/Papers/CIDR13_Paper46.pdf. |
8 | CHEN G J, WIENER J L, IYER S, et al. Realtime data processing at facebook [C]//Proceedings of the 2016 International Conference on Management of Data. 2016: 1087-1098. |
9 | KONDYLAKIS H, DAYAN N, ZOUMPATIANOS K, et al. Coconut palm: Static and streaming data series exploration now in your palm [C]//Proceedings of the 2019 International Conference on Management of Data. 2019: 1941-1944. |
10 |
KONDYLAKIS H, DAYAN N, ZOUMPATIANOS K, et al. Coconut: Sortable summarizations for scalable indexes over static and streaming data series. The VLDB Journal, 2019, 28 (6): 847- 869.
doi: 10.1007/s00778-019-00573-w |
11 | DGRAPH. Badger Key-value DB in Go[EB/OL]. [2021-08-23]. https://github.com/dgraphio/badger. |
12 | KYROLA A, GUSSTRIN C. GraphChi-DB: Simple design for a scalable graph database system—on just a PC[EB/OL]. (2014-03-04)[2021-08-23]. https://arxiv.org/pdf/1403.0701.pdf. |
13 | ALSUBAIEE S, CAREY M J, LI C. LSM-based storage and indexing: An old idea with timely benefits [C]//Second International ACM Workshop on Managing and Mining Enriched Geo-spatial Data. 2015: 1-6. |
14 | SHETTY P J, SPILLANE R P, MALPANI R R, et al. Building workload-independent storage with VT-trees [C]//11th USENIX Conference on File and Storage Technologies (FAST’ 13). 2013: 17-30. |
15 | JANNEN W, YUAN J, ZHAN Y, et al. BetrFS: A right-optimized write-optimized file system[C]//13th USENIX Conference on File and Storage Technologies (FAST’ 15). 2015: 301-315. |
16 | VINCON T, HARDOCK S, RIEGGER C, et al. Noftl-kv: Tackling write-amplification on kv-stores with native storage management [C]//Extending Database Technology. 2018: 457-460. |
17 | DAYAN N, BONNET P, IDREOS S. GeckoFTL: Scalable flash translation techniques for very large flash devices [C]//Proceedings of the 2016 International Conference on Management of Data. 2016: 327-342. |
18 |
COMER D. Ubiquitous B-tree. ACM Computing Surveys (CSUR), 1979, 11 (2): 121- 137.
doi: 10.1145/356770.356776 |
19 |
O’NEIL P, CHENG E, GAWLICK D, et al. The log-structured merge-tree (LSM-tree). Acta Informatica, 1996, 33 (4): 351- 385.
doi: 10.1007/s002360050048 |
20 | CHANG F, DEAN J, GHEMAWAT S, et al. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems, 2008, 26 (2): 1- 26. |
21 | Google. Leveldb[EB/OL]. [2021-08-23]. https://leveldb.org/. |
22 | Facebook. Rocksdb[EB/OL]. [2021-08-23]. https://rocksdb.org/. |
23 | Apache. Cassandra[EB/OL]. [2021-08-23]. http://cassandra.apache.org/. |
24 | HBase[EB/OL]. [2021-08-23]. https://hbase.apache.org/ |
25 | ALSUBAIEE S, BEHM A, BORKAR V, et al. Storage management in AsterixDB[J]. Proceedings of the VLDB Endowment, 2014, 7(10): 841-852. |
26 | BEAVER D, KUMAR S, LI H C, et al. Finding a needle in haystack: Facebook's photo storage [J]. OSDI, 2010(10): 1-8. |
27 | MATSUNOBU Y, DONG S, LEE H. MyRocks: LSM-tree database storage engine serving Facebook's social graph[J]. Proceedings of the VLDB Endowment, 2020, 13(12): 3217-3230. |
28 | HUANG G, CHENG X, WANG J, et al. X-Engine: An optimized storage engine for large-scale E-commerce transaction processing [C]//Proceedings of the 2019 International Conference on Management of Data. 2019: 651-665. |
29 | CAO W, LIU Z, WANG P, et al. PolarFS: An ultra-low latency and failure resilient distributed file system for shared storage cloud database[J]. Proceedings of the VLDB Endowment, 2018, 11(12): 1849-1862. |
30 |
BLOOM B H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 1970, 13 (7): 422- 426.
doi: 10.1145/362686.362692 |
31 | KIRSCH A, MITZENMACHER M. Less hashing, same performance: Building a better Bloom filter. Random Structures & Algorithms, 2008, 33 (2): 187- 218. |
32 | DAYAN N, ATHANASSOULIS M, IDREOS S. Optimal bloom filters and adaptive merging for LSM-trees. ACM Transactions on Database Systems, 2018, 43 (4): 1- 48. |
33 | DAYAN N, ATHANASSOULIS M, IDREOS S. Monkey: Optimal navigable key-value store[C]//Proceedings of the 2017 ACM International Conference on Management of Data. 2017: 79-94. |
34 | LI Y, TIAN C, GUO F, et al. Elasticbf: Elastic bloom filter with hotness awareness for boosting read performance in large key-value stores [C]//2019 USENIX Annual Technical Conference (USENIX ATC’19). 2019: 739-752. |
35 | ZHANG H, LIM H, LEIS V, et al. Surf: Practical range query filtering with fast succinct tries [C]//Proceedings of the 2018 International Conference on Management of Data. 2018: 323-336. |
36 | LUO S, CHATTERJEE S, KETSETSIDIS R, et al. Rosetta: A robust space-time optimized range filter for key-value stores [C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2020: 2071-2086. |
37 | ZHONG W, CHEN C, WU X, et al. {REMIX}: Efficient Range Query for LSM-trees [C]//19th USENIX Conference on File and Storage Technologies (FAST’ 21). 2021: 51-64. |
38 |
O'NEIL E J, O'NEIL P E, WEIKUM G. The LRU-K page replacement algorithm for database disk buffering. Acm Sigmod Record, 1993, 22 (2): 297- 306.
doi: 10.1145/170036.170081 |
39 | TAILOR P, MORENA R D. A survey of database buffer cache management approaches. International Journal of Advanced Research in Computer Science, 2017, 8 (3): 409- 414. |
40 | AHMAD M Y, KEMME B. Compaction management in distributed key-value datastores[J]. Proceedings of the VLDB Endowment, 2015, 8(8): 850-861. |
41 | JAGADISH H V, NARAYAN P P S, SESHADRI S, et al. Incremental organization for data recording and warehousing [J]. VLDB, 1997, 97: 16-25. |
42 | YANG L, WU H, ZHANG T, et al. Leaper: A learned prefetcher for cache invalidation in LSM-tree based storage engines[J]. Proceedings of the VLDB Endowment, 2020, 13(12): 1976-1989. |
43 | TENG D, GUO L, LEE R, et al. LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes [C]//2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 2017: 68-79. |
44 | WU F, YANG M H, ZHANG B, et al. AC-Key: Adaptive caching for LSM-based key-value Stores [C]//2020 USENIX Annual Technical Conference (USENIX ATC’ 20). 2020: 603-615. |
45 | LI X, XU G, FAN H, et al. Improving read performance of LSM-tree based KV stores via dual grained caches [C]//2019 IEEE 21st International Conference on High Performance Computing and Communications, IEEE 17th International Conference on Smart City, IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). 2019: 841-848. |
[1] | Yang YU, Huiqi HU, Xuan ZHOU. Optimization of LSM-tree storage systems based on non-volatile memory [J]. Journal of East China Normal University(Natural Science), 2021, 2021(5): 37-47. |
[2] | Tida CHEN, Dan CUI, Yuxin YUAN, Jiamin LIU, Minsheng HUANG, Ying Li. Changes in water quality and land use structure in the green-belt area around Shanghai [J]. Journal of East China Normal University(Natural Science), 2021, 2021(4): 81-89. |
[3] | YANG Wencan, HU Huiqi, DUAN Huichao, HU Yaoyi, QIAN Weining. CedarAdvisor: A load-adaptive automatic indexing recommendation tool [J]. Journal of East China Normal University(Natural Science), 2020, 2020(6): 52-62. |
[4] | WEI Xiaoxian, LIU Wenxin, CAI Peng. Global transaction log of a multi-master cloud database [J]. Journal of East China Normal University(Natural Science), 2020, 2020(5): 10-20. |
[5] | JIANG Yunfang, HAN Xuemei, SHI Tiemao, SONG Danran. A microclimatic impact analysis on multi-dimensional indicators of urban road fabric: Empirical research on Shanghai [J]. Journal of East China Normal University(Natural Science), 2020, 2020(3): 129-147. |
[6] | DAI Lei. Property of the consistent Fredholm index and property (ω1) [J]. Journal of East China Normal University(Natural Science), 2020, 2020(2): 1-7. |
[7] | WANG Jingjing, LU Yanqiong. Optimal conditions for the existence of positive solutions to periodic boundary value problems with second order difference equations [J]. Journal of East China Normal University(Natural Science), 2020, 2020(2): 41-49. |
[8] | GU Jinghua, GE Zhenpeng, WANG Jie. Review of estuarine and coastal environmental monitoring technologies [J]. Journal of East China Normal University(Natural Science), 2020, 2020(1): 159-170. |
[9] | LIU Zi-hao, HU Hui-qi, XU Rui, ZHOU Xuan. Implementation of LevelDB-based secondary index on two-dimensional data [J]. Journal of East China Normal University(Natural Sc, 2019, 2019(5): 159-167. |
[10] | WU Zhen, CHEN Rui-shan. A comparative analysis of the Ocean Health Index and the Pressure-State-Response evaluation methods for Shanghai's ocean ecosystem health [J]. Journal of East China Normal University(Natural Sc, 2019, 2019(4): 174-187. |
[11] | WANG Huan, ZHU Min. Vital nodes identification by the hybrid K-shell method based on vertex strength [J]. Journal of East China Normal University(Natural Sc, 2019, 2019(3): 101-109. |
[12] | WANG Fang, XU Xiao-chen, WU Wen-hui. Effect of egg yolk immunoglobulin on an experimental periodontitis model induced by Porphyromonas gingivalis [J]. Journal of East China Normal University(Natural Sc, 2019, 2019(3): 131-137. |
[13] | ZHANG Ting-hui, HUANG Min-sheng, MA Ming-hai, ZHANG Wen, CUI He. Analysis and assessment on monthly dynamics of water quality at Taopu industrial zone in Shanghai [J]. Journal of East China Normal University(Natural Sc, 2017, 2017(6): 147-155. |
[14] | WANG Yu-chen, GUO Zhong-yang, WANG Yuan-yuan. A forecasting model of crime risk based on random forest [J]. Journal of East China Normal University(Natural Sc, 2017, (4): 89-96. |
[15] | LONG Fei, WENG Hai-xing, GAO Ming, ZHANG Zhao. Distributed secondary index based on LSM Tree [J]. Journal of East China Normal University(Natural Sc, 2016, 2016(5): 36-44. |
Viewed | ||||||
Full text |
|
|||||
Abstract |
|
|||||