Journal of East China Normal University (Natural Science) ›› 2023, Vol. 2023 ›› Issue (5): 40-50. doi: 10.3969/j.issn.1000-5641.2023.05.004

• Database Systems •

Separate management strategies for Part metadata under the storage-computing separation architecture

Danqi LIU, Peng CAI*

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2023-06-30 Accepted: 2023-07-24 Online: 2023-09-25 Published: 2023-09-15
  • Contact: Peng CAI E-mail: pcai@dase.ecnu.edu.cn

Abstract:

To address the deficiencies of ClickHouse, including underutilized hardware resources, limited flexibility, and slow node startup, this paper proposes metadata management strategies under the storage-computing separation architecture. The strategies focus on Part metadata, which describe the stored data and are the most crucial component of the metadata. To manage data on remote shared storage effectively, all Part metadata files are collected and merged. The merged metadata are mapped to key-value pairs and, via serialization and deserialization, stored in a distributed key-value database. Furthermore, a synchronization strategy is designed to keep the data on remote shared storage consistent with the metadata in the distributed key-value database. Based on these strategies, a management system for Part metadata was developed; it effectively addresses the slow node startup issue in ClickHouse and supports efficient dynamic scaling of nodes.
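The pipeline described in the abstract (merge the metadata files of a Part, map them to a key, serialize, store, and keep the store in step with shared storage) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: a plain Python `dict` stands in for the distributed key-value database, JSON with base64 stands in for the serialization format, and the key layout and the names `make_key`, `serialize_part`, `deserialize_part`, and `sync` are hypothetical.

```python
import base64
import json


def make_key(table: str, part_name: str) -> str:
    # Hypothetical key layout: one key per Part, grouped by table.
    return f"part_meta/{table}/{part_name}"


def serialize_part(files: dict[str, bytes]) -> bytes:
    # Merge all metadata files of one Part into a single value.
    payload = {
        name: base64.b64encode(data).decode("ascii")
        for name, data in sorted(files.items())
    }
    return json.dumps(payload).encode("utf-8")


def deserialize_part(blob: bytes) -> dict[str, bytes]:
    # Inverse of serialize_part: recover the per-file contents.
    payload = json.loads(blob.decode("utf-8"))
    return {name: base64.b64decode(text) for name, text in payload.items()}


def sync(kv: dict[str, bytes], table: str,
         storage_parts: dict[str, dict[str, bytes]]) -> None:
    # Assumed synchronization rule: shared storage is the source of truth.
    wanted = {
        make_key(table, part): serialize_part(files)
        for part, files in storage_parts.items()
    }
    prefix = make_key(table, "")
    # Drop entries whose Part no longer exists on shared storage.
    for key in [k for k in kv if k.startswith(prefix) and k not in wanted]:
        del kv[key]
    # Insert or refresh entries for the Parts that do exist.
    kv.update(wanted)
```

Under these assumptions, a starting node would issue one lookup per Part against the key-value store instead of opening many small metadata files on remote storage, which is the kind of saving the abstract attributes to the approach.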

Key words: database systems, storage-computing separation architecture, Part metadata management
