OceanBase is a distributed relational database designed to store massive, rapidly growing volumes of structured data and to deliver a highly available, highly scalable, and cost-effective service on clusters of inexpensive servers. OceanBase uses a hybrid of in-memory and external storage: incremental (newly written) data is kept in memory, while baseline (read-only) data is kept in external storage, usually disk. The baseline data is divided into slices of roughly equal size, called tablets, which are organized as a distributed B+ tree and spread across many data servers. A scheduled daily merge mechanism continually folds the incremental data into the baseline data. This article describes the basic structure and distribution of OceanBase baseline data storage and the daily merge mechanism, and then presents the design and implementation of the concrete storage format used for baseline data in OceanBase.
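The storage model summarized above can be illustrated with a minimal single-process sketch. It is not OceanBase's implementation: the names `split_into_tablets`, `find_tablet`, `read`, `merge`, `TABLET_SIZE`, and `DELETED` are hypothetical, tablets are plain Python lists rather than on-disk files, and a sorted list of tablet end keys stands in for the distributed B+ tree index.

```python
from bisect import bisect_left

TABLET_SIZE = 3      # hypothetical target number of rows per tablet
DELETED = object()   # hypothetical tombstone marking a deleted row


def split_into_tablets(rows, tablet_size=TABLET_SIZE):
    """Split key-sorted (key, value) rows into roughly equal-sized tablets."""
    return [rows[i:i + tablet_size] for i in range(0, len(rows), tablet_size)]


def find_tablet(tablets, key):
    """Find the tablet whose key range may hold `key`.

    The sorted list of tablet end keys plays the role of the index level.
    """
    end_keys = [tablet[-1][0] for tablet in tablets]
    idx = bisect_left(end_keys, key)
    return tablets[idx] if idx < len(tablets) else None


def read(tablets, memtable, key):
    """Read one row; in-memory incremental data overrides the baseline."""
    if key in memtable:
        value = memtable[key]
        return None if value is DELETED else value
    tablet = find_tablet(tablets, key)
    if tablet is None:
        return None
    keys = [k for k, _ in tablet]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return tablet[i][1]
    return None


def merge(tablets, memtable, tablet_size=TABLET_SIZE):
    """Fold incremental data into the baseline and re-split it into tablets.

    Returns the new baseline and an empty memtable.
    """
    merged = {k: v for tablet in tablets for k, v in tablet}
    merged.update(memtable)
    rows = sorted(
        ((k, v) for k, v in merged.items() if v is not DELETED),
        key=lambda kv: kv[0],
    )
    return split_into_tablets(rows, tablet_size), {}
```

In this sketch a read consults the in-memory data first and falls back to the baseline tablet covering the key, and `merge` plays the role of the daily merge by producing a fresh read-only baseline and clearing the incremental data.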