Journal of East China Normal University (Natural Science) ›› 2015, Vol. 2015 ›› Issue (5): 172-. doi: 10.3969/j.issn.1000-5641.2015.05.015

• LBS Systems and Applications •

High availability implementation based on the Raft consensus protocol

ZHANG Chen-dong, GUO Jin-wei, LIU Bo-zhong, CHU Jia-jia, ZHOU Min-qi, QIAN Wei-ning

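  1. Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2015-09-16 Online: 2015-09-25 Published: 2015-10-08
  • Corresponding author: ZHOU Min-qi, male, associate professor, master's supervisor; research interest: in-memory databases. E-mail: mqzhou@sei.ecnu.edu.cn
  • About the author: ZHANG Chen-dong, male, master's student; research interest: distributed databases. E-mail: 51131500048@ecnu.cn
  • Supported by:

    Key Program of the National Natural Science Foundation of China (61332006); National High Technology Research and Development Program of China (863 Program) (2015AA015307)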

High availability implementation based on Raft

ZHANG Chen-dong, GUO Jin-wei, LIU Bo-zhong, CHU Jia-jia, ZHOU Min-qi, QIAN Wei-ning

  • Received: 2015-09-16 Online: 2015-09-25 Published: 2015-10-08

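Abstract: With the rapid development of the Internet and the arrival of the big data era, the limitations of traditional databases have gradually become apparent, and distributed database systems that support massive data storage and highly concurrent access are becoming increasingly popular. Against this background, Alibaba Group developed OceanBase, a distributed database system suited to massive data storage, which offers two deployment modes: single cluster and multiple clusters. However, availability in the multi-cluster deployment mode is low and cannot meet the needs of critical applications: automatic switching between the primary and standby clusters is not supported when a failure occurs, and strong log synchronization between the primary and standby clusters cannot be guaranteed. To address these problems, this paper analyzes the high availability solutions of traditional databases and, taking the characteristics of the OceanBase architecture into account and drawing on the ideas of the Raft algorithm, designs and implements a timestamp-based distributed election module, an automated cluster switching module, and a strong log synchronization module based on a QUORUM policy. Experiments verify that these modules improve the overall availability of the system.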

Keywords: distributed database, high availability, Raft consensus protocol, log synchronization

Abstract: With the rapid development of the Internet and the arrival of the Big Data era, the limitations of traditional databases have become increasingly apparent, and distributed database systems that support massive data storage and highly concurrent access have become more and more popular. Alibaba Group developed OceanBase, a distributed database system suitable for massive data storage, which supports two deployment modes: single cluster and multiple clusters. However, the availability of the multi-cluster mode is low and cannot satisfy the requirements of some critical applications: it does not support automatic switching between the master cluster and the slave cluster when a failure occurs, and inconsistent logs can be generated during switching. To address these problems, we analyze the high availability solutions of traditional databases and, taking the characteristics of the OceanBase architecture into account and drawing on the ideas of Raft, design and implement a distributed election module based on log timestamps, an automatic cluster switching module, and a strongly synchronized log module based on QUORUM. The experimental results show that these modules can improve the availability of the whole system.

CLC number:
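The abstract names two mechanisms without detailing them: election based on log timestamps and QUORUM-based strong log synchronization. The Python sketch below illustrates only the general idea and is not the authors' implementation. The `Node` class, its fields, and all function names are hypothetical. Note also that standard Raft compares log terms and indexes rather than timestamps, and elects a leader through voting with randomized timeouts, which this sketch omits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical cluster member; field names are illustrative assumptions."""
    name: str
    last_log_timestamp: int  # timestamp of the newest log entry the node holds
    alive: bool = True


def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def is_committed(ack_count: int, cluster_size: int) -> bool:
    """Treat a log entry as durable once a majority has acknowledged it."""
    return ack_count >= quorum_size(cluster_size)


def elect_leader(nodes: List[Node]) -> Optional[Node]:
    """Pick the reachable node with the newest log, if a majority is reachable.

    Ties on the timestamp are broken by name so the result is deterministic.
    """
    reachable = [n for n in nodes if n.alive]
    if len(reachable) < quorum_size(len(nodes)):
        return None  # no majority reachable, so no election can succeed
    return max(reachable, key=lambda n: (n.last_log_timestamp, n.name))


if __name__ == "__main__":
    cluster = [Node("a", 100), Node("b", 120, alive=False), Node("c", 110)]
    leader = elect_leader(cluster)
    print(leader.name if leader else "no leader")
    print(is_committed(ack_count=2, cluster_size=3))
```

In this sketch, choosing the reachable node with the newest log keeps every majority-acknowledged entry on the new leader, because any reachable majority overlaps the majority that acknowledged the entry. This holds only if timestamps order log entries consistently across nodes.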