Journal of East China Normal University(Natural Sc

Fault-tolerance in distributed in-memory database systems

ZHAO Zhen-hui, HUANG Cheng-shen, ZHOU Min-qi, ZHOU Ao-ying   

  1. Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received:2016-06-27 Online:2016-09-25 Published:2016-11-29

Abstract:

In the big data era, distributed systems have been widely deployed and applied in various fields. However, the more nodes a system involves, the higher the probability that a failure occurs. It is therefore important to introduce fault-tolerance mechanisms into distributed systems so that they can sustain high performance, reliability, and availability. CLAIMS is an in-memory database system for real-time data analysis, used mainly in financial applications; it supports near-real-time query and analytic tasks. This paper discusses the fault-tolerance mechanism in CLAIMS, which comprises three parts: lease-based fast detection of system failures (fail-fast); restart of the affected analytic tasks once a failure is detected (fail-over); and recovery of the in-memory state of the failed node. Experiments indicate that the algorithm presented in this paper achieves fault tolerance in CLAIMS.

Key words: distributed in-memory database, fault-tolerance, lease
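
The lease-based fail-fast and fail-over steps named in the abstract can be illustrated with a minimal Python sketch. It is not the CLAIMS implementation: the class LeaseTable, the function fail_over, the lease length, and the task bookkeeping are all hypothetical names and assumptions introduced here for illustration. The idea shown is only the general lease technique: a node counts as alive while its lease is unexpired, and tasks on a node whose lease has lapsed are restarted.

```python
import time
from typing import Callable, Dict, List


class LeaseTable:
    """Hypothetical master-side table: a node is alive only while its lease is unexpired."""

    def __init__(self, lease_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock                      # injectable so the sketch is testable
        self.expiry: Dict[str, float] = {}      # node id -> time at which its lease lapses

    def renew(self, node_id: str) -> None:
        """Called when a heartbeat arrives; grants or extends the node's lease."""
        self.expiry[node_id] = self.clock() + self.lease_seconds

    def expired_nodes(self) -> List[str]:
        """Fail-fast: report every node whose lease has lapsed."""
        now = self.clock()
        return sorted(n for n, t in self.expiry.items() if t <= now)


def fail_over(table: LeaseTable,
              tasks_by_node: Dict[str, List[str]],
              restart: Callable[[str], None]) -> List[str]:
    """Fail-over: restart the tasks that were running on nodes with lapsed leases."""
    restarted: List[str] = []
    for node in table.expired_nodes():
        for task in tasks_by_node.pop(node, []):
            restart(task)                       # assumed callback that resubmits the task
            restarted.append(task)
        del table.expiry[node]                  # stop tracking the node until it rejoins
    return restarted
```

In this sketch a shorter lease detects failures sooner at the cost of more frequent renewals; how CLAIMS chooses the lease length and recovers in-memory state is described in the paper itself, not here.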