Article

Implementation of database schema design in distributed environment

  • PANG Tian-Ze
  • ZHANG Chen-Dong
  • GAO Ming
  • GONG Xue-Qing
  • Software Engineering Institute, East China Normal University, Shanghai 200062, China

Online published: 2014-11-27

Abstract

In recent years, data volumes have grown exponentially, and a centralized database is hard to scale up to meet such massive business requirements. A distributed database (DDB) is an alternative that scales to large applications by distributing data across multiple server nodes. Many enterprises have already deployed distributed databases successfully, such as Google Spanner and TaoBao OceanBase. In traditional database design theory, the normal forms reduce operation anomalies and data redundancy and help ensure data integrity. However, a schema that strictly follows the normal forms makes a distributed database system inefficient, because queries then require a large number of distributed relational operations. Denormalization can significantly improve query efficiency by reducing the number of relations and, with it, the number of distributed relational operations. OceanBase, a distributed database implemented by TaoBao, delivers high performance for OLTP rather than OLAP. In this paper, we show how to use denormalization to design a schema for OceanBase and improve its OLAP performance. Finally, we demonstrate the efficiency and effectiveness of the denormalized design for OceanBase in an empirical study using the TPC-H benchmark.
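
The denormalization idea summarized above can be illustrated with a minimal sketch. It is not the schema used in the paper: it assumes two heavily simplified TPC-H-style tables (`orders`, `lineitem`), a hypothetical pre-joined table named `lineitem_wide`, and an in-memory SQLite database standing in for OceanBase. It only shows that an aggregate needing a join on the normalized schema can be answered from a single relation once the join result is stored, at the cost of redundant `o_orderdate` values.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with normalized and denormalized tables.

    Table and column names follow TPC-H naming, but the columns are a small
    illustrative subset, and lineitem_wide is a hypothetical name.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Normalized schema: order attributes live only in orders.
        CREATE TABLE orders (
            o_orderkey  INTEGER PRIMARY KEY,
            o_orderdate TEXT
        );
        CREATE TABLE lineitem (
            l_orderkey      INTEGER,
            l_linenumber    INTEGER,
            l_extendedprice REAL,
            PRIMARY KEY (l_orderkey, l_linenumber)
        );
        INSERT INTO orders VALUES (1, '1995-01-10'), (2, '1996-03-05');
        INSERT INTO lineitem VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 70.0);

        -- Denormalized schema: the join is materialized once, so each
        -- line item row redundantly carries its order date.
        CREATE TABLE lineitem_wide AS
            SELECT l.l_orderkey, l.l_linenumber, l.l_extendedprice,
                   o.o_orderdate
            FROM lineitem AS l
            JOIN orders AS o ON o.o_orderkey = l.l_orderkey;
        """
    )
    return conn


# Revenue per year on the normalized schema: needs a join, which in a
# distributed system may involve rows stored on different nodes.
NORMALIZED_QUERY = """
    SELECT substr(o.o_orderdate, 1, 4) AS yr, SUM(l.l_extendedprice)
    FROM lineitem AS l
    JOIN orders AS o ON o.o_orderkey = l.l_orderkey
    GROUP BY yr
    ORDER BY yr
"""

# The same question on the denormalized table: a single-relation scan.
DENORMALIZED_QUERY = """
    SELECT substr(o_orderdate, 1, 4) AS yr, SUM(l_extendedprice)
    FROM lineitem_wide
    GROUP BY yr
    ORDER BY yr
"""


def revenue_by_year(conn, query):
    """Run one of the two queries and return (year, revenue) rows."""
    return conn.execute(query).fetchall()
```

Calling `revenue_by_year(build_demo(), NORMALIZED_QUERY)` and the same call with `DENORMALIZED_QUERY` should return identical rows. The trade-off is the usual one for denormalization: fewer relations to combine at query time, in exchange for redundant storage and extra work to keep the copied attributes consistent on update.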

Cite this article

PANG Tian-Ze, ZHANG Chen-Dong, GAO Ming, GONG Xue-Qing. Implementation of database schema design in distributed environment[J]. Journal of East China Normal University (Natural Science), 2014, 2014(5): 290-300. DOI: 10.3969/j.issn.10005641.2014.05.026
