Journal of East China Normal University(Natural Sc ›› 2014, Vol. 2014 ›› Issue (5): 290-300.doi: 10.3969/j.issn.10005641.2014.05.026

• Article

Implementation of database schema design in distributed environment

PANG Tian-Ze, ZHANG Chen-Dong, GAO Ming, GONG Xue-Qing

  1. Software Engineering Institute, East China Normal University, Shanghai 200062, China
  • Online: 2014-09-25 Published: 2014-11-27

Abstract: Data volumes have grown exponentially in recent years, and a centralized database is difficult to scale up to meet such massive business requirements. A distributed database (DDB) is an alternative that scales to large applications by distributing data across multiple server nodes. Many enterprises have successfully deployed distributed databases, such as Google Spanner and TaoBao OceanBase. In traditional database design theory, the normal forms reduce update anomalies and data redundancy and help ensure data integrity. However, a schema that strictly follows the normal forms makes a distributed database system inefficient, because queries then require a large number of distributed relational operations. Denormalization can significantly improve query efficiency by reducing the number of relations and, with it, the number of distributed relational operations. OceanBase, a distributed database developed by TaoBao, delivers high performance for OLTP workloads but not for OLAP workloads. In this paper, we describe how to use denormalization to design schemas for OceanBase and thereby improve its OLAP performance. Finally, an empirical study using the TPC-H benchmark illustrates the efficiency and effectiveness of the denormalized design for OceanBase.
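
As a rough illustration of the idea in the abstract, the sketch below contrasts a normalized layout with a denormalized one for a simple aggregate query. It is not the schema design proposed in the paper: the tables are a heavily simplified subset of TPC-H (nation, customer, orders), the `orders_denorm` table and its `o_nation_name` column are hypothetical names chosen for this example, and single-node SQLite stands in for a distributed engine such as OceanBase. The point is only that the denormalized query returns the same answer without any join, which in a distributed setting would avoid cross-node relational operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized layout: the nation name is stored once and reached through two joins.
cur.executescript("""
CREATE TABLE nation   (n_nationkey INTEGER PRIMARY KEY, n_name TEXT);
CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_name TEXT, c_nationkey INTEGER);
CREATE TABLE orders   (o_orderkey INTEGER PRIMARY KEY, o_custkey INTEGER, o_totalprice REAL);

INSERT INTO nation   VALUES (1, 'CHINA'), (2, 'FRANCE');
INSERT INTO customer VALUES (10, 'alice', 1), (11, 'bob', 2), (12, 'carol', 1);
INSERT INTO orders   VALUES (100, 10, 50.0), (101, 11, 20.0),
                            (102, 12, 30.0), (103, 10, 5.0);
""")

# Revenue per nation on the normalized schema: two joins.
normalized_sql = """
SELECT n.n_name, SUM(o.o_totalprice)
FROM orders o
JOIN customer c ON c.c_custkey = o.o_custkey
JOIN nation   n ON n.n_nationkey = c.c_nationkey
GROUP BY n.n_name
ORDER BY n.n_name
"""
normalized_result = cur.execute(normalized_sql).fetchall()

# Denormalized layout (hypothetical): copy the nation name into every order row.
# This trades extra storage and update cost for join-free reads.
cur.executescript("""
CREATE TABLE orders_denorm AS
SELECT o.o_orderkey, o.o_custkey, o.o_totalprice, n.n_name AS o_nation_name
FROM orders o
JOIN customer c ON c.c_custkey = o.o_custkey
JOIN nation   n ON n.n_nationkey = c.c_nationkey;
""")

# The same aggregate on the denormalized table needs no join at all.
denormalized_sql = """
SELECT o_nation_name, SUM(o_totalprice)
FROM orders_denorm
GROUP BY o_nation_name
ORDER BY o_nation_name
"""
denormalized_result = cur.execute(denormalized_sql).fetchall()

print(normalized_result)
print(denormalized_result)
conn.close()
```

The cost of this design choice is redundancy: if a nation were renamed, every matching row in `orders_denorm` would have to be updated, which is the kind of anomaly the normal forms are meant to prevent.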

Key words: denormalization, distributed database, OceanBase, TPC-H
