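Chinese Core Journal in Comprehensive Science and Technology (Peking University Core Journals)
Source journal of the Chinese Science Citation Database (CSCD)
Indexed in Chemical Abstracts (CA), USA
Indexed in Mathematical Reviews (MR), USA
Indexed in the Abstracts Journal, Russia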
Articles on Data Management in this journal
Parallel join based on distributed system OceanBase
XU Shi-lei, WANG Lei, HU Hui-qi, QIAN Wei-ning, ZHOU Ao-ying
Journal of East China Normal University(Natural Sc 2017, 2017 (5): 1-10. DOI: 10.3969/j.issn.1000-5641.2017.05.001
With the rapid growth of application data and the continued development of distributed database systems, storing data on physically independent nodes has become the norm. Under this arrangement, complex join queries inevitably generate heavy network traffic, so improving join efficiency in distributed systems is an active research topic. Based on an analysis of the nested loop join, hash join, and semi-join in OceanBase, this paper proposes making fuller use of hardware resources by executing join operations in parallel with multiple threads. We run experiments on OceanBase with the nested loop join, hash join, and semi-join algorithms. The results confirm that, up to a certain number of threads, join efficiency increases with the degree of parallelism.
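The idea of running a join in parallel across threads can be illustrated with a minimal sketch. This is not OceanBase code and not the paper's implementation: the function names, the tuple-based row layout, and the hash partitioning scheme are all assumptions made for illustration. Both inputs are split by join key, and each worker thread builds and probes a hash table for its own partition.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, key_index, n_parts):
    """Split rows into n_parts lists by the hash of the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key_index]) % n_parts].append(row)
    return parts


def hash_join_partition(build_rows, probe_rows, build_key, probe_key):
    """Classic hash join on one partition: build a table, then probe it."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    return [b + p for p in probe_rows for b in table.get(p[probe_key], [])]


def parallel_hash_join(left, right, left_key=0, right_key=0, n_threads=4):
    """Join two lists of tuples, one partition per worker thread (hypothetical helper)."""
    left_parts = partition(left, left_key, n_threads)
    right_parts = partition(right, right_key, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = pool.map(
            hash_join_partition,
            left_parts,
            right_parts,
            [left_key] * n_threads,
            [right_key] * n_threads,
        )
        return [row for part in results for row in part]


left_rows = [(1, "a"), (2, "b"), (3, "c")]
right_rows = [(1, "x"), (1, "y"), (3, "z"), (4, "w")]
joined = parallel_hash_join(left_rows, right_rows)
```

Rows with equal keys always land in the same partition, so the partitions can be joined independently. In CPython the global interpreter lock limits the speedup of CPU-bound threads, so the sketch shows the structure of the approach rather than its performance.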
Distributed stream processing system for join operations
CHEN Ming-zhu, WANG Xiao-tong, FANG Jun-hua, ZHANG Rong
Journal of East China Normal University(Natural Sc 2017, 2017 (5): 11-19. DOI: 10.3969/j.issn.1000-5641.2017.05.002
Real-time stream processing systems play an increasingly important role in practical applications, and the stream join is one of the most important and expensive operations in big data analysis. However, skewed data distributions in real-world applications and inherent features of streaming data, such as unboundedness and unpredictability, put great pressure on join processing in distributed stream systems. Mainstream industrial stream systems offer little generality for join processing and provide no programming interface for it. Several academic prototype systems address the problem to some extent, but they either support only equi-joins or incur high resource usage and severe load imbalance. In this paper, after analyzing three typical distributed stream systems, we integrate Join-Matrix techniques into Storm and design and implement a general stream processing system that supports arbitrary theta joins. Experiments demonstrate that the proposed system outperforms state-of-the-art strategies.
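The join-matrix model behind this kind of system can be sketched as follows. This is a simplified batch simulation, not the Storm-based system from the paper: the function name, the round-robin routing, and the grid size are assumptions. The cross product of the two inputs is viewed as a matrix cut into a grid of cells, one per worker. Each R tuple is sent to every cell in one row, and each S tuple to every cell in one column, so every (r, s) pair meets in exactly one cell and any predicate can be checked there.

```python
def join_matrix_theta_join(r_stream, s_stream, predicate, n_rows=2, n_cols=2):
    """Evaluate an arbitrary theta join on an n_rows x n_cols grid of workers."""
    # cells[i][j] holds the R tuples and S tuples routed to worker (i, j).
    cells = [[([], []) for _ in range(n_cols)] for _ in range(n_rows)]

    # Each R tuple picks one row (round-robin here) and is replicated across it.
    for k, r in enumerate(r_stream):
        i = k % n_rows
        for j in range(n_cols):
            cells[i][j][0].append(r)

    # Each S tuple picks one column and is replicated down it.
    for k, s in enumerate(s_stream):
        j = k % n_cols
        for i in range(n_rows):
            cells[i][j][1].append(s)

    # Every worker joins only the tuples it received.
    out = []
    for i in range(n_rows):
        for j in range(n_cols):
            r_part, s_part = cells[i][j]
            out.extend((r, s) for r in r_part for s in s_part if predicate(r, s))
    return out


r_values = [1, 5, 8, 3]
s_values = [2, 4, 6, 7]
pairs = join_matrix_theta_join(r_values, s_values, lambda r, s: r < s)
```

Because routing does not depend on the join key, the load per cell is not affected by key skew, at the cost of replicating each tuple across a row or a column.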
Storage and load balancing for large-scale comment data on heterogeneous Redis cluster
ZHANG Jing-wei, DING Zhi-jun, YANG Qing, ZHANG Hui-bing, ZHANG Hai-tao, ZHOU Ya
Journal of East China Normal University(Natural Sc 2017, 2017 (5): 20-29. DOI: 10.3969/j.issn.1000-5641.2017.05.003
The storage and query performance of large-scale comment data strongly affects the applications built on that data. In a heterogeneous computing environment, nodes differ in storage and computing capacity, so a key challenge is to optimize storage and query performance for large-scale comment data by taking full advantage of each node's capabilities. Building on the capabilities of Redis Cluster, we first design a storage model for large-scale comment data in a homogeneous Redis cluster that balances storage across Redis slots. We then examine the relationship between the number of Redis slots and query efficiency, and design a method for heterogeneous Redis clusters that allocates storage according to the actual load of each computing node. The method makes full use of each node's performance and guides the assignment of slots to nodes by balancing query performance against storage load. Experimental results show that the proposed model balances storage load well and improves the query efficiency of the heterogeneous Redis cluster.
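One way to picture capacity-aware slot assignment is to divide the 16384 hash slots of a Redis Cluster among nodes in proportion to a per-node weight. The sketch below is an assumption-laden illustration, not the paper's method: the weights, the node names, and the largest-remainder rounding are hypothetical, and the paper derives its allocation from measured load rather than fixed weights.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def allocate_slots(weights, total_slots=TOTAL_SLOTS):
    """Map each node to a contiguous (first_slot, last_slot) range sized by its weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Ideal fractional share per node, then round down.
    exact = {node: total_slots * w / total_weight for node, w in weights.items()}
    counts = {node: int(share) for node, share in exact.items()}

    # Hand the slots lost to rounding to the largest fractional remainders.
    leftover = total_slots - sum(counts.values())
    by_remainder = sorted(weights, key=lambda n: exact[n] - counts[n], reverse=True)
    for node in by_remainder[:leftover]:
        counts[node] += 1

    # Turn counts into contiguous ranges; a zero count yields an empty range.
    ranges, start = {}, 0
    for node in weights:
        ranges[node] = (start, start + counts[node] - 1)
        start += counts[node]
    return ranges


# Hypothetical cluster in which node_b has twice the capacity of the others.
slot_ranges = allocate_slots({"node_a": 1, "node_b": 2, "node_c": 1})
```

A node with twice the weight receives twice as many slots, and therefore roughly twice the keys when keys are spread evenly over slots.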
Design and implementation of Smart materialization for column-store in CLAIMS
ZHANG Han, ZHOU Min-qi
Journal of East China Normal University(Natural Sc 2017, 2017 (5): 30-39. DOI: 10.3969/j.issn.1000-5641.2017.05.004
Materialization is a necessary step in query execution, and the materialization strategy and technique strongly influence how a query runs. It is therefore necessary to design a materialization strategy suited to column-store databases. To address the shortcomings of early materialization and late materialization, we propose a strategy named Smart materialization that differs from both. We first define a concept in the logical query plan, the projection: a structure used to select the desired attributes. The physical table is partitioned by column so that, from the start of a query, this structure reduces the amount of data loaded directly into memory and avoids additional overhead. In the logical query plan, projections are divided by column; the columns needed next are predicted from the correlation among the queries in a query set, and the required columns are placed in the most appropriate projection. We use the TPC-H data set to verify the strategy on the distributed in-memory database CLAIMS.
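The two baseline strategies that the abstract contrasts can be sketched on a toy column store. This illustrates only early and late materialization, not Smart materialization itself; the dictionary-of-lists table layout, the function names, and the sample data are assumptions.

```python
def early_materialization(columns, select_cols, filter_col, predicate):
    """Stitch all needed columns into rows first, then filter the rows."""
    needed = list(dict.fromkeys(select_cols + [filter_col]))  # de-duplicated, ordered
    n_rows = len(columns[filter_col])
    rows = [{c: columns[c][i] for c in needed} for i in range(n_rows)]
    return [
        tuple(row[c] for c in select_cols)
        for row in rows
        if predicate(row[filter_col])
    ]


def late_materialization(columns, select_cols, filter_col, predicate):
    """Filter one column to get positions, then fetch only the qualifying values."""
    positions = [i for i, v in enumerate(columns[filter_col]) if predicate(v)]
    return [tuple(columns[c][i] for c in select_cols) for i in positions]


# Hypothetical table stored column by column.
table = {
    "id": [1, 2, 3, 4],
    "price": [10, 25, 7, 40],
    "name": ["a", "b", "c", "d"],
}
early = early_materialization(table, ["id", "name"], "price", lambda p: p >= 20)
late = late_materialization(table, ["id", "name"], "price", lambda p: p >= 20)
```

Both functions return the same rows. Early materialization builds every row before filtering, while late materialization touches the output columns only at the qualifying positions.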
An outer join algorithm based on Cuckoo filter
YU Yang, ZHOU Min-qi, FANG Zhu-he
Journal of East China Normal University(Natural Sc 2017, 2017 (5): 40-51. DOI: 10.3969/j.issn.1000-5641.2017.05.005
In recent years, the development of the Internet has caused data volumes to grow rapidly. In the era of big data, the analysis efficiency of distributed database systems urgently needs to be improved, and the join operation is their main performance bottleneck. Joins are broadly divided into inner joins and outer joins, and outer joins are widely used in business scenarios. Distributed join algorithms involve a large amount of network transmission, which severely affects system performance. Although the literature contains studies on optimizing inner joins, those methods cannot be applied directly to outer joins. This paper proposes a distributed outer join algorithm based on the Cuckoo filter. By building a Cuckoo filter combined with a replication subdivision technique for data filtering and allocation, the algorithm reduces the amount of data transmitted and increases the degree of parallelism, which in turn improves query performance. We implement the algorithm in Ginkgo. Extensive experiments show that it substantially improves the efficiency of outer joins.
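The filtering step can be illustrated with a small single-process sketch. It is not the Ginkgo implementation and leaves out the replication subdivision technique; the class, the parameters (bucket count, bucket size, 8-bit fingerprints), and the helper names are assumptions. A Cuckoo filter is built over the right-side join keys. Left rows whose key the filter rejects cannot match, so they are null-padded locally, and only the remaining rows would need to be shipped for the join.

```python
import hashlib
import random


def _h(data: bytes) -> int:
    """Deterministic 64-bit hash."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class CuckooFilter:
    def __init__(self, n_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        assert n_buckets & (n_buckets - 1) == 0, "n_buckets must be a power of two"
        self.mask = n_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(n_buckets)]
        self.rng = random.Random(seed)

    def _fingerprint(self, key):
        return _h(b"fp:" + repr(key).encode()) % 255 + 1  # value in 1..255

    def _index(self, key):
        return _h(repr(key).encode()) & self.mask

    def _alt(self, index, fp):
        # XOR with the fingerprint hash, so applying it twice returns the original index.
        return (index ^ _h(bytes([fp]))) & self.mask

    def insert(self, key):
        fp = self._fingerprint(key)
        i1 = self._index(key)
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            # Evict a random fingerprint and move it to its alternate bucket.
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is considered full

    def contains(self, key):
        fp = self._fingerprint(key)
        i1 = self._index(key)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt(i1, fp)]


def left_outer_join_with_filter(left, right, key=0):
    """Left outer join; returns (result rows, number of left rows shipped)."""
    cf = CuckooFilter()
    for k in {row[key] for row in right}:
        if not cf.insert(k):
            raise RuntimeError("filter is full; use more buckets")
    local, shipped = [], []
    for row in left:
        (shipped if cf.contains(row[key]) else local).append(row)
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    out = [(l, None) for l in local]  # no possible match: pad locally
    for l in shipped:
        matches = index.get(l[key])
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))  # false positive of the filter
    return out, len(shipped)


left_rows = [(1, "a"), (2, "b"), (3, "c")]
right_rows = [(1, "x"), (3, "y"), (3, "z")]
result, n_shipped = left_outer_join_with_filter(left_rows, right_rows)
```

A false positive only causes a non-matching row to be shipped unnecessarily, so the join result stays correct either way.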