Journal of East China Normal University (Natural Science), 2021, Vol. 2021, Issue (5): 157-168. doi: 10.3969/j.issn.1000-5641.2021.05.014

• Data Analysis and Applications •

Query processing of large-scale product knowledge in a CPU-GPU heterogeneous environment

Chuangxin FANG, Hao SONG, Yuming LIN*, Ya ZHOU

  1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin Guangxi 541004, China
  • Received: 2021-08-07 Online: 2021-09-25 Published: 2021-09-28
  • Contact: Yuming LIN E-mail: innofang@163.com; ymlin@guet.edu.cn

Abstract:

Knowledge graphs are an effective way to represent and organize unstructured knowledge in a structured form, and they are commonly used to support intelligent applications. However, product-related knowledge is typically massive in scale, heterogeneous, and hierarchical; these characteristics pose challenges for traditional knowledge query processing methods based on relational and graph models. In this paper, we address these challenges by designing and implementing a product knowledge query processing method based on collaborative CPU-GPU computing. First, to fully exploit the parallel computing capability of the GPU, we propose a product knowledge storage strategy based on sparse matrices and optimize it for the scale of the task. Second, based on this sparse matrix storage structure, we design a query conversion method that transforms a SPARQL query into the corresponding matrix computations, and we extend the join query algorithm to the GPU for acceleration. To verify the effectiveness of the proposed method, we conducted a series of experiments on the LUBM dataset and a semisynthetic product dataset. The experimental results show that, compared with existing RDF query engines, the proposed method not only improves retrieval efficiency on large-scale product knowledge datasets but also achieves better retrieval performance on a general-purpose standard RDF dataset.
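
To make the idea of converting a SPARQL query into matrix computations concrete, the following is a minimal CPU-only sketch, not the authors' implementation. It assumes one sparse Boolean matrix per predicate (stored here as a mapping from row id to a set of column ids) and evaluates a two-pattern join such as `?product :hasBrand ?b . ?b :locatedIn ?country` as a Boolean sparse matrix product. The function names, the dictionary encoding of terms, and the sample triples are all hypothetical; the paper's actual storage layout and GPU join kernels are not described in the abstract.

```python
from collections import defaultdict


def build_predicate_matrices(triples):
    """Return (matrices, ids): one sparse Boolean matrix per predicate.

    Each matrix is stored as {row_id: set(col_ids)}; a stored entry (s, o)
    means that the triple (s, predicate, o) exists in the data.
    """
    ids = {}  # term -> integer id (hypothetical dictionary encoding)
    matrices = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        si = ids.setdefault(s, len(ids))
        oi = ids.setdefault(o, len(ids))
        matrices[p][si].add(oi)
    return matrices, ids


def join_chain(m1, m2):
    """Boolean sparse product: all (x, z) such that m1[x][y] and m2[y][z] for some y."""
    result = set()
    for x, ys in m1.items():
        for y in ys:
            for z in m2.get(y, ()):
                result.add((x, z))
    return result


# Hypothetical product triples, for illustration only.
triples = [
    ("phoneA", "hasBrand", "BrandX"),
    ("phoneB", "hasBrand", "BrandY"),
    ("BrandX", "locatedIn", "China"),
    ("BrandY", "locatedIn", "Korea"),
    ("phoneA", "hasCategory", "Smartphone"),
]

matrices, ids = build_predicate_matrices(triples)
names = {i: term for term, i in ids.items()}

# SELECT ?product ?country WHERE { ?product :hasBrand ?b . ?b :locatedIn ?country }
pairs = join_chain(matrices["hasBrand"], matrices["locatedIn"])
answers = sorted((names[x], names[z]) for x, z in pairs)
print(answers)
```

In this sketch each row of the first matrix can be processed independently of the others, which is the kind of data parallelism a GPU implementation could exploit; how the paper partitions work between CPU and GPU is not stated in the abstract.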

Key words: product knowledge, heterogeneous environment, RDF data, query processing

CLC Number: