Journal of East China Normal University (Natural Science), 2023, Vol. 2023, Issue (5): 164-181. doi: 10.3969/j.issn.1000-5641.2023.05.014

• Data Analysis •

A multimode data management method based on Data Fabric

Xinjun ZHENG1, Guoliang TIAN2,*, Feihu HUANG3,*

  1. School of Cyber Science and Engineering, Nanjing University of Science and Technology, Jiangyin, Jiangsu 214414, China
  2. China Mobile Group Jiangsu Co., Ltd., Nanjing 210029, China
  3. Beijing Boland Software Corporation, Beijing 100021, China
  • Received: 2023-07-08 Accepted: 2023-07-08 Online: 2023-09-25 Published: 2023-09-20
  • Contact: Guoliang TIAN, Feihu HUANG E-mail: 15322160@qq.com; 270636275@qq.com

Abstract:

As governments and enterprises move from informatization to digitization, the data generated by their application systems are becoming increasingly multimode, multisource, and massive, which poses new challenges for data management. Many new technologies and concepts have emerged in the data management field to address these challenges. Data Fabric is one such emerging data management technology and method: it integrates distributed data storage, processing, and applications into a unified whole and provides a set of visual interfaces for managing them. We first analyze the technical architecture, characteristics, and value of Data Fabric, together with the complete process by which it manages and applies multimode data. We then propose two anomaly monitoring methods for multimode, multisource data, one based on time-series indicators and one based on log data. With Data Fabric, their processing speed improves by 33.3% and 42.2%, and their F1-score improves by 12.2 and 14.8 percentage points, respectively. These results further demonstrate the efficiency and practical value of Data Fabric technology and of the proposed methods.

Key words: Data Fabric, multimode data management, data virtualization

CLC Number: