Journal of East China Normal University (Natural Science) ›› 2023, Vol. 2023 ›› Issue (2): 119-131. doi: 10.3969/j.issn.1000-5641.2023.02.013

• Computer Science •

Time series database query optimization for anomaly detection

Shuai ZHANG1, Huiqi HU1,*, Yaoqiang XU2, Xuan ZHOU1

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  2. East China Branch of State Grid Corporation of China, Shanghai 200120, China
  • Received: 2022-01-17 Online: 2023-03-25 Published: 2023-03-23
  • Contact: Huiqi HU E-mail: hqhu@dase.ecnu.edu.cn

Abstract:

With the development of the Internet of Things, large numbers of sensor devices are being connected to networks, and detecting anomalies in the data these devices generate bears directly on the stability of system services. A time series database is a database system optimized for time series data. As an important component of a monitoring system, it is responsible for managing and querying that data. Current time series databases, however, suffer from high latency and do not fully utilize system computing resources when querying data from multiple data sources. To address these drawbacks, we redesigned the query execution model of a time series database on top of InfluxDB and propose InfluxDB-PP (parallel processing). Experimental results show that, in real-time anomaly data query scenarios, InfluxDB-PP reduces query latency by about 85.7% compared with InfluxDB.

Key words: monitoring system, time series database, multi-source data query optimization
