Computer Science

Time series database query optimization for anomaly detection

  • Shuai ZHANG ,
  • Huiqi HU ,
  • Yaoqiang XU ,
  • Xuan ZHOU
  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
    2. East China Branch of State Grid Corporation of China, Shanghai 200120, China

Received date: 2022-01-17

  Online published: 2023-03-23

Cite this article

ZHANG Shuai, HU Huiqi, XU Yaoqiang, ZHOU Xuan. Time series database query optimization for anomaly detection[J]. Journal of East China Normal University (Natural Science), 2023, 2023(2): 119-131. DOI: 10.3969/j.issn.1000-5641.2023.02.013

Abstract

With the development of the Internet of Things, large numbers of sensor devices are being connected to networks, and detecting anomalies in the data these devices generate bears directly on the stability of system services. A time series database is a database system optimized for time series data. As an important component of a monitoring system, it is responsible for managing and querying that data. Current time series databases, however, exhibit high latency and do not fully utilize system computing resources when processing queries over data from multiple data sources. To address these shortcomings, we redesigned the query execution model of a time series database on the basis of InfluxDB and propose InfluxDB-PP (parallel processing). Experimental results show that, in real-time anomaly query scenarios, InfluxDB-PP reduces query latency by about 85.7% compared with InfluxDB.
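The abstract does not describe how InfluxDB-PP executes queries internally, so the following is only a minimal sketch of the general idea of processing per-source queries in parallel, not the authors' design. The in-memory `SERIES` data, the threshold-based anomaly rule, and all function names are assumptions made for illustration. The last helper converts a fractional latency reduction into the equivalent speedup factor.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for per-source time series (not InfluxDB data).
SERIES = {
    "sensor_a": [20.1, 20.3, 95.0, 20.2],
    "sensor_b": [5.0, 5.1, 5.2, 5.0],
    "sensor_c": [1.0, -40.0, 1.1, 1.2],
}


def query_anomalies(source, low=0.0, high=50.0):
    """Return (index, value) pairs outside [low, high] for one data source."""
    return [(i, v) for i, v in enumerate(SERIES[source]) if not (low <= v <= high)]


def query_sequential(sources):
    """Baseline: evaluate the query for one source after another."""
    return {s: query_anomalies(s) for s in sources}


def query_parallel(sources, workers=4):
    """Fan the per-source queries out to a thread pool and gather the results."""
    sources = list(sources)  # materialize so the sequence can be iterated twice
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(sources, pool.map(query_anomalies, sources)))


def speedup_from_reduction(reduction):
    """Convert a fractional latency reduction into a speedup factor."""
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    names = ["sensor_a", "sensor_b", "sensor_c"]
    print(query_parallel(names) == query_sequential(names))
    print(round(speedup_from_reduction(0.857), 2))
```

Under this reading, the reported reduction of about 85.7% corresponds to queries completing roughly seven times faster, since 1 / (1 - 0.857) is approximately 6.99. In CPython, a thread pool mainly helps when the per-source work is I/O-bound; CPU-bound work would need processes or a runtime with true thread parallelism.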
