Journal of East China Normal University (Natural Science) ›› 2019, Vol. 2019 ›› Issue (5): 123-132. doi: 10.3969/j.issn.1000-5641.2019.05.010

• Computational Intelligence in Emerging Applications •

Electric energy abnormal data detection based on Isolation Forests

HUANG Fu-xing1,2, ZHOU Guang-shan1,2, DING Hong1,2, ZHANG Luo-ping1,2, QIAN Shu-yun3, YUAN Pei-sen3

  1. NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing 211106, China;
  2. NARI Technology Co., Ltd, Nanjing 211106, China;
  3. College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
  • Received: 2019-07-29 Online: 2019-09-25 Published: 2019-10-11
  • Corresponding author: YUAN Pei-sen, male, PhD, lecturer; research interests: intelligent information processing, and massive data processing and analysis. E-mail: peiseny@njau.edu.cn
  • About the author: HUANG Fu-xing, male, master's degree, senior engineer; research interests: electric energy metering, and integrated energy management, control, and services. E-mail: huangfuxing@sgepri.sgcc.com.cn

Abstract: As power systems become increasingly digitized, users' requirements for the quality of electric energy data have gradually risen. Ensuring the accuracy, reliability, and integrity of massive electric energy data is therefore important. This paper uses an anomaly detection algorithm based on Isolation Forests to detect anomalies in large-scale electric energy data. The algorithm builds its model by partitioning the large-scale electric energy dataset to generate random binary trees, which together form an isolation forest. It then detects anomalous data points by computing the distance from each test sample to the root node of every tree. The algorithm processes massive data quickly and gives accurate, reliable results. Detection is performed on the positive active total power (PAP) and reverse active total power (RAP) fields of large-scale electric energy data. The experimental results show that the algorithm has high detection efficiency and high detection accuracy.

Key words: Isolation Forest, anomaly detection, electric energy data
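
The procedure summarized in the abstract (random binary trees built from a sampled dataset, then scoring each test point by its distance to the root of every tree) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a one-dimensional toy version of the general Isolation Forest technique, the function names (`build_tree`, `path_length`, `fit_forest`, `anomaly_score`) are hypothetical, the readings are synthetic stand-ins for a PAP column, and it assumes at least two training values.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(values, depth, max_depth, rng):
    """Recursively split 1-D values at uniformly random cut points."""
    if len(values) <= 1 or depth >= max_depth:
        return {"size": len(values)}              # external (leaf) node
    lo, hi = min(values), max(values)
    if lo == hi:
        return {"size": len(values)}              # identical values cannot be split
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {"split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(x, node, depth=0):
    """Number of edges from the root to the leaf reached by x, plus a leaf-size correction."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x < node["split"] else node["right"]
    return path_length(x, child, depth + 1)

def fit_forest(values, n_trees=100, sample_size=256, seed=0):
    """Build n_trees trees, each from a random subsample of the data (needs >= 2 values)."""
    rng = random.Random(seed)
    psi = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(values, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi

def anomaly_score(x, trees, psi):
    """Score in (0, 1]; values near 1 mean x is isolated after very few splits."""
    mean_depth = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(psi))

# Synthetic stand-in for a PAP column: 200 ordinary readings and one spike.
data_rng = random.Random(42)
readings = [data_rng.gauss(100.0, 5.0) for _ in range(200)] + [500.0]
trees, psi = fit_forest(readings)
print("score of spike   :", anomaly_score(500.0, trees, psi))
print("score of typical :", anomaly_score(100.0, trees, psi))
```

Points that sit far from the bulk of the data are separated by a random cut after only a few splits, so their average path length is short and their score is high. A detection threshold on the score would have to be chosen for the data at hand; the sketch does not assume any particular value.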
