Anomaly detection of transformer loss data based on a robust random cut forest
Received date: 2020-09-11
Online published: 2021-11-26
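ZHANG Guofang, WEN Lili, WU Meng, LIU Tongyu, ZHENG Kuanyun, HUANG Fuxing, YUAN Peisen. Anomaly detection of transformer loss data based on a robust random cut forest[J]. Journal of East China Normal University (Natural Science), 2021, 2021(6): 135-146. DOI: 10.3969/j.issn.1000-5641.2021.06.014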
With the rapid development of smart grids, building new digital infrastructure has become a core business of power companies, and the governance and intelligent analysis of their data create the conditions for business model innovation such as platform operation and data monetization. Against this background of digitalized and intelligent power-data governance, this paper uses the robust random cut forest algorithm to detect anomalies in transformer loss data. The algorithm partitions the sample points with random cuts to build a robust random cut forest model, and it assigns each sample point an anomaly score according to how much inserting or deleting that point changes the complexity of the model structure. The method suits anomaly detection on real-time loss data, balances detection quality against running time, and yields credible results. Experiments on a real transformer loss dataset show that the algorithm is efficient and flexible, and that its precision, recall, and running efficiency are significantly better than those of the other methods compared.
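The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified batch setting in which every tree is built over all points at once, whereas the streaming algorithm inserts and deletes points incrementally and works on subsamples. The score used is the collusive displacement of the original robust random cut forest formulation: for each point, the largest ratio of sibling-subtree size to own-subtree size along its path to the root, averaged over the trees. The function names, parameters, and the two-column loss readings are hypothetical.

```python
import random


def codisp_one_tree(points, rng):
    """Build one random cut tree over `points` and return one score per point."""
    scores = [0.0] * len(points)
    dims = len(points[0])

    def split(idx, best):
        # `best` is the largest sibling-size / subtree-size ratio seen so far
        # on the path from the root down to this subtree.
        lo = [min(points[i][d] for i in idx) for d in range(dims)]
        hi = [max(points[i][d] for i in idx) for d in range(dims)]
        spans = [h - l for l, h in zip(lo, hi)]
        if len(idx) == 1 or sum(spans) == 0:
            # A leaf, or a group of identical points that no cut can separate.
            for i in idx:
                scores[i] = best
            return
        while True:
            # Choose the cut dimension with probability proportional to its
            # range, then a uniform cut position inside that range.
            d = rng.choices(range(dims), weights=spans)[0]
            cut = rng.uniform(lo[d], hi[d])
            left = [i for i in idx if points[i][d] <= cut]
            right = [i for i in idx if points[i][d] > cut]
            if left and right:
                break
        # Removing one side would displace the other side by its own size.
        split(left, max(best, len(right) / len(left)))
        split(right, max(best, len(left) / len(right)))

    split(list(range(len(points))), 0.0)
    return scores


def forest_scores(points, n_trees=100, seed=0):
    """Average the per-tree scores over `n_trees` independent random trees."""
    rng = random.Random(seed)
    totals = [0.0] * len(points)
    for _ in range(n_trees):
        for i, s in enumerate(codisp_one_tree(points, rng)):
            totals[i] += s
    return [t / n_trees for t in totals]


# Hypothetical readings: (no-load loss, load loss) in arbitrary units.
data_rng = random.Random(1)
readings = [(1.0 + data_rng.gauss(0, 0.05), 2.0 + data_rng.gauss(0, 0.05))
            for _ in range(50)]
readings.append((1.0, 9.0))  # one injected abnormal reading
scores = forest_scores(readings)
most_anomalous = max(range(len(readings)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 2))
```

A point that a single cut separates from a large group receives a high score, which is why the injected reading should rank well above the clustered ones. A streaming variant would maintain the trees incrementally instead of rebuilding them for each batch.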