Power Streaming Big Data Analysis Architecture and Application Based on Spark Streaming
  • Original title (Chinese): 基于Spark Streaming的电力流式大数据分析架构及应用
  • Authors: TIAN Lu; QI Linhai; LI Qing; WANG Hong; TIAN Shiming; BU Fanpeng (田璐; 齐林海; 李青; 王红; 田世明; 卜凡鹏)
  • Affiliations: School of Control and Computer Engineering, North China Electric Power University; China Electric Power Research Institute Co., Ltd.
  • Keywords: Spark Streaming; power streaming big data; power data analysis; anomaly detection
  • Journal: Electric Power Information and Communication Technology (电力信息与通信技术)
  • Journal code: DXXH
  • Publication date: 2019-02-15
  • Year: 2019
  • Issue: v.17; No.186
  • Language: Chinese
  • Page: DXXH201902004
  • Page count: 7
  • CN: 02
  • ISSN: 10-1164/TK
  • Classification number: 27-33
Abstract
In recent years, big data stream computing has been studied to meet the real-time requirements of many business applications. This paper proposes a general architecture for power streaming big data analysis built on open-source components: Apache Hadoop, Spark Streaming, Kafka, and the NoSQL database Cassandra. By combining high-throughput publish-subscribe messaging, real-time computing, and distributed storage, the architecture addresses the collection, storage, and real-time analysis of concurrently accessed data streams, enabling real-time analysis of streaming data in the power industry. Finally, a real-time anomaly detection system for electricity consumption data is built to verify the architecture's performance.
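The collect, analyze, and store flow outlined in the abstract can be illustrated with a minimal, library-free sketch. This is not the paper's implementation: the record schema `(meter_id, timestamp, load_kw)`, the function names, the count-based micro-batching (Spark Streaming batches by time interval), and the rolling z-score rule are all assumptions chosen for illustration. A plain Python iterable stands in for the Kafka topic and a list stands in for the Cassandra table.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record schema: (meter_id, timestamp, load_kw).
Reading = Tuple[str, int, float]


@dataclass
class RunningStats:
    """Welford running mean and variance for one meter."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        # Sample standard deviation; 0.0 until two values have been seen.
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def micro_batches(stream: Iterable, size: int) -> Iterator[List]:
    """Group a stream into fixed-size micro-batches (assumed simplification:
    batching by count rather than by time interval)."""
    batch: List = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # emit the final partial batch


def detect_anomalies(stream: Iterable[Reading], batch_size: int = 4,
                     k: float = 3.0, warmup: int = 5) -> List[Reading]:
    """Flag readings more than k standard deviations from the meter's running mean."""
    stats: Dict[str, RunningStats] = defaultdict(RunningStats)
    store: List[Reading] = []  # stands in for the distributed storage layer
    for batch in micro_batches(stream, batch_size):
        for meter_id, ts, load in batch:
            s = stats[meter_id]
            spread = s.std()
            if s.n >= warmup and spread > 0 and abs(load - s.mean) > k * spread:
                store.append((meter_id, ts, load))  # anomaly: record it, keep it out of the stats
            else:
                s.update(load)  # normal reading: fold it into the running statistics
    return store


if __name__ == "__main__":
    loads = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 50.0, 10.0]
    readings = [("m1", i, v) for i, v in enumerate(loads)]
    print(detect_anomalies(readings))
```

Keeping flagged readings out of the running statistics is a design choice of this sketch, so that a single spike does not inflate the baseline used to judge later readings.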
