A Tutorial on Event-Based Optimization with Application in Energy Internet
  • English title: A tutorial on event-based optimization with application in energy Internet
  • Authors: JIA Qing-Shan; YANG Yu; XIA Li; GUAN Xiaohong
  • Affiliations: Center for Intelligent and Networked Systems, Department of Automation, Tsinghua University; State Key Laboratory for Intelligent Network and Network Security of Ministry of Education, Xi'an Jiaotong University
  • Keywords: event-based; performance potential; Q-factors; performance difference; simulation-based optimization; energy internet
  • Journal: Control Theory & Applications (控制理论与应用)
  • Journal code: KZLY
  • Publication date: 2018-01-15
  • Year: 2018
  • Volume: 35
  • Issue: 01
  • Pages: 35-43 (9 pages)
  • Article ID: KZLY201801005
  • CN: 44-1240/TP
  • Funding: National Key R&D Program of China (2016YFB0901900); National Natural Science Foundation of China (61673229, 61174072, 61222302, 91224008, 61221063, U1301254)
  • Language: Chinese
Abstract
        In many practical systems, control or decision making is triggered by certain events. Such systems are classified as discrete event dynamic systems (DEDSs). For the performance optimization of these systems, this paper introduces an optimization framework called event-based optimization (EBO). Compared with the Markov decision process (MDP), the defining characteristic of EBO is that decisions are made based on "events" rather than states, which brings several advantages. First, an event usually corresponds to a set of state transitions sharing common properties, and the number of events requiring decisions is typically much smaller than the number of states. EBO can therefore exploit this event structure to aggregate performance potentials, alleviating the curse of dimensionality. Second, EBO applies to many practical problems in which actions are required only when certain events occur. Such problems do not fit the standard MDP formulation well, because MDP treats the decisions at different states as independent, whereas the same event may correspond to many different states at which the same action should be taken. Building on the basic theory of MDP, this paper addresses three aspects. First, we review the basic ideas of EBO and the development of its theory and applications. Second, we introduce simulation-based policy iteration methods for EBO based on performance potentials or event-based Q-factors. Third, a case study is conducted on coordinating electric vehicle charging with the distributed wind power generation of a building, which aims to shed light on the application of EBO in the energy Internet.
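To make the abstract's central idea concrete — decisions attached to events rather than states, with event-based Q-factors estimated by simulation — the following is a minimal sketch on a hypothetical toy model (not taken from the paper): wind availability flips randomly, EVs arrive at random steps, and a charging decision is needed only at the "EV arrives" event. All names, probabilities, and rewards are illustrative assumptions.

```python
import random

# Hypothetical toy model (illustrative only): wind level flips randomly,
# EVs arrive at random steps, and a decision is needed only at the
# "EV arrives" event -- charge from grid (action 0) or from wind (action 1).
WIND_FLIP_P = 0.3
ARRIVAL_P = 0.5
EVENTS = [("arrival", 0), ("arrival", 1)]   # (event type, wind level)
ACTIONS = [0, 1]                            # 0 = grid, 1 = wind

def reward(wind, action):
    # Grid charging gives a fixed payoff; wind charging pays off only
    # when wind is actually available.
    return 0.5 if action == 0 else (1.0 if wind == 1 else 0.0)

def estimate_event_q(n_steps=20000, rng=None):
    """Estimate event-based Q-factors by simulation under a uniformly
    random exploration policy. In this toy the action does not change
    the dynamics, so the Q-factor reduces to the mean one-step reward;
    in general it also includes the aggregated performance potential
    of the post-event state."""
    rng = rng or random.Random(0)
    totals = {(e, a): 0.0 for e in EVENTS for a in ACTIONS}
    counts = {(e, a): 0 for e in EVENTS for a in ACTIONS}
    wind = 0
    for _ in range(n_steps):
        if rng.random() < WIND_FLIP_P:
            wind = 1 - wind
        if rng.random() < ARRIVAL_P:          # the decision-triggering event
            event = ("arrival", wind)
            action = rng.choice(ACTIONS)      # explore both actions
            totals[(event, action)] += reward(wind, action)
            counts[(event, action)] += 1
    return {k: totals[k] / max(counts[k], 1) for k in totals}

def improve_policy(q):
    # Greedy step of event-based policy iteration: one action per event,
    # shared by every underlying state that the event aggregates.
    return {e: max(ACTIONS, key=lambda a: q[(e, a)]) for e in EVENTS}

q = estimate_event_q()
policy = improve_policy(q)
print(policy)
```

Note that the improved policy assigns a single action per event, not per state — the dimensionality-reduction benefit the abstract describes: here two events stand in for every underlying (wind, queue, arrival) configuration.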
