Urban Traffic Signal Control Algorithm Based on Deep Reinforcement Learning
  • English title: Urban traffic signal control based on deep reinforcement learning
  • Authors: 舒凌洲; 吴佳; 王晨
  • English authors: SHU Lingzhou; WU Jia; WANG Chen; School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Keywords: deep learning; convolutional neural network; reinforcement learning; traffic signal control
  • English keywords: deep learning; Convolutional Neural Network (CNN); reinforcement learning; traffic signal control
  • Journal code: JSJY
  • English journal title: Journal of Computer Applications
  • Affiliation: School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Publication date: 2019-01-28 10:49
  • Published in: 计算机应用 (Journal of Computer Applications)
  • Year: 2019
  • Issue: v.39; No.345
  • Funding: National Natural Science Foundation of China (61503059)
  • Language: Chinese
  • Record number: JSJY201905044
  • Page count: 5
  • Issue number: 05
  • CN: 51-1307/TP
  • Pages: 255-259
Abstract
        To make effective use of traffic information while guaranteeing the adaptivity and robustness of the control algorithm in urban traffic signal control, a traffic signal control algorithm based on Deep Reinforcement Learning (DRL) was proposed, in which an agent constructed from a deep learning network controls the traffic of a whole region. Firstly, the agent selects the likely optimal control strategy for the current state by continuously observing the state of the traffic environment, which is abstracted as a position matrix and a speed matrix; this matrix representation captures the essential information of the environment while reducing redundant information. Then, with the goal of maximizing the global vehicle speed within a finite period, the agent uses a reinforcement learning algorithm to continually correct its internal parameters according to the effect of the selected strategy on the traffic environment. Finally, after multiple iterations, the agent learns how to control the traffic effectively. Experiments in the microscopic traffic simulation software Vissim show that, compared with other DRL-based algorithms, the proposed algorithm achieves better results in average global speed, average queue length, and stability; compared with the baseline, the average speed increases by 9% and the average queue length decreases by about 13.4%. The experimental results verify that the proposed method can adapt to complex and dynamically changing traffic environments.
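The abstract describes the traffic state as a pair of matrices (vehicle positions and speeds) consumed by a CNN-based agent, with a reward tied to the global vehicle speed. The Python sketch below illustrates one plausible reading of that state encoding and reward signal; it is not the authors' code, and the cell length, lane/cell counts, speed cap, the (lane, distance, speed) tuple format of the vehicle readout from Vissim, and the mean-speed reward shaping are illustrative assumptions not specified on this page.

# Hypothetical sketch, not the authors' implementation: build the position/speed
# matrices described in the abstract from a list of observed vehicles.
import numpy as np

CELL_LENGTH_M = 5.0    # assumed length of one grid cell along an approach lane
MAX_SPEED_MS = 14.0    # assumed speed used to normalize the speed matrix (~50 km/h)

def encode_state(vehicles, n_lanes=8, n_cells=40):
    """Return a 2-channel state tensor of shape (2, n_lanes, n_cells).

    vehicles: iterable of (lane_index, distance_to_stop_line_m, speed_ms) tuples,
              e.g. read from the simulator (the Vissim query itself is not shown).
    Channel 0 is a binary occupancy (position) matrix; channel 1 holds speeds
    normalized to [0, 1].
    """
    position = np.zeros((n_lanes, n_cells), dtype=np.float32)
    speed = np.zeros((n_lanes, n_cells), dtype=np.float32)
    for lane, dist, v in vehicles:
        cell = int(dist // CELL_LENGTH_M)
        if 0 <= lane < n_lanes and 0 <= cell < n_cells:
            position[lane, cell] = 1.0
            speed[lane, cell] = min(v / MAX_SPEED_MS, 1.0)
    # Stacking the two matrices gives a 2-channel "image" a CNN-based agent can consume.
    return np.stack([position, speed], axis=0)

def global_speed_reward(vehicles):
    """Reward tied to the global vehicle speed, as the abstract describes;
    the mean speed used here is only one plausible shaping."""
    speeds = [v for _, _, v in vehicles]
    return float(np.mean(speeds)) if speeds else 0.0

if __name__ == "__main__":
    demo = [(0, 3.2, 0.0), (0, 9.1, 2.5), (3, 41.0, 12.8)]
    print(encode_state(demo).shape)             # (2, 8, 40)
    print(round(global_speed_reward(demo), 2))  # 5.1

In the paper's setup the agent would map such a state tensor to a signal-phase action and update its parameters with a deep reinforcement learning method; those training details go beyond what the abstract states and are therefore not sketched here.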
