Parallel Dynamic Programming

8979 views | 2017-3-7 14:06 | Category: Blog News

PDP: Parallel Dynamic Programming

Fei-Yue Wang, Fellow, IEEE, Jie Zhang, Member, IEEE, Qinglai Wei, Member, IEEE, Xinhu Zheng, Student Member, IEEE, and Li Li, Fellow, IEEE

Institute of Automation, Chinese Academy of Sciences, National University of Defense Technology, Qingdao Academy of Intelligent Industries, Tsinghua University, China, University of Minnesota, USA

Abstract: Deep reinforcement learning is a focus research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive dynamic programming (ADP) is first presented instead of direct dynamic programming (DP), and the inherent relationship between ADP and deep reinforcement learning is developed. Next, analytics intelligence, as the necessary requirement, for the real reinforcement learning, is discussed. Finally, the principle of the parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future computational intelligence.

Index Terms: Parallel dynamic programming, Dynamic programming, Adaptive dynamic programming, Reinforcement learning, Deep learning, Neural networks, Artificial intelligence.

Citation: F.-Y. Wang, J. Zhang, Q. L. Wei, X. H. Zheng, and L. Li, “PDP: parallel dynamic programming,” IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 1, pp. 1-5, Jan. 2017.

Full Text-PDF:

PDP Parallel Dynamic Programming.pdf