Proximal_Policy_Optimization
Code description:
Description: Reinforcement learning methods can be divided into two families, value-based and policy-based, according to what they learn. In deep reinforcement learning, combining deep learning with the value-based Q-learning algorithm produced the DQN algorithm, which uses an experience replay buffer and a target network to bring deep learning into reinforcement learning successfully.
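The two DQN ingredients named in the description, the experience replay buffer and the target network, can be illustrated with a minimal sketch. This is not taken from the listed files: the names ReplayBuffer, td_target and sync_target are hypothetical, network parameters are stood in for by a plain dict, and the discount factor default of 0.99 is an assumption.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size pool of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling reduces the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target, using Q-values from the target network."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_params, target_params):
    """Hard update: copy the online parameters into the target network."""
    target_params.clear()
    target_params.update(online_params)
```

In a training loop one would typically push each transition, sample a mini-batch once the buffer holds enough data, regress the online network toward td_target, and call sync_target every fixed number of steps so that the targets change slowly.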
File list:
Proximal_Policy_Optimization, 0 , 2019-04-08
Proximal_Policy_Optimization\discrete_DPPO.py, 8808 , 2019-01-21
Proximal_Policy_Optimization\DPPO.py, 8270 , 2019-01-21
Proximal_Policy_Optimization\simply_PPO.py, 6458 , 2019-01-21