Collection of DQNs that run in OpenAI Gym.

Simple Policy Learning Network: As the name suggests, this network learns a policy for CartPole balancing, a basic reinforcement learning problem. The architecture is 4x8x2: 4 inputs (the CartPole observation), 8 hidden units, and 2 outputs (one per action). The learning rate is 0.3 and the discount factor (gamma) is 0.99. On average, it currently takes around 500 episodes to finish the task.
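
For illustration, here is a minimal sketch of what a 4x8x2 policy network and the gamma = 0.99 discounting could look like. It is not the code in this repository. It uses only the Python standard library, and the tanh hidden activation, the softmax output, the uniform weight initialization, and the helper names (`init_layer`, `dense`, `action_probs`, `discounted_returns`) are all assumptions made for the example. The training update that would use the 0.3 learning rate is not shown.

```python
import math
import random

GAMMA = 0.99  # discount factor quoted in the description above


def init_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases (assumed scheme)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Affine layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        for j in range(len(biases))
    ]


def action_probs(obs, layer1, layer2):
    """Forward pass: 4 observation values -> 8 hidden units -> 2 action probabilities."""
    hidden = [math.tanh(h) for h in dense(obs, *layer1)]  # tanh is an assumption
    logits = dense(hidden, *layer2)
    peak = max(logits)  # subtract the max before exponentiating for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma=GAMMA):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


rng = random.Random(0)
layer1 = init_layer(4, 8, rng)  # input -> hidden
layer2 = init_layer(8, 2, rng)  # hidden -> action logits

# An all-zero observation with zero biases gives equal logits, so both
# actions are equally likely before any training.
probs = action_probs([0.0, 0.0, 0.0, 0.0], layer1, layer2)

# CartPole gives a reward of 1 per step; earlier steps accumulate more return.
returns = discounted_returns([1.0, 1.0, 1.0])
```

In a policy-gradient setup, the discounted returns would typically weight the log-probability of each action taken, so actions early in a long episode receive more credit than actions just before the pole falls.
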
I plan to expand this repository soon by training value-learning and policy-learning DQNs for Atari games.