Playing Atari with Deep Reinforcement Learning

About

A replication of Google DeepMind's paper "Playing Atari with Deep Reinforcement Learning".

Dependencies

NumPy
TensorFlow
Matplotlib
OpenAI Gym

Getting started

The network architecture is defined in DQN.py. The file replayMemory.py stores and manages the transitions created during training. The file main.py runs the whole program and can be called with

python3 main.py

The network is saved in the folder myModel, while the TensorBoard files are written to the folder results.
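As a rough illustration of what a transition store like the one in replayMemory.py does, here is a minimal sketch of a fixed-capacity replay buffer with uniform sampling. The names `Transition`, `ReplayBuffer`, `push` and `sample` are hypothetical and chosen for this example; the actual class in the repository may be organised differently.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition layout; the repository's replayMemory.py may differ.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity FIFO store of transitions with uniform random sampling."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        """Store one transition observed while interacting with the game."""
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return batch_size distinct transitions chosen uniformly at random."""
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly from past transitions, rather than training on consecutive frames, breaks the correlation between successive samples, which is the motivation for experience replay given in the paper.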

Result

I implemented DQN on the games Pong and Breakout. I first used the hyperparameters given in the Nature paper, but the agent was not able to learn any policy better than a random one. The agent was outputting the same q(s,a) for different states, possibly because of the dead neuron problem.

This can happen when a large gradient changes the weights feeding a neuron in such a way that the neuron always produces a very negative logit, so a rectifier activation stays at zero and the neuron stops learning. So, even though the learning rate is the one given in the paper and the architecture is the same, a small difference in the setup, such as a different weight initializer or a different frame preprocessing, could make the learning rate of 0.00025 no longer the optimal one.
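One way to check this hypothesis is to look at a batch of states and measure how many units never activate, and how much the predicted Q-values vary across states. The sketch below assumes ReLU activations and that the pre-activations and Q-values have already been extracted into NumPy arrays; both helper functions are hypothetical and not part of this repository.

```python
import numpy as np


def dead_unit_fraction(pre_activations):
    """Fraction of units whose ReLU output is zero for every input in the batch.

    pre_activations: array of shape (batch_size, num_units) with the values
    fed into the ReLU (hypothetical input; extracting it depends on the model).
    """
    pre_activations = np.asarray(pre_activations)
    # A unit counts as dead on this batch if its pre-activation is never positive.
    dead = np.all(pre_activations <= 0, axis=0)
    return float(np.mean(dead))


def q_value_spread(q_values):
    """Largest per-action standard deviation of Q-values across states.

    q_values: array of shape (num_states, num_actions). A result near zero
    means the network returns almost the same q(s, a) whatever the state is.
    """
    q_values = np.asarray(q_values)
    return float(np.max(np.std(q_values, axis=0)))
```

A high dead-unit fraction together with a spread close to zero would be consistent with the explanation above, although it does not prove that the learning rate is the cause.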

In the end, I decided to use the hyperparameters from this article. More precisely:

Learning Rate: 0.0001
Target Update Frequency: 1000 training steps
Initialize Replay Buffer: 10000 transitions
Epsilon Decay: 100000 steps
Final Epsilon: 0.01
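To make these settings concrete, the sketch below shows how they could drive the exploration schedule and the target network updates. It assumes a linear decay and an initial epsilon of 1.0, neither of which is stated above; the constant and function names are hypothetical and do not come from main.py.

```python
LEARNING_RATE = 0.0001
TARGET_UPDATE_FREQUENCY = 1000  # training steps between target network syncs
REPLAY_START_SIZE = 10000       # transitions collected before training starts
EPSILON_DECAY_STEPS = 100000
FINAL_EPSILON = 0.01
INITIAL_EPSILON = 1.0           # assumption: not given in the list above


def epsilon_at(step):
    """Exploration rate at a given step, assuming a linear decay."""
    fraction = min(step / EPSILON_DECAY_STEPS, 1.0)
    return INITIAL_EPSILON + fraction * (FINAL_EPSILON - INITIAL_EPSILON)


def should_update_target(training_step):
    """True when the target network should copy the online network's weights."""
    return training_step > 0 and training_step % TARGET_UPDATE_FREQUENCY == 0


def can_start_training(replay_size):
    """True once the replay buffer holds enough transitions to sample from."""
    return replay_size >= REPLAY_START_SIZE
```

With these values, the agent acts randomly about half of the time after 50000 steps and settles at 1% random actions from step 100000 onwards.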

Moreover, I used the Adam optimizer instead of RMSProp, since in my experiments it gave better results in a shorter period of time. My DQN required 700 episodes (around 5 hours and 20 minutes) to master Pong, and more than 3000 episodes (around 13 hours and 30 minutes) to play Breakout decently.

Sources

arXiv paper by DeepMind
Nature paper by DeepMind
Full article about the paper
