This is a TensorFlow implementation of a Double DQN (DDQN) agent [paper]
In this work, I present a simple self-driving car trained with the DDQN algorithm.
The algorithm consists of two main parts: the model and the agent.
- Model
The model is an artificial neural network that consists of five layers.
-> Input layer : represents the distance from the car to the walls on its front, left, and right sides.
-> Hidden layers : there are 3 hidden layers, each containing 32 neurons.
-> Output layer : represents the action the agent takes, based on the network's computation.
There are three possible actions: left, right, and brake (decrease speed).
Model Structure
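The layer layout described above can be sketched in Keras roughly as follows. The layer sizes (3 inputs, three hidden layers of 32 neurons, 3 outputs) come from the description; the ReLU activations, the Adam optimizer, the learning rate, and the name `build_q_network` are assumptions made for illustration and may differ from the code in this repository.

```python
import tensorflow as tf

N_SENSORS = 3   # front, left, and right wall distances (from the description)
N_ACTIONS = 3   # left, right, brake (from the description)


def build_q_network(learning_rate=1e-3):
    """Build a 3-32-32-32-3 fully connected Q-network.

    ReLU, Adam, and the learning rate are assumptions, not taken from the repo.
    """
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(N_SENSORS,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        # Linear output: one Q-value per possible action.
        tf.keras.layers.Dense(N_ACTIONS, activation=None),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse",
    )
    return model


model = build_q_network()
```

With a network like this, the agent would typically pick the action whose output (Q-value) is largest for the current sensor readings.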
- Agent
The Q-learning agent uses a Q-table, which is a multi-dimensional table
containing the different states of the environment alongside the score for each state.
The Q-table entry for each state is normally updated with the corresponding score for that state.
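For reference, the textbook tabular Q-learning update can be sketched as below. This is a generic illustration, not code from this repository: the table is indexed by state and action, as is standard in Q-learning, and the table size, `ALPHA`, `GAMMA`, and the function name `q_update` are all hypothetical.

```python
import numpy as np

N_STATES = 5    # hypothetical number of discretized states
N_ACTIONS = 3   # left, right, brake
ALPHA = 0.1     # learning rate (assumed)
GAMMA = 0.99    # discount factor (assumed)

# One row per state, one column per action, initialized to zero.
q_table = np.zeros((N_STATES, N_ACTIONS))


def q_update(q_table, state, action, reward, next_state, done):
    """Apply one tabular Q-learning update in place and return the new entry."""
    # No future value is added once the episode has ended.
    best_next = 0.0 if done else np.max(q_table[next_state])
    td_target = reward + GAMMA * best_next
    q_table[state, action] += ALPHA * (td_target - q_table[state, action])
    return q_table[state, action]


new_value = q_update(q_table, state=0, action=2, reward=1.0, next_state=1, done=False)
```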
With the DDQN algorithm, however, we use two models, known as Q-eval and Q-target.
Q-eval is the model that is trained and that selects actions.
Q-target, on the other hand, is not trained; instead, we only update its weights every specified number of episodes.
The role of Q-target is to provide the update targets, so that the agent does not get stuck in one area and settle for low scores.
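To make the split between the two models concrete, here is a minimal NumPy sketch of how Double DQN targets are usually computed: Q-eval chooses the best next action, and Q-target supplies the value of that action. The function name, argument names, and the example numbers are illustrative assumptions, and this repository may organize the computation differently.

```python
import numpy as np


def double_dqn_targets(q_eval_next, q_target_next, rewards, dones, gamma=0.99):
    """Compute Double DQN regression targets for a batch of transitions.

    q_eval_next:   Q-eval outputs for the next states, shape (batch, n_actions)
    q_target_next: Q-target outputs for the next states, same shape
    rewards:       shape (batch,)
    dones:         1.0 where the episode ended, else 0.0, shape (batch,)
    """
    # Action selection uses Q-eval ...
    best_actions = np.argmax(q_eval_next, axis=1)
    # ... while the value of the selected action comes from Q-target.
    batch_idx = np.arange(len(rewards))
    next_values = q_target_next[batch_idx, best_actions]
    # Terminal transitions keep only the immediate reward.
    return rewards + gamma * next_values * (1.0 - dones)


# Made-up numbers for a batch of two transitions.
q_eval_next = np.array([[1.0, 3.0, 2.0], [0.5, 0.1, 0.2]])
q_target_next = np.array([[10.0, 20.0, 30.0], [4.0, 5.0, 6.0]])
rewards = np.array([1.0, -1.0])
dones = np.array([0.0, 1.0])

targets = double_dqn_targets(q_eval_next, q_target_next, rewards, dones, gamma=0.5)
```

In the first transition Q-eval prefers action 1, so the target uses Q-target's value for action 1 rather than Q-target's own maximum; this decoupling is what distinguishes Double DQN from plain DQN.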
Further explanation can be found in the original paper.
We minimize the mean squared error between Q and Q*, while Q' (Q-target) slowly tracks the parameters of Q (Q-eval). We do this by periodically hard-copying the parameters of Q over to Q'.
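The periodic hard copy can be sketched with the Keras `get_weights` / `set_weights` methods, as below. The interval `SYNC_EVERY`, the helper names `make_net` and `maybe_sync`, and the small stand-in network are hypothetical and chosen only to keep the example short; the mean squared error itself would be set as the loss when compiling Q-eval.

```python
import tensorflow as tf

SYNC_EVERY = 10  # hypothetical number of episodes between hard copies


def make_net():
    # Small stand-in network; the layer sizes here are illustrative only.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(3,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(3),
    ])


q_eval = make_net()      # trained every step
q_target = make_net()    # never trained directly
q_target.set_weights(q_eval.get_weights())  # start both networks in sync


def maybe_sync(episode):
    """Hard-copy Q-eval's weights into Q-target every SYNC_EVERY episodes."""
    if episode % SYNC_EVERY == 0:
        q_target.set_weights(q_eval.get_weights())
        return True
    return False


# Record the episodes at which a copy happens during 30 episodes.
synced_at = [ep for ep in range(1, 31) if maybe_sync(ep)]
```

Between copies, Q-target stays frozen, so the targets it produces change only at the sync points.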
Model/ : includes different trained models saved in H5 format.
Track/ : contains the structure of the track, to easily train the model on various tracks.
Learning rate graphs/ : includes the graphical representation of the training (score & average score).
The code is developed using Python 3.9.0 on Windows 10. An NVIDIA GPU (GT 340M or above) is needed to train and test.
See requirements.txt for other dependencies.
This code is freely available for non-commercial use and may be redistributed under these conditions. Please see the LICENSE for further details. Third-party datasets and software are subject to their respective licenses.