An optimized deep neural network (DNN) trained on the MNIST dataset, written without any frameworks, following the book "Grokking Deep Learning" by Andrew W. Trask (Manning Publications, 2019). It uses mini-batch gradient descent with a batch size of 100, tanh as the activation function for the hidden layer, and softmax for the output layer. To avoid overfitting, dropout is used to regularize the network. During backpropagation, the hidden layer's delta (pure error) values are computed from the output layer's delta, which is propagated backwards and multiplied by the corresponding weights connecting the hidden layer to the output. Other components assist gradient descent, such as the alpha (learning rate) value, which prevents overshooting. The network reaches a test accuracy of 87%.

A basic, introductory deep learning project for anyone willing to dive into the world of neurons (or just the code for anyone working through the book mentioned above and in need of it).
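Below is a minimal sketch of the core training loop described above: a tanh hidden layer with dropout, a softmax output layer, and mini-batch gradient descent with a batch size of 100. It is not the repo's exact code; the data arrays are random stand-ins shaped like flattened MNIST, and names like `weights_0_1`, `hidden_size`, and `alpha` are illustrative.

```python
import numpy as np

np.random.seed(1)

def tanh(x):
    return np.tanh(x)

def tanh_deriv(output):
    return 1 - output ** 2  # derivative expressed in terms of the tanh output

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Stand-in data shaped like flattened MNIST (784 inputs, 10 one-hot classes);
# swap in the real images/labels when running the actual project.
images = np.random.rand(1000, 784)
labels = np.eye(10)[np.random.randint(0, 10, 1000)]

alpha, hidden_size, batch_size, epochs = 2.0, 100, 100, 30

weights_0_1 = 0.02 * np.random.random((784, hidden_size)) - 0.01
weights_1_2 = 0.2 * np.random.random((hidden_size, 10)) - 0.1

for epoch in range(epochs):
    for i in range(len(images) // batch_size):
        batch = slice(i * batch_size, (i + 1) * batch_size)
        layer_0 = images[batch]

        # Forward pass: tanh hidden layer with dropout, softmax output
        layer_1 = tanh(layer_0.dot(weights_0_1))
        dropout_mask = np.random.randint(2, size=layer_1.shape)
        layer_1 *= dropout_mask * 2          # keep the expected activation scale
        layer_2 = softmax(layer_1.dot(weights_1_2))

        # Backward pass: output delta (pure error) propagated back through
        # weights_1_2 to get the hidden layer's delta
        layer_2_delta = (layer_2 - labels[batch]) / batch_size
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * tanh_deriv(layer_1)
        layer_1_delta *= dropout_mask        # dropped units receive no gradient

        # Mini-batch gradient descent update, scaled by alpha to avoid overshooting
        weights_1_2 -= alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 -= alpha * layer_0.T.dot(layer_1_delta)
```

The dropout mask zeroes out roughly half of the hidden units each batch and doubles the survivors, so the expected input to the output layer stays the same at test time when dropout is switched off.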