
Backend: Model Hyperparameter Tuning #30

Open
3 tasks
AngelinaZhai opened this issue Mar 1, 2023 · 0 comments

Experiment with hyperparameters on the updated main branch. Convert the following tasks to issues and assign yourself.

  • RNN (single- and bi-directional)
  • LSTM (single- and bi-directional)
  • GRU (single- and bi-directional)
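
The three architectures above share the same directionality switch in PyTorch. A minimal sketch (assuming PyTorch; the helper name and sizes are illustrative, not from this repo):

```python
import torch
import torch.nn as nn

def make_recurrent(kind: str, input_size: int, hidden_size: int,
                   num_layers: int = 1, bidirectional: bool = False) -> nn.Module:
    """Return an RNN, LSTM, or GRU layer with the given direction setting."""
    layers = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}
    return layers[kind](input_size, hidden_size, num_layers=num_layers,
                        batch_first=True, bidirectional=bidirectional)

# A bidirectional layer doubles the feature dimension of the output:
x = torch.randn(4, 10, 8)            # (batch, seq_len, input_size)
uni = make_recurrent("gru", 8, 16)
bi = make_recurrent("gru", 8, 16, bidirectional=True)
out_uni, _ = uni(x)
out_bi, _ = bi(x)
print(out_uni.shape, out_bi.shape)   # (4, 10, 16) and (4, 10, 32)
```

The downstream fully connected layer must account for the doubled feature size when `bidirectional=True`.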

For this task, please focus on tuning the following model parameters:

  • Batch size, number of epochs, learning rate
  • Number of neurons (i.e., the size of the hidden fully connected layers) in the model (this should be modified in the model itself)
  • Number of stacked layers in the model
  • Size of data fed into the network
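
One way to organize runs over the parameters above is a small grid of configurations. A stdlib-only sketch; the specific values are placeholders, not recommendations from this issue:

```python
from itertools import product

# Illustrative tuning grid covering the parameters listed above.
grid = {
    "batch_size":    [32, 64, 128],
    "epochs":        [10, 20],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_size":   [64, 128],    # neurons per hidden layer (set in the model)
    "num_layers":    [1, 2, 3],    # stacked recurrent layers
    "train_frac":    [0.5, 1.0],   # fraction of the dataset fed to the network
}

def configs(grid):
    """Yield one dict per combination in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

all_configs = list(configs(grid))
print(len(all_configs))  # 3 * 2 * 3 * 2 * 3 * 2 = 216 combinations
```

A full grid grows multiplicatively, so in practice you may want to vary one parameter at a time or sample a random subset of combinations.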

Tips

  • Deep learning training can be very taxing on your machine, and your hardware conditions CAN affect the training results and speed. Regularly monitor the program's memory usage (with htop on Linux, Activity Monitor on Mac, and equivalents on Windows). If you notice that you are almost out of memory (for instance, 1.5 GB free out of 16 GB of RAM) and the model is about halfway through training, you may have to stop the training process and reboot the computer to clear your RAM. DO NOT use the restart option; shut the machine down and power it on again manually. A lack of memory can cause the network to stop learning altogether.
  • You do not need to write down every single parameter change, but it is good practice to commit when a change produces a good result and to note what you changed. You can also use a local timeline tracker (e.g., the Timeline view in VS Code) if you need to revert your code to a previous iteration.
  • Looking at the loss and accuracy graphs will give you a good overview of model performance, but you may have to try many things to improve it. In my experience, consulting ChatGPT has been helpful when I describe what I have tried and what the output graphs look like. Do cross-check its suggestions against Google and Stack Overflow, as it can produce misinformation at times.
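
The memory check in the first tip can also be automated inside the training loop. A small sketch, assuming the third-party `psutil` package is available in your environment (the helper name and threshold are illustrative):

```python
import psutil  # third-party: pip install psutil

def memory_ok(min_free_gb: float = 1.5) -> bool:
    """Return False when available RAM drops below the threshold,
    so a training loop can checkpoint and stop instead of thrashing."""
    free_gb = psutil.virtual_memory().available / 2**30
    return free_gb >= min_free_gb

# e.g., checked once per epoch in the training loop:
# for epoch in range(epochs):
#     if not memory_ok():
#         break  # save a checkpoint here before exiting
print(memory_ok(0.0))
```

Stopping cleanly and saving a checkpoint is usually preferable to letting the OS start swapping, which is when training tends to stall or degrade.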

Resources
