
Backend: Model Hyperparameter Tuning #30

Open
3 tasks
AngelinaZhai opened this issue Mar 1, 2023 · 0 comments

Experiment with hyperparameters on the updated main branch. Convert the following tasks to issues and assign yourself.

  • RNN (single- and bi-directional)
  • LSTM (single- and bi-directional)
  • GRU (single- and bi-directional)
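
The three architectures above share the same directionality switch in PyTorch. A minimal sketch (assuming PyTorch; the helper name and sizes are illustrative, not from this repo):

```python
import torch
import torch.nn as nn

def make_recurrent(kind: str, input_size: int, hidden_size: int,
                   num_layers: int = 1, bidirectional: bool = False) -> nn.Module:
    """Return an RNN, LSTM, or GRU layer with the given direction setting."""
    layers = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}
    return layers[kind](input_size, hidden_size, num_layers=num_layers,
                        batch_first=True, bidirectional=bidirectional)

# A bidirectional layer doubles the feature dimension of the output:
x = torch.randn(4, 10, 8)            # (batch, seq_len, input_size)
uni = make_recurrent("gru", 8, 16)
bi = make_recurrent("gru", 8, 16, bidirectional=True)
out_uni, _ = uni(x)
out_bi, _ = bi(x)
print(out_uni.shape, out_bi.shape)   # (4, 10, 16) and (4, 10, 32)
```

The downstream fully connected layer must account for the doubled feature size when `bidirectional=True`.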

For this task, please focus on tuning the following model parameters:

  • Batch size, number of epochs, learning rate
  • Number of neurons (i.e., the size of the hidden fully connected layers) in the model (this should be modified in the model itself)
  • Number of stacked layers in the model
  • Size of data fed into the network
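
One way to organize runs over the parameters above is a small grid of configurations. A stdlib-only sketch; the specific values are placeholders, not recommendations from this issue:

```python
from itertools import product

# Illustrative tuning grid covering the parameters listed above.
grid = {
    "batch_size":    [32, 64, 128],
    "epochs":        [10, 20],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_size":   [64, 128],    # neurons per hidden layer (set in the model)
    "num_layers":    [1, 2, 3],    # stacked recurrent layers
    "train_frac":    [0.5, 1.0],   # fraction of the dataset fed to the network
}

def configs(grid):
    """Yield one dict per combination in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

all_configs = list(configs(grid))
print(len(all_configs))  # 3 * 2 * 3 * 2 * 3 * 2 = 216 combinations
```

A full grid grows multiplicatively, so in practice you may want to vary one parameter at a time or sample a random subset of combinations.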

Tips

  • Deep learning training can be very taxing on your machine, and your hardware conditions CAN affect the training results and speed. Regularly monitor the program's memory usage (with htop on Linux, Activity Monitor on Mac, and equivalents on Windows). If you notice that you are almost out of memory (for instance, 1.5 GB free out of 16 GB of RAM) and the model is about halfway through training, you may have to stop the training process and reboot the computer to clear your RAM. DO NOT use the restart option; shut the machine down and power it on again manually. A lack of memory can cause the network to stop learning altogether.
  • You do not need to write down every single parameter change, but it is good practice to commit when a change produces a good result and to note what you changed. You can also use a local timeline tracker (e.g., the Timeline view in VS Code) if you need to revert your code to a previous iteration.
  • Looking at the loss and accuracy graphs will give you a good overview of model performance, but you may have to try many things to improve it. In my experience, consulting ChatGPT has been helpful when I describe what I have tried and what the output graphs look like. Do cross-check its suggestions against Google and Stack Overflow, as it can produce misinformation at times.
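
The memory check in the first tip can also be automated inside the training loop. A small sketch, assuming the third-party `psutil` package is available in your environment (the helper name and threshold are illustrative):

```python
import psutil  # third-party: pip install psutil

def memory_ok(min_free_gb: float = 1.5) -> bool:
    """Return False when available RAM drops below the threshold,
    so a training loop can checkpoint and stop instead of thrashing."""
    free_gb = psutil.virtual_memory().available / 2**30
    return free_gb >= min_free_gb

# e.g., checked once per epoch in the training loop:
# for epoch in range(epochs):
#     if not memory_ok():
#         break  # save a checkpoint here before exiting
print(memory_ok(0.0))
```

Stopping cleanly and saving a checkpoint is usually preferable to letting the OS start swapping, which is when training tends to stall or degrade.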

Resources
