Validation and early stopping during training #883

Is there a way to evaluate the model's performance during training on a validation dataset and only save a new checkpoint if it achieves a lower validation loss?

Comments
Hi @kinggongzilla, thanks for filing this issue! Currently we only support early stopping based on the number of steps taken in an epoch, i.e. you can cap how many steps each epoch runs for. However, this doesn't satisfy your use case of early stopping / saving a checkpoint only based on validation results. In-training evaluation plus stopping criteria based on evaluation is a large space we haven't looked deeply into; what do you folks think @ebsmothers @RdoubleA? I could see a future in which we allow users to specify a validation dataset or validation split, and incorporate validation metrics into our checkpointer to decide whether to save a checkpoint or not. This is definitely something we could look at enabling in the future if there's sufficient interest.
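In the meantime, the pattern being asked for is straightforward to hand-roll in plain PyTorch. Below is a minimal sketch of validation-gated checkpointing with patience-based early stopping; every name in it (`validate`, `patience`, `best.pt`, the loaders) is illustrative and not part of the torchtune API:

```python
# Minimal sketch: save a checkpoint only when validation loss improves,
# and stop training after `patience` epochs without improvement.
# Plain PyTorch; all names here are illustrative, not torchtune APIs.
import math

import torch


def validate(model, val_loader, loss_fn, device):
    """Return the mean loss over the validation set."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            total += loss_fn(model(inputs), targets).item()
            count += 1
    model.train()
    return total / max(count, 1)


def train(model, train_loader, val_loader, loss_fn, optimizer,
          device, max_epochs=10, patience=3, ckpt_path="best.pt"):
    best_val_loss = math.inf
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        # Standard training pass over one epoch.
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss_fn(model(inputs), targets).backward()
            optimizer.step()

        val_loss = validate(model, val_loader, loss_fn, device)
        if val_loss < best_val_loss:
            # Only persist a checkpoint when validation loss improves.
            best_val_loss = val_loss
            epochs_without_improvement = 0
            torch.save(model.state_dict(), ckpt_path)
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                print(f"Early stopping at epoch {epoch}")
                break
```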
Thanks for the quick reply!
+1 this would be super useful.
+1 Would be super useful!
Thanks all for the comments. This feature (along with general validation loops) is fairly high on our wishlist right now. We still need to do a bit of design to make sure it's not too intrusive into our recipes, but we definitely hear you on the need for this feature. We will keep you posted here!
Hi @ebsmothers, is there any update on this?