Validation and early stopping during training #883

Open
kinggongzilla opened this issue Apr 26, 2024 · 6 comments
Assignees
Labels
community help wanted (We would love the community's help completing this issue), enhancement (New feature or request), high-priority

Comments

@kinggongzilla

Is there a way to evaluate the model performance during training on a validation dataset and only save a new checkpoint if it achieves lower validation loss?

@rohan-varma
Member

Hi @kinggongzilla, thanks for filing this issue!

Currently we only support early stopping based on the number of steps taken in an epoch, i.e. you can set the max_steps_per_epoch flag in the configuration to stop each epoch early after a fixed number of steps.
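
To make the behavior concrete, here is a minimal sketch of what a per-epoch step cap amounts to. This is for illustration only and is not the project's actual recipe code: `train_one_epoch` and `train_step` are hypothetical names, and only `max_steps_per_epoch` comes from the configuration mentioned above.

```python
from typing import Callable, Iterable, Optional


def train_one_epoch(
    batches: Iterable,
    train_step: Callable[[object], None],
    max_steps_per_epoch: Optional[int] = None,
) -> int:
    """Run one epoch, stopping after max_steps_per_epoch batches if it is set.

    Hypothetical helper: the cap depends only on the step count,
    never on how well the model is doing.
    """
    steps = 0
    for batch in batches:
        # Stop the epoch once the configured number of steps is reached.
        if max_steps_per_epoch is not None and steps >= max_steps_per_epoch:
            break
        train_step(batch)
        steps += 1
    return steps
```

Note that nothing here looks at a loss value, which is why a step cap cannot stand in for validation-based stopping.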

However this doesn't satisfy your use case of only early stopping / saving a checkpoint based on some validation results.

In-training evaluation, plus stopping criteria based on that evaluation, is a large space we haven't looked deeply into. What do you folks think, @ebsmothers @RdoubleA? I could see a future in which we allow users to specify a validation dataset or validation split, and incorporate validation metrics into our checkpointer to decide whether to save a checkpoint. This is definitely something we could look at enabling in the future if there's sufficient interest.
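
As a rough sketch of the idea discussed above (save a checkpoint only when validation loss improves, and stop once it has not improved for a while), something like the following could work. It is not an existing feature or API of this project: `BestCheckpointTracker`, `patience`, `min_delta`, and `run` are all hypothetical names, and the validation losses are passed in directly instead of being computed from a model.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class BestCheckpointTracker:
    """Hypothetical helper: tracks the best validation loss seen so far."""

    patience: int = 3          # evaluations without improvement before stopping
    min_delta: float = 0.0     # minimum decrease that counts as an improvement
    best_loss: float = math.inf
    bad_evals: int = 0

    def update(self, val_loss: float) -> bool:
        """Record a new validation loss; return True if it is a new best."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_evals = 0
            return True
        self.bad_evals += 1
        return False

    @property
    def should_stop(self) -> bool:
        return self.bad_evals >= self.patience


def run(val_losses: Sequence[float], patience: int = 2) -> Tuple[List[int], int]:
    """Simulate training; return (epochs where a checkpoint was saved, last epoch run)."""
    tracker = BestCheckpointTracker(patience=patience)
    saved_epochs: List[int] = []
    last_epoch = -1
    for epoch, val_loss in enumerate(val_losses):
        last_epoch = epoch
        # In a real recipe: train for one epoch, then evaluate on held-out data.
        if tracker.update(val_loss):
            saved_epochs.append(epoch)  # stand-in for writing a checkpoint
        if tracker.should_stop:
            break
    return saved_epochs, last_epoch
```

For example, `run([1.0, 0.8, 0.9, 0.85, 0.7], patience=2)` saves at epochs 0 and 1 and stops after epoch 3, so the later improvement to 0.7 is never reached. That trade-off is the usual cost of a small patience value.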

@kinggongzilla
Author

Thanks for the quick reply!
Being able to define a validation dataset and do early stopping based on the validation loss would definitely be super helpful.

@optimass

optimass commented May 6, 2024

+1 this would be super useful.

@Some-random

+1 Would be super useful!

@ebsmothers
Contributor

Thanks all for the comments. This feature (along with general validation loops) is fairly high on our wishlist right now. We still need to do a bit of design to make sure it's not too intrusive into our recipes, but we definitely hear you on the need for this feature. We will keep you posted here!

felipemello1 added the high-priority and enhancement (New feature or request) labels Jun 28, 2024
RdoubleA added the community help wanted (We would love the community's help completing this issue) label Aug 21, 2024
RdoubleA changed the title from "Early stopping during training" to "Validation and early stopping during training" Aug 21, 2024
@Tandon-A

Tandon-A commented Nov 3, 2024

Hi @ebsmothers,

Is there any update on this?

Projects
None yet
Development

No branches or pull requests

8 participants