Validation and early stopping during training #883

kinggongzilla · 2024-04-26T20:33:28Z

Is there a way to evaluate the model performance during training on a validation dataset and only save a new checkpoint if it achieves lower validation loss?

rohan-varma · 2024-04-26T20:56:16Z

Hi @kinggongzilla, thanks for filing this issue!

Currently we only support early stopping based on the number of steps taken in an epoch, i.e. you can set the max_steps_per_epoch flag in the configuration to stop each epoch after a fixed number of steps.

However, this doesn't satisfy your use case of early stopping / saving a checkpoint based on validation results.

In-training evaluation, plus stopping criteria based on that evaluation, is a large space we haven't looked deeply into. What do you folks think, @ebsmothers @RdoubleA? I could see a future in which we allow users to specify a validation dataset or validation split, and incorporate validation metrics into our checkpointer to decide whether to save a checkpoint. This is definitely something we could look at enabling in the future if there's sufficient interest.
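To make the idea concrete, here is a minimal sketch in plain Python of a "save only on lower validation loss, stop after no improvement" policy. It is not an existing feature or API of the library: `BestCheckpointTracker`, `train_with_validation`, the `patience` and `min_delta` parameters, and the three callbacks are all hypothetical names chosen for illustration, and a real recipe would plug in its own training, evaluation, and checkpointing code.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class BestCheckpointTracker:
    """Hypothetical helper: tracks the best validation loss seen so far."""

    patience: int = 3            # evaluations without improvement before stopping
    min_delta: float = 0.0       # minimum decrease that counts as an improvement
    best_loss: float = math.inf  # lowest validation loss observed so far
    bad_evals: int = 0           # consecutive evaluations without improvement

    def update(self, val_loss: float) -> bool:
        """Record a validation loss; return True if it is a new best."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_evals = 0
            return True
        self.bad_evals += 1
        return False

    @property
    def should_stop(self) -> bool:
        return self.bad_evals >= self.patience


def train_with_validation(
    num_epochs: int,
    train_one_epoch: Callable[[int], None],
    evaluate: Callable[[int], float],
    save_checkpoint: Callable[[int, float], None],
    tracker: BestCheckpointTracker,
) -> int:
    """Train for up to num_epochs; return the number of epochs actually run.

    The callbacks are placeholders (assumptions) for recipe-specific code.
    """
    for epoch in range(num_epochs):
        train_one_epoch(epoch)
        val_loss = evaluate(epoch)
        # Save only when the validation loss improved on the best so far.
        if tracker.update(val_loss):
            save_checkpoint(epoch, val_loss)
        # Stop once `patience` evaluations in a row failed to improve.
        if tracker.should_stop:
            return epoch + 1
    return num_epochs


if __name__ == "__main__":
    # Simulated validation losses stand in for a real evaluation loop.
    losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]
    saved = []
    epochs_run = train_with_validation(
        num_epochs=len(losses),
        train_one_epoch=lambda epoch: None,
        evaluate=lambda epoch: losses[epoch],
        save_checkpoint=lambda epoch, loss: saved.append((epoch, loss)),
        tracker=BestCheckpointTracker(patience=3),
    )
    print(epochs_run, saved)
```

With the simulated losses above, checkpoints are saved at epochs 0 and 1 only, and training stops after 5 epochs because three evaluations in a row failed to beat 0.8. Keeping the policy in a small tracker object, separate from the loop, is one way to limit how much a recipe itself has to change.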

kinggongzilla · 2024-04-26T21:17:00Z

Thanks for the quick reply!
Being able to define a validation dataset and do early stopping based on the validation loss would definitely be super helpful.

optimass · 2024-05-06T17:13:30Z

+1 this would be super useful.

Some-random · 2024-05-07T05:58:40Z

+1 Would be super useful!

ebsmothers · 2024-05-07T16:23:34Z

Thanks all for the comments. This feature (along with general validation loops) is fairly high on our wishlist right now. We still need to do a bit of design work to make sure it's not too intrusive in our recipes, but we definitely hear you on the need for this feature. We will keep you posted here!

Tandon-A · 2024-11-03T01:42:31Z

Hi @ebsmothers,

Is there any update on this?

felipemello1 assigned ebsmothers Jun 28, 2024

felipemello1 added the high-priority and enhancement labels Jun 28, 2024

RdoubleA mentioned this issue Aug 21, 2024

Can I just get the loss on validation and test set? #1066

Closed

RdoubleA added the community help wanted label Aug 21, 2024

RdoubleA changed the title ~~Early stopping during training~~ Validation and early stopping during training Aug 21, 2024
