-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IPOP and BIPOP restart mechanisms #403
Conversation
} | ||
|
||
// Check if the total number of evaluations has exceeded the limit | ||
if (totalFunctionEvaluations >= 1e9) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, most likely. Plus there should be more parameters to control the termination criteria of the underlying CMA (from the other PR) which I will need to add. But since I still need to review the correctness of the termination criteria I don't want to add/delete stuff now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, makes sense.
Co-authored-by: Marcus Edel <[email protected]>
Don't forget to add documentation to |
Completely forgot about the documentation, added it now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding documentation, just a couple comments about design and style. I hope to get the automatic style checker job online sometime, but until then if you don't mind going through and making sure everything matches I would appreciate it (and that would make it easier to fix that CI job whenever it does come online)
* after each restart. | ||
* @param maxRestarts Maximum number of restarts. | ||
*/ | ||
IPOPCMAES(const CMAESType& CMAES = CMAESType(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This applies to BIPOPCMAES too.) For the other CMAES variants, like ActiveCMAES
, we reflect the constructor parameters of CMAES instead of asking users to pass a CMAES object. That's also the strategy done for SGD variants. Do you think we could do the same here? So this would be more like
IPOPCMAES<SelectionPolicyType, TransformationPolicyType>(lambda, transformationPolicy, batchSize, maxIterations, tolerance, selectionPolicy, stepSize, populationFactor, maxRestarts)
or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both IPOP and BIPOP are restart mechanisms for the underlying CMAES. They are originally defined as extensions for the original "vanilla" CMAES, however, the restarts could be done for any variant. My intent was to have a CMAES object so the user could do the restart mechanism on whatever CMAES they wanted ("vanilla" CMAES or ActiveCMAES
), which is why the user would input a CMAES object.
Nevertheless, I could just initialize whichever CMAES the user wants inside with the constructor you suggest. Or just ignore the different CMAES variants and just use the "vanilla" CMAES always.
What do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you described it there, does it make more sense to simply modify CMAES to add another template parameter that represents the restart mechanism? That could simplify the code significantly.
From a user perspective, regardless of what choice you make internally, I do think it's important to make the constructor "feel" the same as the other CMAES variants, whether that is done by having IPOP-CMAES and BIPOP-CMAES be typedefs of CMAES with certain template parameters, or by holding a CMAES object internally. So, e.g., a constructor taking each of the same arguments as the CMAES constructor (or however many of those arguments are applicable).
Hopefully that makes sense, let me know if I can clarify anything I wrote. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you described it there, does it make more sense to simply modify CMAES to add another template parameter that represents the restart mechanism? That could simplify the code significantly.
In this case, you mean adding something like template typename SelectionPolicyType = FullSelection, typename TransformationPolicyType = EmptyTransformation<>, typename RestartPolicyType = NoRestart>
instead of having what I did right now?
Then I would need to modify the CMAES_impl.hpp Optimize
to have something along the lines of restartPolicy.shouldRestart()
and restartPolicy.restart()
? Where the restart policy could either be BIPOP or IPOP or new policies.
If that's the case, I see what you mean, and it might indeed be cleaner.
But currently, I'm using the Optimize
from the CMAES, and when it terminates, that elicits a restart. I would have to change quite a bit of code so that instead of returning when there is a stop condition, it loops until the restarts are done.
I'm fine doing it either way, but I might need to think about how to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be handled by the callback feature, that said, I don't want to make the code too complex. Instead should we just create an independent IPOP implementation, that does not reuse CMAES? BIPOP could be a flag to the IPOP implementation. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BIPOP could be a flag to the IPOP implementation.
I can definitely reduce both into one, with a flag to choose.
As for the code, I'm also fine doing either way, I feel like creating an instance of CMAES inside the IPOP and then use Optimize
might be the best though. This way the user can control all the parameters in the same fashion, while also not duplicating code, and any improvement we do to the CMAES will be reflected in the IPOP (and BIPOP).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoq @rcurtin I refactored it into one code only (POP-CMA-ES) where with a flag it launches BIPOP or IPOP. I also tried to make it as similar to the other CMAES optimizers. However, now instead of passing it an CMAES object, I just initializes the CMAES class inside and use it.
Let me know if you prefer this version, and if you do, i'll write out the documentation for this and should be pretty much ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, thanks for taking the effort to do that. Two quick notes:
- Feel free to use underscores in the class names, like
POP_CMAES
; this can make the class name a little more decipherable. :) - If you want to keep the BIPOP and IPOP versions as separate optimizers, you could make the
useBIPOP
parameter into a template parameter and use ausing
statement to define theIPOPCMAES
andBIPOPCMAES
classes. Up to you whether you want to do that---it really hinges on whether you want to provide to the user a single classPOPCMAES
or two classes. (I guess two classes would be more in line with how ensmallen is structured, with one optimizer per class, but either way is fine with me.)
Thanks again for the effort; at least from my side I have no other major comments. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the comments make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done the suggestions @rcurtin!
Going to mark this no longer as draft
b63d8f9
to
a6fd0ff
Compare
doc/optimizers.md
Outdated
@@ -778,6 +780,72 @@ optimizer2.Optimize(f, coordinates); | |||
* [SGD in Wikipedia](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) | |||
* [SGD](#standard-sgd) | |||
|
|||
## BIPOP_CMAES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more documentation bit I realized; can you add BIPOP_CMAES and IPOP_CMAES to the list of optimizers for separable functions in function_types.md
? Thanks 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just did that, also added ActiveCMAES which wasn't added.
Also, had to increase a bit the margin of the tests as in 2 environments it was failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable.
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
Co-authored-by: Ryan Curtin <[email protected]>
@conradsnicta let's wait until 2.22 launches |
Thank you for the suggestions, @rcurtin, and for handling the suggestions, @conradsnicta. I apologize for the delay in my response; I've been quite busy recently. I'm fine with all the style changes, as well as the RNG one. Additionally, I tested them locally and everything works as before. I updated the HISTORY.md accordingly. Thanks again for your support! |
I pushed one more changes for the "CMA-ES" name changes to the documentation. Otherwise, I think this is ready for merge so I'll merge once there is a green build. |
Hi everyone!
I've taken a break from the termination criteria and have been implementing restart mechanisms for CMA-ES. The plan was to do IPOP, but adding BIPOP was quite simple so I suggest adding it too.
IPOP: is an extension of the standard CMA-ES that features an increasing population size strategy, where at each termination of the optimizer the population is increased by a constant factor. This is based on the initial work found in PR #373.
BIPOP: is pretty straightforward after setting up IPOP. BIPOP uses two interlaced strategies: one with an increase in population size with a constant factor (identical to IPOP) and another with a smaller, varying population size. I've seen some places that this strategy works better on certain functions.
Additionally, there are two things to consider: