Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nelder-Mead simplex algorithm with adaptive parameters #231

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

zoq
Copy link
Member

@zoq zoq commented Oct 18, 2020

Nelder-Mead simplex algorithm with adaptive parameters as proposed in "Implementing the Nelder-Mead simplex algorithmwith adaptive parameters" by F Gao and L. Han. The implementation outperforms the standard Nelder-Mead method (for high dimensional problems).

@zoq zoq mentioned this pull request Oct 18, 2020
Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, so glad to see more arbitrary optimizers implemented! Most of my comments are minor, but I think I found a bug and perhaps a couple efficiency improvements. 👍

@@ -1246,7 +1246,7 @@ can be paired with the `Lookahead` optimizer.
| `BaseOptimizerType` | **`baseOptimizer`** | Optimizer for the forward step. | Adam |
| `double` | **`stepSize`** | Step size for each iteration. | `0.5` |
| `size_t` | **`k`** | The synchronization period. | `5` |
| `size_t` | **`max_iterations`** | Maximum number of iterations allowed (0 means no limit). | `100000` |
| `size_t` | **`maxIterations`** | Maximum number of iterations allowed (0 means no limit). | `100000` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

| `double` | **`alpha`** | The reflection parameter. | `` |
| `double` | **`beta`** | The expansion parameter. | `` |
| `double` | **`gamma`** | The contraction parameter. | `` |
| `double` | **`delta`** | The shrink step parameter. | `` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention that the defaults here are dependent on the dimension of the problem, according to Equation 4.1 of the paper? https://www.webpages.uidaho.edu/~fuchang/res/ANMS.pdf

#### See also:

* [Nelder-Mead in Wikipedia](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method)
* [Arbitrary functions](#arbitrary-functions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also update the documentation for arbitrary functions to reference Nelder-Mead?

@@ -103,6 +103,7 @@
#include "ensmallen_bits/parallel_sgd/parallel_sgd.hpp"
#include "ensmallen_bits/pso/pso.hpp"
#include "ensmallen_bits/rmsprop/rmsprop.hpp"
#include "ensmallen_bits/nelder_mead/nelder_mead.hpp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to you, but the rest of these seem to be sorted alphabetically, so maybe we should do the same with this one. :)

* @param function Function to optimize.
*/
template<typename FunctionType, typename MatType>
void Simplex(MatType& simplex,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it does not make a difference for the API you are using, but you could mark this static if you wanted.

// Shift values by one to the right and swap the last with the first.
const size_t fLowest = order(dim);
for (size_t d = dim; d > 0; --d)
order(d) = order(d - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too you could use subvec() to copy all at once, or, if you use a max-heap this step may not be needed. (I believe std::priority_queue is a max-heap.)

else
{
// Inside contraction.
const MatType xic = (1.0 - gamma) * M + gamma *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have alpha multiplied too? I did the algebra in my head and I think it should, but I didn't double check and I have a headache right now... 😄

// Sorting can be slow, however there is overwhelming evidence that shrink
// transformations almost never happen in practice. See "Efficient
// Implementation of the Nelder–Mead Search Algorithm" by S. Singer and S.
// Singer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought for sure this had to be a typo, but no, there are actually two S. Singers, and they collaborated on this. 😄

for (size_t d = 1; d < dim; ++d)
{
simplex.col(order(d)) = simplex.col(order(0)) + delta *
(simplex.col(order(d)) - simplex.col(order(0)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be possible to express this as two operations (first on columns with index less than order(0), and second on columns with index greater than order(0)), using repmat(), but it's up to you if you want to rewrite it like that. I believe it would be more efficient, but I don't know how much of a bottleneck it would be. The Evaluate() would still need to be called over all new points in a loop though.

// Singer.
order = arma::sort_index(fx);
fx = fx.cols(order);
simplex = simplex.cols(order);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to reorder simplex at the end of the loop? (And fx too?) Inside the main loop of the optimization, those two members are always accessed with order(), so I think it is fine to leave them in the order they're in.

@mlpack-bot
Copy link

mlpack-bot bot commented Nov 26, 2020

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@mlpack-bot mlpack-bot bot added the s: stale label Nov 26, 2020
@mlpack-bot mlpack-bot bot closed this Dec 3, 2020
@zoq zoq reopened this Dec 3, 2020
@mlpack-bot mlpack-bot bot removed the s: stale label Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants