Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nelder-Mead simplex algorithm with adaptive parameters #231

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,9 @@
* Make a few tests more robust
([#228](https://github.com/mlpack/ensmallen/pull/228)).

* Add Nelder-Mead optimizer
([#231](https://github.com/mlpack/ensmallen/pull/231)).

### ensmallen 2.14.2: "No Direction Home"
###### 2020-08-31
* Fix implementation of fonesca fleming problem function f1 and f2
Expand Down
57 changes: 56 additions & 1 deletion doc/optimizers.md
Original file line number Diff line number Diff line change
Expand Up @@ -1246,7 +1246,7 @@ can be paired with the `Lookahead` optimizer.
| `BaseOptimizerType` | **`baseOptimizer`** | Optimizer for the forward step. | Adam |
| `double` | **`stepSize`** | Step size for each iteration. | `0.5` |
| `size_t` | **`k`** | The synchronization period. | `5` |
| `size_t` | **`max_iterations`** | Maximum number of iterations allowed (0 means no limit). | `100000` |
| `size_t` | **`maxIterations`** | Maximum number of iterations allowed (0 means no limit). | `100000` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

| `double` | **`tolerance`** | Maximum absolute tolerance to terminate algorithm. | `1e-5` |
| `DecayPolicyType` | **`decayPolicy`** | Instantiated decay policy used to adjust the step size. | `DecayPolicyType()` |
| `bool` | **`exactObjective`** | Calculate the exact objective (Default: estimate the final objective obtained on the last pass over the data). | `false` |
Expand Down Expand Up @@ -1278,6 +1278,61 @@ optimizer.Optimize(f, coordinates);
* [Lookahead Optimizer: k steps forward, 1 step back](https://arxiv.org/abs/1907.08610)
* [Differentiable separable functions](#differentiable-separable-functions)

## Nelder-Mead

*An optimizer for [arbitrary functions](#arbitrary-functions).*

The Nelder-Mead (NM) method (also called downhill simplex method is a heuristic
(search method for minimizing an objective function given in an N-dimensional
space. The key concept of the method is the simplex, an N dimensional polytope
that is a convex hull of a set of N + 1 linearly independent points. The method
is an iterative process which transforms an initial simplex iteratively into a
different one following a path in the space along which the function values are
progressively reduced.

#### Constructors

* `NelderMead()`
* `NelderMead(`_`maxIterations, tolerance, simplexer`_`)`
* `NelderMead(`_`alpha, beta, gamma, delta, maxIterations, tolerance, simplexer`_`)`

#### Attributes

| **type** | **name** | **description** | **default** |
|----------|----------|-----------------|-------------|
| `double` | **`alpha`** | The reflection parameter. | `` |
| `double` | **`beta`** | The expansion parameter. | `` |
| `double` | **`gamma`** | The contraction parameter. | `` |
| `double` | **`delta`** | The shrink step parameter. | `` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention that the defaults here are dependent on the dimension of the problem, according to Equation 4.1 of the paper? https://www.webpages.uidaho.edu/~fuchang/res/ANMS.pdf

| `size_t` | **`maxIterations`** | Maximum number of iterations allowed (0 means no limit). | `100000` |
| `double` | **`tolerance`** | The final value of the objective function for termination. If set to negative value, tolerance is not considered. | `1e-15` |
| `SimplexerType` | **`simplexer`** | The simplex policy used to construct the initial simplex. | `AffineSimplexer` |

Attributes of the optimizer may also be changed via the member methods
`Alpha()`, `Beta()`, `Gamma()`, `Delta()`, `MaxIterations()`,
`Tolerance()` and `Simplexer()`.

#### Examples:

<details open>
<summary>Click to collapse/expand example code.
</summary>

```c++
RosenbrockFunction f;
arma::mat coordinates = f.GetInitialPoint();

NelderMead optimizer;
optimizer.Optimize(f, coordinates);
```

</details>

#### See also:

* [Nelder-Mead in Wikipedia](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method)
* [Arbitrary functions](#arbitrary-functions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also update the documentation for arbitrary functions to reference Nelder-Mead?


## LRSDP (low-rank SDP solver)

*An optimizer for [semidefinite programs](#semidefinite-programs).*
Expand Down
1 change: 1 addition & 0 deletions include/ensmallen.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@
#include "ensmallen_bits/parallel_sgd/parallel_sgd.hpp"
#include "ensmallen_bits/pso/pso.hpp"
#include "ensmallen_bits/rmsprop/rmsprop.hpp"
#include "ensmallen_bits/nelder_mead/nelder_mead.hpp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to you, but the rest of these seem to be sorted alphabetically, so maybe we should do the same with this one. :)


#include "ensmallen_bits/sa/sa.hpp"
#include "ensmallen_bits/sarah/sarah.hpp"
Expand Down
60 changes: 60 additions & 0 deletions include/ensmallen_bits/nelder_mead/affine_simplexer.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
/**
* @file affine_simplexer.hpp
* @author Marcus Edel
*
* Definition of a simple affine simplex to ensure that the representation of
* points in the affine hull is unique.
*
* ensmallen is free software; you may redistribute it and/or modify it under
* the terms of the 3-clause BSD license. You should have received a copy of
* the 3-clause BSD license along with ensmallen. If not, see
* http://www.opensource.org/licenses/BSD-3-Clause for more information.
*/
#ifndef ENSMALLEN_NELDER_MEAD_AFFINE_SIMPLEXER_HPP
#define ENSMALLEN_NELDER_MEAD_AFFINE_SIMPLEXER_HPP

namespace ens {

/*
* Definition of the affine simplexer.
*/
class AffineSimplexer
{
public:
/**
* Affine simplexer, a simplex is represented by an (n + 1)-dimensional vector
* of n-dimensional vectors. It is used together with the initial vector to
* create the initial simplex. To construct the ith vertex, the simplexer
* multiplies entry i in the initial vector with a constant.
*
* @param simplex Constructed simplex.
* @param iterate Intial starting point.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param iterate Intial starting point.
* @param iterate Initial starting point.

* @param function Function to optimize.
*/
template<typename FunctionType, typename MatType>
void Simplex(MatType& simplex,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it does not make a difference for the API you are using, but you could mark this static if you wanted.

const MatType& iterate,
FunctionType& /* function */)
{
// Convenience typedefs.
typedef typename MatType::elem_type ElemType;

// Construct the initial simplex.
// Large initial simplex is used.
const ElemType scalingFactor = std::min(std::max(
arma::as_scalar(arma::max(
arma::abs(iterate))), (ElemType) 1.0), (ElemType) 10.0);

simplex = arma::eye<MatType>(iterate.n_rows, iterate.n_rows + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
simplex = arma::eye<MatType>(iterate.n_rows, iterate.n_rows + 1);
simplex.eye(iterate.n_rows, iterate.n_rows + 1);

I think this simplification will work. (It shouldn't make any difference in what's actually done.)

simplex.col(iterate.n_rows) = (1.0 -
std::sqrt((ElemType) (iterate.n_rows + 1))) /
(ElemType) iterate.n_rows * arma::ones<MatType>(iterate.n_rows, 1);

simplex *= scalingFactor;
simplex.each_col() += iterate;
}
};

} // namespace ens

#endif
234 changes: 234 additions & 0 deletions include/ensmallen_bits/nelder_mead/nelder_mead.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,234 @@
/**
* @file nelder_mead.hpp
* @author Marcus Edel
*
* Definition of an Nelder-Mead with adaptive parameters proposed by F. Gao and
* L. Han in "Implementing the Nelder-Mead simplex algorithm with adaptive
* parameters".
*
* ensmallen is free software; you may redistribute it and/or modify it under
* the terms of the 3-clause BSD license. You should have received a copy of
* the 3-clause BSD license along with ensmallen. If not, see
* http://www.opensource.org/licenses/BSD-3-Clause for more information.
*/
#ifndef ENSMALLEN_NELDER_MEAD_HPP
#define ENSMALLEN_NELDER_MEAD_HPP

#include "affine_simplexer.hpp"

namespace ens {

/**
* Nelder-Mead is a direct search method. It keeps track of the function value
* at a number of points in the search space and iteratively generates a
* sequence of simplices to approximate an optimal point.
*
* For more information, please refer to:
*
* @code
* @article{Nelder1965ASM,
* author = {J. Nelder and R. Mead},
* title = {A Simplex Method for Function Minimization},
* year = {1965},
* journal = {Comput. J.},
* volume = {7},
* pages = {308-313}
* }
* @endcode
*
* @code
* @article{Fuchang2012,
* author = {Gao, Fuchang and Han, Lixing},
* title = {Implementing the Nelder-Mead Simplex Algorithm with Adaptive
* Parameters},
* year = {2012},
* publisher = {Kluwer Academic Publishers},
* volume = {51},
* number = {1},
* journal = {Comput. Optim. Appl.},
* month = jan,
* pages = {259–277}
* }
* @endcode
*
* NelderMead can optimize arbitrary functions. For more details, see the
* documentation on function types included with this distribution or on the
* ensmallen website.
*
* @tparam SimplexerType Simplex policy used by Nelder Mead to construct the
* initial simplex.
*/
template<typename SimplexerType = AffineSimplexer>
class NelderMeadType
{
public:
/**
* Construct the NelderMead optimizer with the default adaptive parameters
* scheme as used in "Implementing the Nelder-Mead simplex algorithm with
* adaptive parameters" by F. Gao and L. Han (2010). These are based on the
* dimensionality of the problem, and are given by
*
* alpha = 1
* beta = 1 + 2 / n
* gamma = 0.75 - 0.5 / n
* delta = 1 - 1 / n
*
* where n is the dimensionality of the problem.
*
* The defaults here are not necessarily good for the given problem, so it is
* suggested that the values used be tailored to the task at hand. The
* maximum number of iterations refers to the maximum number of points that
* are processed (i.e., one iteration equals one point; one iteration does not
* equal one pass over the dataset).
*
* @param maxIterations Maximum number of iterations allowed (0 means no
* limit).
* @param tolerance Maximum absolute tolerance to terminate algorithm.
* @param simplexer The simplex policy used to construct the initial simplex.
*/
NelderMeadType(const size_t maxIterations = 100000,
const double tolerance = 1e-15,
const SimplexerType& simplexer = SimplexerType());
/**
* Construct the NelderMead optimizer with the given parameters. The defaults
* here are not necessarily good for the given problem, so it is suggested
* that the values used be tailored to the task at hand. The maximum number
* of iterations refers to the maximum number of points that are processed
* (i.e., one iteration equals one point; one iteration does not equal one
* pass over the dataset).
Comment on lines +96 to +98
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this comment apply here to an arbitrary function?

*
* @param alpha The reflection parameter.
* @param beta The expansion parameter.
* @param gamma The contraction parameter.
* @param delta The shrink step parameter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be pointing out that the ANMS paper uses different notation here than the Wikipedia description of Nelder-Mead; on Wikipedia, gamma is the expansion parameter, and in the ANMS paper, beta is... I don't think we need to change the implementation, but we should probably point that out. 👍

* @param maxIterations Maximum number of iterations allowed (0 means no
* limit).
* @param tolerance Maximum absolute tolerance to terminate algorithm.
* @param simplexer The simplex policy used to construct the initial simplex.
*/
NelderMeadType(const double alpha,
const double beta,
const double gamma,
const double delta,
const size_t maxIterations = 100000,
const double tolerance = 1e-15,
const SimplexerType& simplexer = SimplexerType());

/**
* Optimize the given function using NelderMead. The given starting point will
* be modified to store the finishing point of the algorithm, and the final
* objective value is returned.
*
* @tparam ArbitraryFunctionType Type of the function to be optimized.
* @tparam MatType Type of matrix to optimize.
* @tparam CallbackTypes Types of callback functions.
* @param function Function to optimize.
* @param iterate Starting point (will be modified).
* @param callbacks Callback functions.
* @return Objective value of the final point.
*/
template<typename ArbitraryFunctionType,
typename MatType,
typename... CallbackTypes>
typename MatType::elem_type Optimize(ArbitraryFunctionType& function,
MatType& iterate,
CallbackTypes&&... callbacks);

//! Get the reflection parameter.
double Alpha() const { return alpha; }
//! Modify the reflection parameter.
double& Alpha() { return alpha; }

//! Get the expansion parameter.
double Beta() const { return beta; }
//! Modify the expansion parameter.
double& Beta() { return beta; }

//! Get the contraction parameter.
double Gamma() const { return gamma; }
//! Modify the contraction parameter.
double& Gamma() { return gamma; }

//! Get the shrink step parameter.
double Delta() const { return delta; }
//! Modify the shrink step parameter.
double& Delta() { return delta; }

//! Get the maximum number of iterations (0 indicates no limit).
size_t MaxIterations() const { return maxIterations; }
//! Modify the maximum number of iterations (0 indicates no limit).
size_t& MaxIterations() { return maxIterations; }

//! Get the tolerance for termination.
double Tolerance() const { return tolerance; }
//! Modify the tolerance for termination.
double& Tolerance() { return tolerance; }

//! Get the simplexer policy.
const SimplexerType& Simplexer() const { return simplexer; }
//! Modify the simplexer policy.
SimplexerType& Simplexer() { return simplexer; }

private:
/**
* Helper function to shift the given order based on the given function
* objective.
*
* @param order The current order of the function objectives.
* @param fx The current function values.
* @param value Function value to be inserted into the current order.
* @return Position of the given objective in the given order.
*/
size_t ShiftOrder(arma::uvec& order,
const arma::rowvec& fx,
const double value)
{
// Figure out at which index we have to put the new value.
size_t i = 0;
for (; i < fx.n_elem; ++i)
if (fx(order(i)) > value) break;

// Shift every value after 'i' by one to the right.
const size_t tmp = order(fx.n_elem - 1);
for (size_t d = fx.n_elem - 1; d > i; --d)
order(d) = order(d - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do order.subvec(i, fx.n_elem - 1) = order.subvec(i - 1, fx.n_elem - 2) and this will perform a memcpy() at the lowest level. I'm sure this isn't a bottleneck in the code but it could be slightly more efficient.

order(i) = tmp;

return i;
}

//! The reflection parameter.
double alpha;

//! The expansion parameter.
double beta;

//! The contraction parameter.
double gamma;

//! The shrink step parameter.
double delta;

//! The maximum number of allowed iterations.
size_t maxIterations;

//! The tolerance for termination.
double tolerance;

//! Locally stored simplexer.
SimplexerType simplexer;

//! Whether to use the default adaptive parameters from the "Implementing the
//! Nelder-Mead simplex algorithmwith adaptive parameters" paper.
bool adaptiveDefault;
};

// Convenience typedef.
using NelderMead = NelderMeadType<AffineSimplexer>;

} // namespace ens

// Include implementation.
#include "nelder_mead_impl.hpp"

#endif
Loading