Merge pull request #692 from automl/development
Merge into master
mfeurer authored Sep 25, 2020
2 parents 9890e0c + 6e68d5f commit 9d7d09d
Showing 72 changed files with 6,334 additions and 1,521 deletions.
2 changes: 0 additions & 2 deletions .travis.yml
@@ -12,8 +12,6 @@ matrix:

include:
# Unit tests
-    - os: linux
-      env: TESTSUITE=run_unittests.sh PYTHON_VERSION="3.5" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"
- os: linux
env: TESTSUITE=run_unittests.sh PYTHON_VERSION="3.6" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"
- os: linux
12 changes: 6 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

Copyright (C) 2016-2018 [AutoML Group](http://www.automl.org/)

-__Attention__: This package is a re-implementation of the original SMAC tool
+__Attention__: This package is a reimplementation of the original SMAC tool
(see reference below).
However, the reimplementation slightly differs from the original SMAC.
For comparisons against the original SMAC, we refer to a stable release of SMAC (v2) in Java
@@ -16,7 +16,7 @@ Status for master branch:
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/58f47a4bd25e45c9a4901ebca68118ff?branch=master)](https://www.codacy.com/app/automl/SMAC3?utm_source=github.com&utm_medium=referral&utm_content=automl/SMAC3&utm_campaign=Badge_Grade)
[![codecov Status](https://codecov.io/gh/automl/SMAC3/branch/master/graph/badge.svg)](https://codecov.io/gh/automl/SMAC3)

-Status for development branch
+Status for the development branch

[![Build Status](https://travis-ci.org/automl/SMAC3.svg?branch=development)](https://travis-ci.org/automl/SMAC3)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/58f47a4bd25e45c9a4901ebca68118ff?branch=development)](https://www.codacy.com/app/automl/SMAC3?utm_source=github.com&utm_medium=referral&utm_content=automl/SMAC3&utm_campaign=Badge_Grade)
@@ -27,8 +27,8 @@ Status for development branch
SMAC is a tool for algorithm configuration to optimize the parameters of
arbitrary algorithms across a set of instances. This also includes
hyperparameter optimization of ML algorithms. The main core consists of
-Bayesian Optimization in combination with a aggressive racing mechanism to
-efficiently decide which of two configuration performs better.
+Bayesian Optimization in combination with an aggressive racing mechanism to
+efficiently decide which of two configurations performs better.

For a detailed description of its main idea,
we refer to
@@ -38,7 +38,7 @@ we refer to
In: Proceedings of the conference on Learning and Intelligent OptimizatioN (LION 5)


-SMAC v3 is written in Python3 and continuously tested with python3.5 and
+SMAC v3 is written in Python3 and continuously tested with Python 3.6 and
python3.6. Its [Random Forest](https://github.com/automl/random_forest_run)
is written in C++.

@@ -97,7 +97,7 @@ pip install smac[gp]
pip install .[gp,lhd]
```

-For convenience there is also an `all` meta-dependency that installs all optional dependencies:
+For convenience, there is also an `all` meta-dependency that installs all optional dependencies:
```
pip install smac[all]
```
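
For orientation, here is a minimal quick-start sketch of the `fmin_smac` facade used by the new examples further down in this diff. It is illustrative only (not part of the commit), and the quadratic objective is a made-up placeholder.

```
from smac.facade.func_facade import fmin_smac


def quadratic(x):
    # SMAC passes the candidate parameter vector as a sequence of floats
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2


# Minimize within the given bounds using at most 20 function evaluations.
# fmin_smac returns the best parameter vector, its cost, and the SMAC object.
x, cost, smac = fmin_smac(func=quadratic,
                          x0=[0.0, 0.0],
                          bounds=[(-5, 5), (-5, 5)],
                          maxfun=20,
                          rng=1)
print("Best x: %s; with cost: %f" % (str(x), cost))
```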
18 changes: 18 additions & 0 deletions changelog.md
@@ -1,3 +1,21 @@
# 0.13.0

## Major Changes
* Split choosing next challenger from evaluating challenger (#663)
* Implemented parallel SMAC using dask (#675, #677, #681, #685, #686)
* Drop support for Python 3.5

## Minor Changes
* Update Readme
* Remove runhistory from TAE (#663)
* Store SMAC's internal config id in the configuration object (#679)
* Introduce Status Type STOP (#690)

## Bug Fixes
* Only validate restriction of Sobol Sequence when choosing Sobol Sequence (#664)
* Fix wrong initialization of list in local search (#680)
* Fix setting random seed with a too small range in Latin Hypercube design (#688)

# 0.12.3

## Minor Changes
1 change: 1 addition & 0 deletions doc/conf.py
@@ -323,4 +323,5 @@
# compile execute examples in the examples dir
'filename_pattern': '.*example.py$|.*tutorial.py$',
# TODO: fix back/forward references for the examples.
'ignore_pattern': '.*_func.py'
}
2 changes: 1 addition & 1 deletion doc/index.rst
@@ -55,7 +55,7 @@ Contents:
| howpublished={\\url{https://github.com/automl/SMAC3}}
| }
-SMAC3 is mainly written in Python 3 and continuously tested with Python 3.5-3.6.
+SMAC3 is mainly written in Python 3 and continuously tested with Python 3.6-3.8.
Its `Random Forest <https://github.com/automl/random_forest_run>`_ is written in
C++11.

45 changes: 45 additions & 0 deletions examples/fmin_rosenbrock_parallel.py
@@ -0,0 +1,45 @@
"""
============================================
Parallel Intensifier with No Intensification
============================================
This example showcases how to use dask to
launch parallel configurations via the n_jobs argument
"""

import logging

from smac.intensification.simple_intensifier import SimpleIntensifier
from smac.facade.func_facade import fmin_smac

# --------------------------------------------------------------
# We need to provide a picklable function and use __main__
# to be compliant with the multiprocessing API.
# Below is a workaround to have a packaged function called
# rosenbrock_2d
# --------------------------------------------------------------
import os
import sys
sys.path.append(os.path.join(os.path.dirname(__file__)))
from rosenbrock_2d_delayed_func import rosenbrock_2d # noqa: E402
# --------------------------------------------------------------

if __name__ == '__main__':

# debug output
logging.basicConfig(level=20)
logger = logging.getLogger("Optimizer") # Enable to show Debug outputs

    # fmin_smac assumes that the function is deterministic
    # and uses the SMAC4HPO facade under the hood.
    # n_jobs tells the SMBO loop how many configurations to evaluate in parallel
x, cost, smac = fmin_smac(
func=rosenbrock_2d,
intensifier=SimpleIntensifier,
x0=[-3, -4],
bounds=[(-5, 10), (-5, 10)],
maxfun=25,
rng=3,
n_jobs=4,
    )  # Passing a seed makes fmin_smac deterministic
print("Best x: %s; with cost: %f" % (str(x), cost))
59 changes: 8 additions & 51 deletions examples/hyperband_mlp.py
@@ -11,64 +11,21 @@
"""

import logging
import warnings

import numpy as np
from ConfigSpace.hyperparameters import CategoricalHyperparameter, \
UniformFloatHyperparameter, UniformIntegerHyperparameter
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.neural_network import MLPClassifier

import numpy as np

from smac.configspace import ConfigurationSpace
from smac.facade.hyperband_facade import HB4AC
from smac.scenario.scenario import Scenario

digits = load_digits()


# Target Algorithm
# The signature of the function determines what arguments are passed to it
# i.e., budget is passed to the target algorithm if it is present in the signature
def mlp_from_cfg(cfg, seed, instance, budget, **kwargs):
"""
Creates a MLP classifier from sklearn and fits the given data on it.
This is the function-call we try to optimize. Chosen values are stored in
the configuration (cfg).
Parameters
----------
cfg: Configuration
configuration chosen by smac
seed: int or RandomState
used to initialize the rf's random generator
instance: str
used to represent the instance to use (just a placeholder for this example)
budget: float
used to set max iterations for the MLP
Returns
-------
float
"""

with warnings.catch_warnings():
warnings.filterwarnings('ignore', category=ConvergenceWarning)

mlp = MLPClassifier(
hidden_layer_sizes=[cfg["n_neurons"]] * cfg["n_layer"],
batch_size=cfg['batch_size'],
activation=cfg['activation'],
learning_rate_init=cfg['learning_rate_init'],
max_iter=int(np.ceil(budget)),
random_state=seed)

# returns the cross validation accuracy
cv = StratifiedKFold(n_splits=5, random_state=seed, shuffle=True) # to make CV splits consistent
score = cross_val_score(mlp, digits.data, digits.target, cv=cv, error_score='raise')

return 1 - np.mean(score) # Because minimize!
# --------------------------------------------------------------
import os
import sys
sys.path.append(os.path.join(os.path.dirname(__file__)))
from mlp_from_cfg_func import mlp_from_cfg # noqa: E402
# --------------------------------------------------------------


logger = logging.getLogger("MLP-example")
56 changes: 56 additions & 0 deletions examples/mlp_from_cfg_func.py
@@ -0,0 +1,56 @@
import warnings

import numpy as np

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.neural_network import MLPClassifier


# A common target function to be optimized by a real-valued intensifier
digits = load_digits()


# Target Algorithm
# The signature of the function determines what arguments are passed to it
# i.e., budget is passed to the target algorithm if it is present in the signature
def mlp_from_cfg(cfg, seed, instance, budget, **kwargs):
"""
    Creates an MLP classifier from sklearn and fits the given data on it.
    This is the function call we try to optimize. Chosen values are stored in
    the configuration (cfg).

    Parameters
    ----------
    cfg: Configuration
        configuration chosen by smac
    seed: int or RandomState
        used to initialize the MLP's random number generator
    instance: str
        used to represent the instance to use (just a placeholder for this example)
    budget: float
        used to set the maximum number of iterations for the MLP

    Returns
    -------
    float
        1 - mean cross-validation accuracy (SMAC minimizes this value)
"""

with warnings.catch_warnings():
warnings.filterwarnings('ignore', category=ConvergenceWarning)

mlp = MLPClassifier(
hidden_layer_sizes=[cfg["n_neurons"]] * cfg["n_layer"],
batch_size=cfg['batch_size'],
activation=cfg['activation'],
learning_rate_init=cfg['learning_rate_init'],
max_iter=int(np.ceil(budget)),
random_state=seed)

# returns the cross validation accuracy
# to make CV splits consistent
cv = StratifiedKFold(n_splits=5, random_state=seed, shuffle=True)
score = cross_val_score(mlp, digits.data, digits.target, cv=cv, error_score='raise')

return 1 - np.mean(score) # Because minimize!
99 changes: 99 additions & 0 deletions examples/parallel_sh_mlp.py
@@ -0,0 +1,99 @@
"""
=================================================
Optimizing an MLP with Parallel SuccessiveHalving
=================================================
An example of using the model-free SuccessiveHalving intensifier in SMAC
for parallel execution. The configurations are randomly sampled.
This example uses a real-valued SuccessiveHalving budget measured in epochs.
Four workers are allocated for this run. As soon as any worker is idle,
SMAC internally creates more SuccessiveHalving instances to take
advantage of the idle resources.
"""

import logging

import numpy as np
from ConfigSpace.hyperparameters import CategoricalHyperparameter, \
UniformFloatHyperparameter, UniformIntegerHyperparameter

from smac.configspace import ConfigurationSpace
from smac.facade.roar_facade import ROAR
from smac.scenario.scenario import Scenario
from smac.intensification.successive_halving import SuccessiveHalving
from smac.initial_design.random_configuration_design import RandomConfigurations

# --------------------------------------------------------------
# We need to provide a picklable function and use __main__
# to be compliant with the multiprocessing API.
# Below is a workaround to have a packaged function called
# mlp_from_cfg_func
# --------------------------------------------------------------
import os
import sys
sys.path.append(os.path.join(os.path.dirname(__file__)))
from mlp_from_cfg_func import mlp_from_cfg # noqa: E402
# --------------------------------------------------------------

if __name__ == '__main__':

logger = logging.getLogger("MLP-example")
logging.basicConfig(level=logging.INFO)

# Build Configuration Space which defines all parameters and their ranges.
# To illustrate different parameter types,
# we use continuous, integer and categorical parameters.
cs = ConfigurationSpace()

# We can add multiple hyperparameters at once:
n_layer = UniformIntegerHyperparameter("n_layer", 1, 4, default_value=1)
n_neurons = UniformIntegerHyperparameter("n_neurons", 8, 512, log=True, default_value=10)
activation = CategoricalHyperparameter("activation", ['logistic', 'tanh', 'relu'],
default_value='tanh')
batch_size = UniformIntegerHyperparameter('batch_size', 30, 300, default_value=200)
learning_rate_init = UniformFloatHyperparameter('learning_rate_init', 0.0001, 1.0, default_value=0.001, log=True)
cs.add_hyperparameters([n_layer, n_neurons, activation, batch_size, learning_rate_init])

# SMAC scenario object
scenario = Scenario({"run_obj": "quality", # we optimize quality (alternative to runtime)
"wallclock-limit": 100, # max duration to run the optimization (in seconds)
"cs": cs, # configuration space
"deterministic": "true",
"limit_resources": True, # Uses pynisher to limit memory and runtime
# Alternatively, you can also disable this.
# Then you should handle runtime and memory yourself in the TA
"cutoff": 20, # runtime limit for target algorithm
"memory_limit": 3072, # adapt this to reasonable value for your hardware
})

# Intensification parameters
# Intensifier will allocate from 5 to a maximum of 25 epochs to each configuration
# Successive Halving child-instances are created to prevent idle
# workers.
intensifier_kwargs = {'initial_budget': 5, 'max_budget': 25, 'eta': 3,
'min_chall': 1, 'instance_order': 'shuffle_once'}

# To optimize, we pass the function to the SMAC-object
smac = ROAR(scenario=scenario, rng=np.random.RandomState(42),
tae_runner=mlp_from_cfg,
intensifier=SuccessiveHalving,
intensifier_kwargs=intensifier_kwargs,
initial_design=RandomConfigurations,
n_jobs=4)

# Example call of the function with default values
# It returns: Status, Cost, Runtime, Additional Infos
def_value = smac.get_tae_runner().run(config=cs.get_default_configuration(),
instance='1', budget=25, seed=0)[1]
print("Value for default configuration: %.4f" % def_value)

# Start optimization
try:
incumbent = smac.optimize()
finally:
incumbent = smac.solver.incumbent

inc_value = smac.get_tae_runner().run(config=incumbent, instance='1',
budget=25, seed=0)[1]
print("Optimized Value: %.4f" % inc_value)
