Feature/hpo #74

ffelten · 2023-10-26T11:37:11Z

Recreating from #57

Solves #13

Feature Description

This feature introduces a new script to perform a sweep of multi-objective reinforcement learning (MORL) algorithms and environments. The script runs a series of experiments, collects performance metrics, and logs the results to Weights & Biases (W&B).

Training runs with multiple seeds in parallel: a ProcessPoolExecutor runs one agent per seed concurrently. Training over several seeds accounts for variability in the learning process and gives a more reliable evaluation of each algorithm's performance. The hypervolume is averaged over the seeds, and that average is logged to Weights & Biases.
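
As a rough illustration of the per-seed parallelism described above, here is a minimal sketch. It is not the code from this PR: the fields of `WorkerInitData` and `WorkerDoneData`, the `run_seeds` helper, and the placeholder `train` body are all assumptions made for the example. Only the class names, the use of `ProcessPoolExecutor`, and the averaging of hypervolume over seeds come from the description.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class WorkerInitData:
    # Hypothetical fields: what one worker needs to start training.
    seed: int
    config: dict


@dataclass
class WorkerDoneData:
    # Hypothetical fields: what one worker reports back.
    seed: int
    hypervolume: float


def train(init: WorkerInitData) -> WorkerDoneData:
    # Placeholder only: a real worker would build the algorithm, train it,
    # and evaluate the hypervolume. Here the value is read from the config.
    hypervolume = float(init.config.get("dummy_hypervolume", 0.0))
    return WorkerDoneData(seed=init.seed, hypervolume=hypervolume)


def run_seeds(
    config: dict,
    num_seeds: int,
    train_fn: Callable[[WorkerInitData], WorkerDoneData] = train,
    executor_cls=ProcessPoolExecutor,
) -> float:
    """Run one worker per seed (0 .. num_seeds - 1) and average the hypervolume."""
    inits = [WorkerInitData(seed=seed, config=config) for seed in range(num_seeds)]
    with executor_cls(max_workers=num_seeds) as pool:
        futures = [pool.submit(train_fn, init) for init in inits]
        results = [future.result() for future in futures]
    return mean(result.hypervolume for result in results)
```

With the default `ProcessPoolExecutor`, call `run_seeds` from under an `if __name__ == "__main__":` guard and keep `train_fn` at module level so it can be pickled. Passing `executor_cls=ThreadPoolExecutor` is a convenient way to check the aggregation logic without spawning processes.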

Components Description

The main components of the feature are:

Argument parsing: Parse command-line arguments for the algorithm, environment ID, reference point, W&B entity, project name, number of seeds, and training hyperparameters.
Worker classes: Define classes to handle worker setup and results, including WorkerInitData and WorkerDoneData.
Train function: Implement a train function to instantiate the selected algorithm, train the agent, and return the hypervolume metric.
Main function: Initialize W&B, create a process pool of workers, submit tasks to the workers, collect results, compute the average hypervolume, and log the metrics to W&B.
Sweep setup and execution: Load the sweep configuration, set up the sweep with W&B, and run the sweep agent using the main function.
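
The last component, sweep setup and execution, might look roughly like the sketch below. `wandb.sweep` and `wandb.agent` are existing W&B calls, but the `run_sweep` wrapper, its parameters, and the `wandb_module` injection hook are assumptions made for this example, not the structure of `launch_sweep.py`.

```python
from typing import Any, Callable, Dict, Optional


def run_sweep(
    sweep_config: Dict[str, Any],
    main_fn: Callable[[], None],
    project: str,
    entity: Optional[str] = None,
    count: Optional[int] = None,
    wandb_module: Any = None,
) -> str:
    """Register a sweep with W&B and run an agent that calls main_fn per trial."""
    if wandb_module is None:
        # Imported lazily so the sketch can be read and tested without W&B installed.
        import wandb as wandb_module

    # Register the search space; W&B returns an ID that identifies the sweep.
    sweep_id = wandb_module.sweep(sweep=sweep_config, entity=entity, project=project)
    # The agent asks W&B for hyperparameter sets and calls main_fn once per set,
    # stopping after `count` trials (corresponding to --sweep-count in the usage).
    wandb_module.agent(sweep_id, function=main_fn, count=count)
    return sweep_id
```

The `wandb_module` argument exists only so the control flow can be exercised with a stand-in object; in normal use it is left as `None` and the real `wandb` package is imported.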

The script allows users to easily perform a sweep of MORL algorithms and environments, exploring different hyperparameters and logging the results to W&B for further analysis.

Usage

An example usage:

python experiments/hyperparameter_search/launch_sweep.py \
--algo envelope \
--env-id minecart-v0 \
--ref-point 0 0 -200 \
--sweep-count 100 \
--num-seeds 3 \
--train-hyperparams num_eval_weights_for_front:100 reset_num_timesteps:False eval_freq:10000 total_timesteps:10000
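
To make the `key:value` form of `--train-hyperparams` concrete, here is one way such arguments could be parsed. This is a sketch under assumptions: the flag names are taken from the command above, but `build_parser`, `parse_hyperparams`, and the default values are hypothetical and may differ from what `launch_sweep.py` actually does.

```python
import argparse
import ast
from typing import Any, Dict, List


def parse_hyperparams(pairs: List[str]) -> Dict[str, Any]:
    """Turn ["eval_freq:10000", "reset_num_timesteps:False"] into a typed dict."""
    parsed: Dict[str, Any] = {}
    for pair in pairs:
        key, sep, raw = pair.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {pair!r}")
        try:
            # literal_eval handles ints, floats, booleans, None, lists, ...
            parsed[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal is kept as a plain string.
            parsed[key] = raw
    return parsed


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the usage example; the defaults are placeholders.
    parser = argparse.ArgumentParser()
    parser.add_argument("--algo", type=str, required=True)
    parser.add_argument("--env-id", type=str, required=True)
    parser.add_argument("--ref-point", type=float, nargs="+", required=True)
    parser.add_argument("--sweep-count", type=int, default=10)
    parser.add_argument("--num-seeds", type=int, default=3)
    parser.add_argument("--train-hyperparams", type=str, nargs="+", default=[])
    return parser
```

Calling `parse_hyperparams(args.train_hyperparams)` on the parsed arguments yields a dict that can be passed as keyword arguments to a training call.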

The config files that define the hyperparameter ranges for the sweep should be placed in the configs directory and named after the corresponding algorithm, for example envelope.yaml.
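
For orientation, a sweep configuration in the W&B format has roughly the shape below, written here as the Python dict that a YAML file would load into. The top-level keys (`method`, `metric`, `parameters`) and the per-parameter keys (`distribution`, `min`, `max`, `values`) are standard W&B sweep fields. The metric name, the hyperparameter names, and their ranges are made up for illustration and are not the contents of envelope.yaml.

```python
# Hypothetical sweep configuration; names and ranges are illustrative only.
sweep_config = {
    "method": "bayes",  # W&B also supports "grid" and "random"
    "metric": {"name": "avg_hypervolume", "goal": "maximize"},
    "parameters": {
        # Continuous range sampled uniformly between min and max.
        "learning_rate": {"distribution": "uniform", "min": 1e-4, "max": 1e-3},
        # Discrete set of candidate values.
        "batch_size": {"values": [64, 128, 256]},
    },
}


def check_sweep_config(config: dict) -> None:
    """Minimal sanity check on the shape of a sweep configuration."""
    for key in ("method", "metric", "parameters"):
        if key not in config:
            raise KeyError(f"missing top-level key: {key}")
    for name, spec in config["parameters"].items():
        if "values" not in spec and not {"min", "max"} <= spec.keys():
            raise ValueError(f"parameter {name!r} needs 'values' or 'min'/'max'")


check_sweep_config(sweep_config)
```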

Other Changes

Additionally, the PR reorganizes the file structure and moves functions shared by launch_experiment.py and launch_sweep.py into common/experiments.py.

experiments
├── hyperparameter_search
│   ├── launch_sweep.py
│   └── configs
│       ├── envelope.yaml
│       └── pgmorl.yaml
└── benchmark
    └── launch_experiment.py


LucasAlegre

Looks great! :D

docs/features/hpo.md

lowlypalace and others added 30 commits April 24, 2023 16:36

Add support for groups

7413842

Add hyperparameter search example

b239616

Define an array of seeds on top level

acf80a1

Remove print statements

5e8767c

Refactor code

91d2714

Remove unused sweep_run_name

3b631be

Add verbose flag

e2570f1

Add a run.summary() workaround

a87af65

Move seed init to the top level

444ce72

Use absolute path for config

1f3e51d

Move to concurrent.futures

74055ac

Replace namedtuple with dataclasses

08e050c

Make logging for each worker

1454a01

Move to experiments, add CLI

bf25d3d

Add support for multiple algos

636855c

Move gamma to yaml

22345a0

Bring back gamma

aeda87c

Move gamma to yaml

81eed90

Remove unused imports

925565c

Remove commented out code

880c7e0

New values after analysis of previous sweeps

fb1b420

Replace loggers for metrics

be7e5de

Merge branch 'hyperparameter-optimization' of https://github.com/lowl…

1e68d84

…ypalace/morl-baselines into hyperparameter-optimization

Remove the workaround

88580bf

Remove print statement

41d0a06

Refactor

8e397b6

Modify print statement

c01a506

New ranges

549b734

Modify configs

7b4f0a0

Merge

a196af4

lowlypalace and others added 24 commits May 31, 2023 23:16

Update config for envelope

056ada5

Add spacing

78baf17

Add more batch sizes

13a90d2

Modify pgmorl config

8f7b396

Modify launcher

50f054b

Make verbose not an Optional

c0f3735

Add docstring to reset_wandb_env()

cdea71d

Change docstring

62970a5

Fix launcher script

2fc4154

Log igd in sweep for known env

e82b627

Remove train.py

3ec0069

Make seeds from 0 to n-1

9c6237a

Fix seeds

7d0fe77

Modify envelope config

77b1857

Merge main into branch

6156026

Remove experiments from pydocstyle

fc68059

Fix pre commit

c12c1f2

Cleanup sweep utils

b7e723d

Fix imports

b6758b6

Fix config name

d0626e4

Update doc

d37cf67

Merge

1d8abe2

Merge main

2ad651d

Add back minecart deterministici

025184d

ffelten mentioned this pull request Oct 26, 2023

Hyperparameter optimization #57

Closed

ffelten requested a review from LucasAlegre October 26, 2023 11:59

LucasAlegre requested changes Oct 30, 2023

docs/features/hpo.md (outdated, resolved)

Add more doc on config file

935463b

ffelten merged commit dab73c7 into main Nov 2, 2023
2 checks passed

ffelten deleted the feature/hpo branch November 2, 2023 07:14
