Feature/hpo #74

Merged
merged 56 commits into from
Nov 2, 2023

@ffelten ffelten commented Oct 26, 2023

Recreating from #57

Solves #13

Paper: https://arxiv.org/abs/2310.16487

Feature Description

This feature adds a script that runs hyperparameter sweeps for multi-objective reinforcement learning (MORL) algorithms on a chosen environment. For each sweep configuration, the script runs a set of experiments, collects performance metrics, and logs the results to Weights & Biases (W&B).

Each configuration is trained with multiple seeds in parallel: a ProcessPoolExecutor runs one agent per seed concurrently. Training over several seeds accounts for run-to-run variability in learning and gives a more reliable evaluation of an algorithm's performance. The hypervolume is averaged across the seeds and logged to W&B.
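
As a rough illustration of the seed-parallel pattern described above, here is a minimal sketch. It is not the script's actual code: `train_one_seed` is a hypothetical stand-in that returns a fake hypervolume instead of training an agent, and the `executor_cls` parameter is an assumption added so the same function can be exercised with a thread pool.

```python
import concurrent.futures as cf
import random
from statistics import mean


def train_one_seed(seed: int) -> float:
    """Hypothetical stand-in for training one agent with a given seed.

    A real worker would build the environment, train the selected MORL
    algorithm, and evaluate the hypervolume of the resulting front.
    Here we only return a deterministic fake value derived from the seed.
    """
    rng = random.Random(seed)
    return rng.uniform(0.0, 100.0)


def run_seeds(seeds, executor_cls=cf.ProcessPoolExecutor) -> float:
    """Run one training job per seed concurrently and average the results."""
    seeds = list(seeds)
    with executor_cls(max_workers=len(seeds)) as executor:
        # Executor.map() returns results in the same order as `seeds`.
        hypervolumes = list(executor.map(train_one_seed, seeds))
    return mean(hypervolumes)
```

When using the default process pool, call `run_seeds(range(3))` from under an `if __name__ == "__main__":` guard, since worker processes may re-import the launching module.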

Components Description

The main components of the feature are:

  • Argument parsing: Parse command-line arguments for the algorithm, environment ID, reference point, W&B entity, project name, number of seeds, and training hyperparameters.
  • Worker classes: Define classes to handle worker setup and results, including WorkerInitData and WorkerDoneData.
  • Train function: Implement a train function to instantiate the selected algorithm, train the agent, and return the hypervolume metric.
  • Main function: Initialize W&B, create a process pool of workers, submit tasks to the workers, collect results, compute the average hypervolume, and log the metrics to W&B.
  • Sweep setup and execution: Load the sweep configuration, set up the sweep with W&B, and run the sweep agent using the main function.
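
The worker classes and train function listed above could take roughly the following shape. This is a sketch under stated assumptions: only the class names `WorkerInitData` and `WorkerDoneData` come from the PR, while their fields, the fixed Pareto front inside `train`, and the `hypervolume_2d` helper are hypothetical. The helper handles two maximized objectives only, whereas the usage example in this PR has three.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Point = Tuple[float, float]


@dataclass
class WorkerInitData:
    """What a worker needs to start (fields are assumptions)."""
    seed: int
    ref_point: Point


@dataclass
class WorkerDoneData:
    """What a worker reports back (fields are assumptions)."""
    seed: int
    hypervolume: float


def hypervolume_2d(front: Iterable[Point], ref_point: Point) -> float:
    """Area dominated by `front` and bounded by `ref_point` (maximization)."""
    rx, ry = ref_point
    # Keep only points that strictly improve on the reference point,
    # then sweep from the largest first objective to the smallest.
    points = sorted(
        (p for p in front if p[0] > rx and p[1] > ry),
        key=lambda p: p[0],
        reverse=True,
    )
    area, best_y = 0.0, ry
    for x, y in points:
        if y > best_y:
            # Add the horizontal strip that this point newly dominates.
            area += (x - rx) * (y - best_y)
            best_y = y
    return area


def train(worker_data: WorkerInitData) -> WorkerDoneData:
    """Hypothetical train function: evaluates a fixed front, no learning."""
    fake_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]  # placeholder result
    hv = hypervolume_2d(fake_front, worker_data.ref_point)
    return WorkerDoneData(seed=worker_data.seed, hypervolume=hv)
```

Returning a small result object rather than a bare float keeps the seed attached to its metric, which helps when results from several workers are collected out of order.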

The script allows users to easily perform a sweep of MORL algorithms and environments, exploring different hyperparameters and logging the results to W&B for further analysis.

Usage

An example usage:

python experiments/hyperparameter_search/launch_sweep.py \
--algo envelope \
--env-id minecart-v0 \
--ref-point 0 0 -200 \
--sweep-count 100 \
--num-seeds 3 \
--train-hyperparams num_eval_weights_for_front:100 reset_num_timesteps:False eval_freq:10000 total_timesteps:10000
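
For readers wondering how the `key:value` pairs after `--train-hyperparams` might be interpreted, the following is one possible sketch, not the script's actual parser. The flag names mirror the command above; the default values and the use of `ast.literal_eval` for type conversion are assumptions.

```python
import argparse
import ast


def parse_hyperparams(pairs):
    """Turn ["key:value", ...] into a dict with Python-typed values."""
    params = {}
    for pair in pairs:
        key, _, raw = pair.partition(":")
        try:
            # literal_eval converts ints, floats, booleans, None, and so on.
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            params[key] = raw  # fall back to the plain string
    return params


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the usage example."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--algo", type=str, required=True)
    parser.add_argument("--env-id", type=str, required=True)
    # One float per objective; negative values such as -200 are accepted.
    parser.add_argument("--ref-point", type=float, nargs="+", required=True)
    parser.add_argument("--sweep-count", type=int, default=10)  # assumed default
    parser.add_argument("--num-seeds", type=int, default=3)  # assumed default
    parser.add_argument("--train-hyperparams", type=str, nargs="*", default=[])
    return parser
```

With this sketch, `reset_num_timesteps:False` becomes the boolean `False` and `eval_freq:10000` becomes the integer `10000`, so the values can be passed to a training call as keyword arguments.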

The sweep configs, which define the hyperparameter ranges, should be placed in the configs directory in a file named after the algorithm, such as envelope.yaml.
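
To give a sense of what such a config could contain once loaded, here is a sketch written as a Python dict in the W&B sweep format. The hyperparameter names, ranges, search method, and the metric name `avg_hypervolume` are all illustrative assumptions, not the contents of the actual envelope.yaml. The `launch_sweep` helper is likewise hypothetical; it only uses `wandb.sweep` and `wandb.agent`, and imports `wandb` lazily so the config builder can be used without the package installed.

```python
def build_sweep_config(algo: str) -> dict:
    """Return an illustrative W&B sweep config (all values are assumptions)."""
    return {
        "name": f"{algo}-sweep",
        "method": "bayes",
        # The sweep tries to maximize the seed-averaged hypervolume.
        "metric": {"name": "avg_hypervolume", "goal": "maximize"},
        "parameters": {
            "learning_rate": {"distribution": "uniform", "min": 1e-4, "max": 1e-3},
            "batch_size": {"values": [32, 64, 128]},
        },
    }


def launch_sweep(algo, entity, project, count, main_fn):
    """Register the sweep with W&B and run `count` trials of `main_fn`."""
    import wandb  # deferred import: only needed when actually launching

    sweep_id = wandb.sweep(
        sweep=build_sweep_config(algo), entity=entity, project=project
    )
    wandb.agent(sweep_id, function=main_fn, count=count)
```

Here `count` plays the role of `--sweep-count` from the usage example, and `main_fn` would be the function that trains all seeds for one sampled configuration and logs the averaged metric.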

Other Changes

Additionally, the PR reorganizes the file structure and moves functions shared by launch_experiment.py and launch_sweep.py into common/experiments.py.

experiments
├── hyperparameter_search
│   ├── launch_sweep.py
│   └── configs
│       ├── envelope.yaml
│       └── pgmorl.yaml
└── benchmark
    └── launch_experiment.py

@LucasAlegre LucasAlegre left a comment

Looks great! :D

@ffelten ffelten merged commit dab73c7 into main Nov 2, 2023
2 checks passed
@ffelten ffelten deleted the feature/hpo branch November 2, 2023 07:14
4 participants