New docs (#220)
* work

* Update setup.py

* lots of docs

* work

* new theme and work

* readthedocs

* work

* bug fix

* fix

* work
StoneT2000 authored Mar 2, 2024
1 parent 23a5ef6 commit e2f0330
Showing 25 changed files with 340 additions and 95 deletions.
32 changes: 32 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,32 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.9"
    # You can also specify other tool versions:
    # nodejs: "19"
    # rust: "1.64"
    # golang: "1.19"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/source/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
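
For reference, a minimal local build using the files configured above might look like this (a sketch; `sphinx-build` and `sphinx-autobuild` come from `docs/requirements.txt` below):

```bash
# Install the pinned documentation dependencies
pip install -r docs/requirements.txt
# One-off HTML build from the configured source directory
sphinx-build -b html docs/source docs/build/html
# Or serve with live-reload while editing
sphinx-autobuild docs/source docs/build/html
```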
12 changes: 12 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,12 @@
sphinx==6.2.1
sphinx-autobuild
sphinx-book-theme
# For spelling
sphinxcontrib.spelling
# Type hints support
sphinx-autodoc-typehints
# Copy button for code snippets
sphinx_copybutton
# Markdown parser
myst-parser
sphinx-subfigure
9 changes: 9 additions & 0 deletions docs/source/additional_resources/education.md
@@ -0,0 +1,9 @@
# Educational Resources

TODO: Things to collate

- Course Materials/Slides from other universities
- Tutorials
- Leaderboard details / work with Kaggle to run in-class competitions with ManiSkill?
- Simple, visually cool-looking games? For fun? (not necessarily research)
@@ -0,0 +1 @@
# Performance Benchmarking
39 changes: 39 additions & 0 deletions docs/source/algorithms_and_models/baselines.md
@@ -0,0 +1,39 @@
# Baselines

ManiSkill has a number of baseline Reinforcement Learning (RL) and Learning from Demonstrations (LfD) / Imitation Learning (IL) algorithms implemented that are easily runnable and reproducible for ManiSkill tasks. Each baseline has its own standalone folder, so you can download and run the code without needing the rest of the repository. The tables in the subsequent sections list the implemented baselines, where they can be found, and the results of running that code with tuned hyperparameters on relevant ManiSkill tasks.
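
As a sketch of the intended workflow (the folder layout comes from the source links below; the entry point and flags are illustrative assumptions, so check each baseline's README for the real ones):

```bash
# Grab the repository and enter one standalone baseline folder, e.g. PPO
git clone https://github.com/haosulab/ManiSkill2
cd ManiSkill2/examples/baselines/ppo
# Hypothetical entry point and flags
python train.py --env-id PickCube-v0
```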

<!-- TODO: Add pretrained models? -->

<!-- Acknowledgement: This neat categorization of algorithms is taken from https://github.com/tinkoff-ai/CORL -->

## Offline Only Methods
These algorithms are trained without any online interaction with the environment; they learn only from demonstration data.
<!-- Note that some of these algorithms can be trained offline and online and are marked with a \* and discussed in a [following section](#offline--online-methods) -->

| Baseline | Source | Results |
| ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------- | --------------------- |
| Behavior Cloning | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/behavior-cloning) | [results](#baselines) |
| [Decision Transformer](https://arxiv.org/abs/2106.01345) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/decision-transformer) | [results](#baselines) |
| [Decision Diffusers](https://arxiv.org/abs/2211.15657) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/decision-diffusers) | [results](#baselines) |


## Online Only Methods
These are online-only algorithms that do not learn from demonstrations and instead optimize based on feedback from interacting with the environment. These methods also benefit from GPU simulation, which can massively accelerate training.

| Baseline | Source | Results |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | --------------------- |
| [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/ppo) | [results](#baselines) |
| [Soft Actor Critic (SAC)](https://arxiv.org/abs/1801.01290) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/sac) | [results](#baselines) |
| [REDQ](https://arxiv.org/abs/2101.05982) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/redq) | [results](#baselines) |


## Offline + Online Methods
These are baselines that can train on offline demonstration data as well as use online data collected from interacting with an environment.

| Baseline | Source | Results |
| ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- | --------------------- |
| [Soft Actor Critic (SAC)](https://arxiv.org/abs/1801.01290) with demonstrations in buffer | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/sac) | [results](#baselines) |
| [MoDem](https://arxiv.org/abs/2212.05698) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/modem) | [results](#baselines) |
| [RLPD](https://arxiv.org/abs/2302.02948) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/rlpd) | [results](#baselines) |


7 changes: 7 additions & 0 deletions docs/source/algorithms_and_models/index.md
@@ -0,0 +1,7 @@
# Algorithms and Models
```{toctree}
:titlesonly:
:glob:
*
```
@@ -1,8 +1,8 @@
# Submission
# Online Leaderboard

To participate in the ManiSkill2 challenge, please register on the [challenge website](https://sapien.ucsd.edu/challenges/maniskill/). After registering an account, [create/join a team](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/team). After creating/joining a team, you will be allowed to create submissions.
We currently run an online leaderboard where anyone can submit their models/algorithms and have them evaluated publicly. First register on the [challenge website](https://sapien.ucsd.edu/challenges/maniskill/). After registering an account, [create/join a team](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/team). After creating/joining a team, you will be allowed to create submissions.

To submit to the challenge, you need to submit a URL to your docker image which contains your codes and dependencies (e.g., model weights). Before submitting, you should test the submission docker locally. Instructions for local evaluation and online submission are provided below.
To submit to the challenge, you need to submit a URL to your docker image which contains your code and dependencies (e.g., python packages, model weights). Before submitting, you should test the submission docker locally. Instructions for local evaluation and online submission are provided below.

In brief, you need to:

@@ -92,7 +92,7 @@ docker kill ${CONTAINER_NAME}

## Online Submission

Once you have built and pushed a docker image, you are ready to submit to the competition. Go to the competition [submissions page](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/submit) and give your submission a name and enter the docker image name+tag (format: `registry.hub.docker.com/USERNAME/IMG_NAME:TAG`; Do not use the `latest` tag). Then select which track you are submitting to. Lastly, tick/untick which tasks you would like to evaluate your submission on.
Once you have built and pushed a docker image, you are ready to submit to the competition. Go to the competition [submissions page](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2-ongoing/submit) and give your submission a name and enter the docker image name+tag (format: `registry.hub.docker.com/USERNAME/IMG_NAME:TAG`; Do not use the `latest` tag). Then select which track you are submitting to. Lastly, tick/untick which tasks you would like to evaluate your submission on.

To ensure reproducibility, we do not allow you to submit the same docker image and tag twice; you must give your image a new tag before submitting. You can create a new tag like so: `docker tag <image_name> <image_name>:<tag_name>`
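
For example (image and tag names are placeholders):

```bash
# Retag the image so each submission uses a fresh, unique tag (never `latest`)
docker tag my-submission registry.hub.docker.com/username/my-submission:v2
# Push the newly tagged image so it can be pulled for evaluation
docker push registry.hub.docker.com/username/my-submission:v2
```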

2 changes: 1 addition & 1 deletion docs/source/concepts/controllers.md
@@ -1,4 +1,4 @@
# Controllers
# Controllers / Action Spaces

Controllers are interfaces between policies and robots. The policy outputs actions to the controller, and the controller converts actions to control signals to the robot. For example, the `arm_pd_ee_delta_pose` controller takes the relative movement of the end-effector as input, and uses [inverse kinematics](https://en.wikipedia.org/wiki/Inverse_kinematics) to convert input actions to target positions of robot joints. The robot uses a [PD controller](https://en.wikipedia.org/wiki/PID_controller) to drive motors to achieve target joint positions.
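
A minimal sketch of selecting this controller, assuming environments accept a `control_mode` keyword as in ManiSkill's gym integration (the environment ID and action indexing here are illustrative):

```python
import gymnasium as gym
import mani_skill2.envs  # registers ManiSkill environments

# Use the end-effector delta-pose controller for the arm
env = gym.make("PickCube-v0", control_mode="pd_ee_delta_pose")
obs, _ = env.reset(seed=0)

action = env.action_space.sample() * 0  # zero action holds the current pose
action[0] = 0.05  # illustrative: request a small relative end-effector motion
obs, reward, terminated, truncated, info = env.step(action)
```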

4 changes: 2 additions & 2 deletions docs/source/concepts/environments.md
@@ -14,7 +14,7 @@
- Success metric: The cube is within 2.5 cm of the goal position, and the robot is static.
- Goal specification: 3D goal position.
- Demonstration: 1000 successful trajectories.
- Evaluaion protocol: 100 episodes with different initial joint positions of the robot and initial cube pose.
- Evaluation protocol: 100 episodes with different initial joint positions of the robot and initial cube pose.

```{image} thumbnails/PickCube-v0.gif
---
@@ -28,7 +28,7 @@ alt: PickCube-v0
- Objective: Pick up a red cube and place it onto a green one.
- Success metric: The red cube is placed on top of the green one stably and it is not grasped.
- Demonstration: 1000 successful trajectories.
- Evaluaion protocol: 100 episodes with different initial joint positions of the robot and initial poses of both cubes.
- Evaluation protocol: 100 episodes with different initial joint positions of the robot and initial poses of both cubes.

```{image} thumbnails/StackCube-v0.gif
---
7 changes: 7 additions & 0 deletions docs/source/concepts/index.md
@@ -0,0 +1,7 @@
# Concepts
```{toctree}
:titlesonly:
:glob:
*
```
2 changes: 1 addition & 1 deletion docs/source/concepts/observation.md
@@ -38,7 +38,7 @@ In addition to `agent` and `extra`, `image` and `camera_param` are introduced.
- `extrinsic_cv`: [4, 4], camera extrinsic (OpenCV convention)
- `intrinsic_cv`: [3, 3], camera intrinsic (OpenCV convention)

Unless specified otherwise, there are two cameras: *base_camera* (fixed relative to the robot base) and *hand_camera* (mounted on the robot hand). Environments migrated from ManiSkill1 use 3 cameras mounted above the robot: *overhead_camera_{i}*.
Unless specified otherwise, there is usually at least one camera called the *base_camera* (fixed relative to the robot base). Some robots have additional sensor configurations that add more cameras, such as a *hand_camera* mounted on the robot hand. Environments migrated from ManiSkill1 use 3 cameras mounted above the robot: *overhead_camera_{i}*.
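
For instance, camera images and parameters can be pulled out of the observation dict following the structure above (a sketch; the key layout is as documented on this page, and the `obs_mode` keyword is assumed from ManiSkill's gym integration):

```python
import gymnasium as gym
import mani_skill2.envs  # registers ManiSkill environments

env = gym.make("PickCube-v0", obs_mode="rgbd")
obs, _ = env.reset(seed=0)

rgb = obs["image"]["base_camera"]["rgb"]                        # per-camera image data
extrinsic = obs["camera_param"]["base_camera"]["extrinsic_cv"]  # [4, 4], OpenCV convention
intrinsic = obs["camera_param"]["base_camera"]["intrinsic_cv"]  # [3, 3], OpenCV convention
```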

### rgbd

21 changes: 10 additions & 11 deletions docs/source/conf.py
@@ -6,18 +6,19 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "ManiSkill2"
copyright = "2023, ManiSkill2 Contributors"
author = "ManiSkill2 Contributors"
release = "0.5.0"
project = "ManiSkill3"
copyright = "2024, ManiSkill3 Contributors"
author = "ManiSkill3 Contributors"
release = "3.0.0"
version = "3.0.0"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
"sphinx_rtd_theme",
"sphinx.ext.autodoc",
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinx_copybutton",
"myst_parser",
"sphinx_subfigure",
@@ -35,14 +36,12 @@
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "sphinx_rtd_theme"
# html_static_path = ["_static"]
html_theme = "sphinx_book_theme"

# replace "view page source" with "edit on github" in Read The Docs theme
# * https://github.com/readthedocs/sphinx_rtd_theme/issues/529
html_context = {
    "display_github": True,
    "github_user": "haosulab",
    "github_repo": "ManiSkill2",
    "github_version": "main/docs/source/",
}
    "github_version": "main",
    "conf_py_path": "/source/"
}
@@ -1,12 +1,13 @@
# Demonstrations
# Datasets

High-quality demonstration datasets are one of the features of ManiSkill2. Demonstrations can be used to facilitate learning-from-demonstrations approaches, e.g., [Shen et al](https://arxiv.org/pdf/2203.02107.pdf).

Most demonstrations are generated by motion planning with privileged information. Some demonstrations are generated by [model predictive control](https://en.wikipedia.org/wiki/Model_predictive_control) (MPC) or state-based Reinforcement Learning (RL) given our dense rewards.
ManiSkill has a wide variety of demonstrations from different sources, including RL, human teleoperation, and motion planning.

## Download

We provide a command line tool (`mani_skill2.utils.download_demo`) to download demonstrations from Google Drive. The full datasets are available on [Google Drive](https://drive.google.com/drive/folders/1hVdUNPGCHh0OULPCowBClPYIXSwsx-J9). Please refer to [Environments](../concepts/environments.md) for all supported environments. Please see our [notes](https://docs.google.com/document/d/1bBKmsR-R_7tR9LwaT1c3J26SjIWw27tWSLdHnfBR01c/edit?usp=sharing) about the details of the demonstrations.
We provide a command line tool to download demonstrations directly from our [Hugging Face 🤗 dataset page](https://huggingface.co/datasets/haosulab/ManiSkill2), organized by environment ID. The tool downloads the demonstration files to a folder, along with a few videos visualizing what the demonstrations look like. See [Environments](../concepts/environments.md) for a list of all supported environments.

<!-- TODO: add a table here detailing the data info in detail -->
<!-- Please see our [notes](https://docs.google.com/document/d/1bBKmsR-R_7tR9LwaT1c3J26SjIWw27tWSLdHnfBR01c/edit?usp=sharing) about the details of the demonstrations. -->

```bash
# Download the demonstration datasets for all rigid-body environments
python -m mani_skill2.utils.download_demo rigid_body -o ./demos
# Download the demonstration datasets for all soft-body environments
python -m mani_skill2.utils.download_demo soft_body
```

For those who cannot access Google Drive, the datasets can be downloaded from [ScienceDB.cn](http://doi.org/10.57760/sciencedb.02239).

## Format

All demonstrations for an environment are saved in HDF5 format. Each HDF5 dataset is named `trajectory.{obs_mode}.{control_mode}.h5`, and is associated with a JSON file with the same base name. The JSON file stores meta information. Unless otherwise specified, `trajectory.h5` is short for `trajectory.none.pd_joint_pos.h5`, which contains the original demonstrations generated by the `pd_joint_pos` controller with the `none` observation mode (empty observations). However, there may exist demonstrations generated by other controllers. **Thus, please check the associated JSON to ensure which controller is used.**

All demonstrations for an environment are saved in the HDF5 format, openable by [h5py](https://github.com/h5py/h5py). Each HDF5 dataset is named `trajectory.{obs_mode}.{control_mode}.h5` and is associated with a JSON metadata file with the same base name. Unless otherwise specified, `trajectory.h5` is short for `trajectory.none.pd_joint_pos.h5`, which contains the original demonstrations generated by the `pd_joint_pos` controller with the `none` observation mode (empty observations). However, there may exist demonstrations generated by other controllers. **Thus, please check the associated JSON to see which controller was used.**
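
A quick way to check, using the metadata fields documented below (the file name here is illustrative):

```python
import json

# Every .h5 file has a JSON metadata file with the same base name
meta = json.load(open("trajectory.none.pd_joint_pos.json"))
print(meta["episodes"][0]["control_mode"])  # e.g. "pd_joint_pos"
```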
<!--
:::{note}
For `PickSingleYCB-v0`, `TurnFaucet-v0`, the dataset is named `{model_id}.h5` for each asset. It is due to some legacy issues, and might be changed in the future.
For `OpenCabinetDoor-v1`, `OpenCabinetDrawer-v1`, `PushChair-v1`, `MoveBucket-v1`, which are migrated from [ManiSkill1](https://github.com/haosulab/ManiSkill), trajectories are generated by the RL and `base_pd_joint_vel_arm_pd_joint_vel` controller.
:::
::: -->

### Meta Information (JSON)

Each JSON file contains:

- env_info (`Dict`): environment information, which can be used to initialize the environment
- env_id: environment id
- max_episode_steps
- env_kwargs: keyword arguments to initialize the environment. **Essential to reproduce the trajectory.**
- episodes (`List[Dict]`): episode information
- `env_info` (Dict): environment information, which can be used to initialize the environment
- `env_id` (str): environment id
- `max_episode_steps` (int)
- `env_kwargs` (Dict): keyword arguments to initialize the environment. **Essential to recreate the environment.**
- `episodes` (List[Dict]): episode information

The episode information (the element of `episodes`) includes:

- episode_id: a unique id to index the episode
- reset_kwargs: keyword arguments to reset the environment. **Essential to reproduce the trajectory.**
- control_mode: control mode used for the episode.
- elapsed_steps: trajectory length
- info: information at the end of the episode.
- `episode_id` (int): a unique id to index the episode
- `reset_kwargs` (Dict): keyword arguments to reset the environment. **Essential to reproduce the trajectory.**
- `control_mode` (str): control mode used for the episode.
- `elapsed_steps` (int): trajectory length
- `info` (Dict): information at the end of the episode.

To reproduce the environment for the trajectory:
With just the metadata, you can recreate the environment the same way it was configured when the trajectories were collected:

```python
import json
import gymnasium as gym  # ManiSkill environments follow the Gymnasium API (assumed here)
import mani_skill2.envs  # registers ManiSkill environments

meta = json.load(open("trajectory.json"))  # the metadata JSON described above
env = gym.make(meta["env_info"]["env_id"], **meta["env_info"]["env_kwargs"])
episode = meta["episodes"][0]  # pick the first episode
env.reset(**episode["reset_kwargs"])
```

@@ -69,7 +68,7 @@ Each trajectory is an `h5py.Group`, which contains:
- env_init_state: [D], `np.float32`. The initial environment state. It is used for soft-body environments, since their states (particle positions) can use too much space.
- obs (optional): observations. If the observation is a `dict`, the value will be stored in `obs/{key}`. The convention is applied recursively for nested dicts.
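
A sketch of reading these fields with h5py (the `traj_0` group name is an assumption for illustration):

```python
import h5py

with h5py.File("trajectory.h5", "r") as f:
    traj = f["traj_0"]  # one h5py.Group per trajectory (group name assumed)
    env_init_state = traj["env_init_state"][:]  # [D] float32 initial state (soft-body envs)
    # recorded observations, if any, live under traj["obs/..."] following the dict keys
```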

## Usage
## Replaying/Converting Demonstration Data

To replay the demonstrations (without changing the observation mode and control mode):

@@ -85,7 +84,7 @@ python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/
:::{note}
The script requires `trajectory.h5` and `trajectory.json` to be both under the same directory.
:::

The raw demonstration files contain all the necessary information (e.g. initial states, actions, seeds) to reproduce a trajectory. Observations are not included since they can lead to large file sizes without postprocessing. In addition, actions in these files do not cover all control modes. Therefore, you need to convert our raw files into your desired observation and control modes. We provide a utility script that works as follows:
The raw demonstration files contain all the necessary information (e.g. initial states, actions, seeds) to reproduce a trajectory. Observations are not included since they can lead to large file sizes without postprocessing. In addition, actions in these files do not cover all control modes. Therefore, you need to convert the raw files into your desired observation and control modes. We provide a utility script that works as follows:

```bash
# Replay demonstrations with control_mode=pd_joint_delta_pos
# NOTE: flags below are assumed from the replay script's CLI; the path is a placeholder
python -m mani_skill2.trajectory.replay_trajectory --traj-path <path/to/trajectory.h5> \
  --save-traj --target-control-mode pd_joint_delta_pos --obs-mode none
```
7 changes: 7 additions & 0 deletions docs/source/datasets/index.md
@@ -0,0 +1,7 @@
# Datasets
```{toctree}
:titlesonly:
:glob:
*
```
1 change: 1 addition & 0 deletions docs/source/datasets/teleoperation.md
@@ -0,0 +1 @@
# Teleoperation