New docs (#220)
* work

* Update setup.py

* lots of docs

* work

* new theme and work

* readthedocs

* work

* bug fix

* fix

* work
StoneT2000 authored Mar 2, 2024
1 parent 23a5ef6 commit e2f0330
Showing 25 changed files with 340 additions and 95 deletions.
32 changes: 32 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,32 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.9"
    # You can also specify other tool versions:
    # nodejs: "19"
    # rust: "1.64"
    # golang: "1.19"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/source/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
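
For reference, a minimal local build using the files configured above might look like this (a sketch; `sphinx-build` and `sphinx-autobuild` come from `docs/requirements.txt` below):

```bash
# Install the pinned documentation dependencies
pip install -r docs/requirements.txt
# One-off HTML build from the configured source directory
sphinx-build -b html docs/source docs/build/html
# Or serve with live-reload while editing
sphinx-autobuild docs/source docs/build/html
```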
12 changes: 12 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,12 @@
sphinx==6.2.1
sphinx-autobuild
sphinx-book-theme
# For spelling
sphinxcontrib.spelling
# Type hints support
sphinx-autodoc-typehints
# Copy button for code snippets
sphinx_copybutton
# Markdown parser
myst-parser
sphinx-subfigure
9 changes: 9 additions & 0 deletions docs/source/additional_resources/education.md
@@ -0,0 +1,9 @@
# Educational Resources

TODO: Things to collate

- Course Materials/Slides from other universities
- Tutorials
- Leaderboard details / work with Kaggle to run in-class competitions with ManiSkill?
- Simple, visually cool-looking games? For fun? (not necessarily research)
@@ -0,0 +1 @@
# Performance Benchmarking
39 changes: 39 additions & 0 deletions docs/source/algorithms_and_models/baselines.md
@@ -0,0 +1,39 @@
# Baselines

ManiSkill has a number of baseline Reinforcement Learning (RL) and Learning from Demonstrations (LfD) / Imitation Learning (IL) algorithms implemented that are easily runnable and reproducible for ManiSkill tasks. Each baseline has its own standalone folder, so you can download and run the code without needing the rest of the repository. The tables in the subsequent sections list the implemented baselines, where they can be found, and the results of running that code with tuned hyperparameters on relevant ManiSkill tasks.
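
As a sketch of the intended workflow (the folder layout comes from the source links below; the entry point and flags are illustrative assumptions, so check each baseline's README for the real ones):

```bash
# Grab the repository and enter one standalone baseline folder, e.g. PPO
git clone https://github.com/haosulab/ManiSkill2
cd ManiSkill2/examples/baselines/ppo
# Hypothetical entry point and flags
python train.py --env-id PickCube-v0
```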

<!-- TODO: Add pretrained models? -->

<!-- Acknowledgement: This neat categorization of algorithms is taken from https://github.com/tinkoff-ai/CORL -->

## Offline Only Methods
These algorithms are trained without any online interaction with the environment; they learn only from demonstration data.
<!-- Note that some of these algorithms can be trained offline and online and are marked with a \* and discussed in a [following section](#offline--online-methods) -->

| Baseline | Source | Results |
| ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------- | --------------------- |
| Behavior Cloning | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/behavior-cloning) | [results](#baselines) |
| [Decision Transformer](https://arxiv.org/abs/2106.01345) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/decision-transformer) | [results](#baselines) |
| [Decision Diffusers](https://arxiv.org/abs/2211.15657) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/decision-diffusers) | [results](#baselines) |


## Online Only Methods
These are online-only algorithms that do not learn from demonstrations and instead optimize based on feedback from interacting with the environment. These methods also benefit from GPU simulation, which can massively accelerate training.

| Baseline | Source | Results |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | --------------------- |
| [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/ppo) | [results](#baselines) |
| [Soft Actor Critic (SAC)](https://arxiv.org/abs/1801.01290) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/sac) | [results](#baselines) |
| [REDQ](https://arxiv.org/abs/2101.05982) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/redq) | [results](#baselines) |


## Offline + Online Methods
These are baselines that can train on offline demonstration data as well as use online data collected from interacting with an environment.

| Baseline | Source | Results |
| ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- | --------------------- |
| [Soft Actor Critic (SAC)](https://arxiv.org/abs/1801.01290) with demonstrations in buffer | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/sac) | [results](#baselines) |
| [MoDem](https://arxiv.org/abs/2212.05698) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/modem) | [results](#baselines) |
| [RLPD](https://arxiv.org/abs/2302.02948) | [source](https://github.com/haosulab/ManiSkill2/tree/main/examples/baselines/rlpd) | [results](#baselines) |


7 changes: 7 additions & 0 deletions docs/source/algorithms_and_models/index.md
@@ -0,0 +1,7 @@
# Algorithms and Models
```{toctree}
:titlesonly:
:glob:
*
```
@@ -1,8 +1,8 @@
# Submission
# Online Leaderboard

To participate in the ManiSkill2 challenge, please register on the [challenge website](https://sapien.ucsd.edu/challenges/maniskill/). After registering an account, [create/join a team](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/team). After creating/joining a team, you will be allowed to create submissions.
We currently run an online leaderboard where anyone can submit their models/algorithms and have them evaluated publicly. First register on the [challenge website](https://sapien.ucsd.edu/challenges/maniskill/). After registering an account, [create/join a team](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/team). After creating/joining a team, you will be allowed to create submissions.

To submit to the challenge, you need to submit a URL to your docker image which contains your codes and dependencies (e.g., model weights). Before submitting, you should test the submission docker locally. Instructions for local evaluation and online submission are provided below.
To submit to the challenge, you need to submit a URL to your docker image which contains your code and dependencies (e.g., python packages, model weights). Before submitting, you should test the submission docker locally. Instructions for local evaluation and online submission are provided below.

In brief, you need to:

@@ -92,7 +92,7 @@ docker kill ${CONTAINER_NAME}

## Online Submission

Once you have built and pushed a docker image, you are ready to submit to the competition. Go to the competition [submissions page](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2/submit) and give your submission a name and enter the docker image name+tag (format: `registry.hub.docker.com/USERNAME/IMG_NAME:TAG`; Do not use the `latest` tag). Then select which track you are submitting to. Lastly, tick/untick which tasks you would like to evaluate your submission on.
Once you have built and pushed a docker image, you are ready to submit to the competition. Go to the competition [submissions page](https://sapien.ucsd.edu/challenges/maniskill/challenges/ms2-ongoing/submit) and give your submission a name and enter the docker image name+tag (format: `registry.hub.docker.com/USERNAME/IMG_NAME:TAG`; Do not use the `latest` tag). Then select which track you are submitting to. Lastly, tick/untick which tasks you would like to evaluate your submission on.

To ensure reproducibility, we do not allow you to submit the same docker image and tag twice; you must give your image a new tag before submitting. You can create a new tag like so: `docker tag <image_name> <image_name>:<tag_name>`
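
For example (image and tag names are placeholders):

```bash
# Retag the image so each submission uses a fresh, unique tag (never `latest`)
docker tag my-submission registry.hub.docker.com/username/my-submission:v2
# Push the newly tagged image so it can be pulled for evaluation
docker push registry.hub.docker.com/username/my-submission:v2
```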

2 changes: 1 addition & 1 deletion docs/source/concepts/controllers.md
@@ -1,4 +1,4 @@
# Controllers
# Controllers / Action Spaces

Controllers are interfaces between policies and robots. The policy outputs actions to the controller, and the controller converts actions to control signals to the robot. For example, the `arm_pd_ee_delta_pose` controller takes the relative movement of the end-effector as input, and uses [inverse kinematics](https://en.wikipedia.org/wiki/Inverse_kinematics) to convert input actions to target positions of robot joints. The robot uses a [PD controller](https://en.wikipedia.org/wiki/PID_controller) to drive motors to achieve target joint positions.
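
A minimal sketch of selecting this controller, assuming environments accept a `control_mode` keyword as in ManiSkill's gym integration (the environment ID and action indexing here are illustrative):

```python
import gymnasium as gym
import mani_skill2.envs  # registers ManiSkill environments

# Use the end-effector delta-pose controller for the arm
env = gym.make("PickCube-v0", control_mode="pd_ee_delta_pose")
obs, _ = env.reset(seed=0)

action = env.action_space.sample() * 0  # zero action holds the current pose
action[0] = 0.05  # illustrative: request a small relative end-effector motion
obs, reward, terminated, truncated, info = env.step(action)
```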

4 changes: 2 additions & 2 deletions docs/source/concepts/environments.md
@@ -14,7 +14,7 @@
- Success metric: The cube is within 2.5 cm of the goal position, and the robot is static.
- Goal specification: 3D goal position.
- Demonstration: 1000 successful trajectories.
- Evaluaion protocol: 100 episodes with different initial joint positions of the robot and initial cube pose.
- Evaluation protocol: 100 episodes with different initial joint positions of the robot and initial cube pose.

```{image} thumbnails/PickCube-v0.gif
---
@@ -28,7 +28,7 @@ alt: PickCube-v0
- Objective: Pick up a red cube and place it onto a green one.
- Success metric: The red cube is placed on top of the green one stably and it is not grasped.
- Demonstration: 1000 successful trajectories.
- Evaluaion protocol: 100 episodes with different initial joint positions of the robot and initial poses of both cubes.
- Evaluation protocol: 100 episodes with different initial joint positions of the robot and initial poses of both cubes.

```{image} thumbnails/StackCube-v0.gif
---
7 changes: 7 additions & 0 deletions docs/source/concepts/index.md
@@ -0,0 +1,7 @@
# Concepts
```{toctree}
:titlesonly:
:glob:
*
```
2 changes: 1 addition & 1 deletion docs/source/concepts/observation.md
@@ -38,7 +38,7 @@ In addition to `agent` and `extra`, `image` and `camera_param` are introduced.
- `extrinsic_cv`: [4, 4], camera extrinsic (OpenCV convention)
- `intrinsic_cv`: [3, 3], camera intrinsic (OpenCV convention)

Unless specified otherwise, there are two cameras: *base_camera* (fixed relative to the robot base) and *hand_camera* (mounted on the robot hand). Environments migrated from ManiSkill1 use 3 cameras mounted above the robot: *overhead_camera_{i}*.
Unless specified otherwise, there is usually at least one camera called the *base_camera* (fixed relative to the robot base). Some robots have additional sensor configurations that add more cameras, such as a *hand_camera* mounted on the robot hand. Environments migrated from ManiSkill1 use 3 cameras mounted above the robot: *overhead_camera_{i}*.
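
For instance, camera images and parameters can be pulled out of the observation dict following the structure above (a sketch; the key layout is as documented on this page, and the `obs_mode` keyword is assumed from ManiSkill's gym integration):

```python
import gymnasium as gym
import mani_skill2.envs  # registers ManiSkill environments

env = gym.make("PickCube-v0", obs_mode="rgbd")
obs, _ = env.reset(seed=0)

rgb = obs["image"]["base_camera"]["rgb"]                        # per-camera image data
extrinsic = obs["camera_param"]["base_camera"]["extrinsic_cv"]  # [4, 4], OpenCV convention
intrinsic = obs["camera_param"]["base_camera"]["intrinsic_cv"]  # [3, 3], OpenCV convention
```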

### rgbd

21 changes: 10 additions & 11 deletions docs/source/conf.py
@@ -6,18 +6,19 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "ManiSkill2"
copyright = "2023, ManiSkill2 Contributors"
author = "ManiSkill2 Contributors"
release = "0.5.0"
project = "ManiSkill3"
copyright = "2024, ManiSkill3 Contributors"
author = "ManiSkill3 Contributors"
release = "3.0.0"
version = "3.0.0"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
"sphinx_rtd_theme",
"sphinx.ext.autodoc",
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinx_copybutton",
"myst_parser",
"sphinx_subfigure",
@@ -35,14 +36,12 @@
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "sphinx_rtd_theme"
# html_static_path = ["_static"]
html_theme = "sphinx_book_theme"

# replace "view page source" with "edit on github" in Read The Docs theme
# * https://github.com/readthedocs/sphinx_rtd_theme/issues/529
html_context = {
    "display_github": True,
    "github_user": "haosulab",
    "github_repo": "ManiSkill2",
    "github_version": "main/docs/source/",
}
    "github_version": "main",
    "conf_py_path": "/source/"
}
@@ -1,12 +1,13 @@
# Demonstrations
# Datasets

High-quality demonstration datasets are one of the features of ManiSkill2. Demonstrations can be used to facilitate learning-from-demonstrations approaches, e.g., [Shen et al](https://arxiv.org/pdf/2203.02107.pdf).

Most demonstrations are generated by motion planning with privileged information. Some demonstrations are generated by [model predictive control](https://en.wikipedia.org/wiki/Model_predictive_control) (MPC) or state-based Reinforcement Learning (RL) given our dense rewards.
ManiSkill has a wide variety of demonstrations from different sources, including RL, human teleoperation, and motion planning.

## Download

We provide a command line tool (`mani_skill2.utils.download_demo`) to download demonstrations from Google Drive. The full datasets are available on [Google Drive](https://drive.google.com/drive/folders/1hVdUNPGCHh0OULPCowBClPYIXSwsx-J9). Please refer to [Environments](../concepts/environments.md) for all supported environments. Please see our [notes](https://docs.google.com/document/d/1bBKmsR-R_7tR9LwaT1c3J26SjIWw27tWSLdHnfBR01c/edit?usp=sharing) about the details of the demonstrations.
We provide a command line tool to download demonstrations directly from our [Hugging Face 🤗 dataset page](https://huggingface.co/datasets/haosulab/ManiSkill2), organized by environment ID. The tool downloads the demonstration files to a folder, along with a few videos visualizing what the demonstrations look like. See [Environments](../concepts/environments.md) for a list of all supported environments.

<!-- TODO: add a table here detailing the data info in detail -->
<!-- Please see our [notes](https://docs.google.com/document/d/1bBKmsR-R_7tR9LwaT1c3J26SjIWw27tWSLdHnfBR01c/edit?usp=sharing) about the details of the demonstrations. -->

```bash
# Download the demonstration datasets for all rigid-body environments
python -m mani_skill2.utils.download_demo rigid_body -o ./demos
# Download the demonstration datasets for all soft-body environments
python -m mani_skill2.utils.download_demo soft_body
```

For those who cannot access Google Drive, the datasets can be downloaded from [ScienceDB.cn](http://doi.org/10.57760/sciencedb.02239).

## Format

All demonstrations for an environment are saved in HDF5 format. Each HDF5 dataset is named `trajectory.{obs_mode}.{control_mode}.h5`, and is associated with a JSON file with the same base name. The JSON file stores meta information. Unless otherwise specified, `trajectory.h5` is short for `trajectory.none.pd_joint_pos.h5`, which contains the original demonstrations generated by the `pd_joint_pos` controller with the `none` observation mode (empty observations). However, there may exist demonstrations generated by other controllers. **Thus, please check the associated JSON to ensure which controller is used.**

All demonstrations for an environment are saved in the HDF5 format, openable by [h5py](https://github.com/h5py/h5py). Each HDF5 dataset is named `trajectory.{obs_mode}.{control_mode}.h5` and is associated with a JSON metadata file with the same base name. Unless otherwise specified, `trajectory.h5` is short for `trajectory.none.pd_joint_pos.h5`, which contains the original demonstrations generated by the `pd_joint_pos` controller with the `none` observation mode (empty observations). However, there may exist demonstrations generated by other controllers. **Thus, please check the associated JSON to see which controller was used.**
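
A quick way to check, using the metadata fields documented below (the file name here is illustrative):

```python
import json

# Every .h5 file has a JSON metadata file with the same base name
meta = json.load(open("trajectory.none.pd_joint_pos.json"))
print(meta["episodes"][0]["control_mode"])  # e.g. "pd_joint_pos"
```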
<!--
:::{note}
For `PickSingleYCB-v0`, `TurnFaucet-v0`, the dataset is named `{model_id}.h5` for each asset. It is due to some legacy issues, and might be changed in the future.
For `OpenCabinetDoor-v1`, `OpenCabinetDrawer-v1`, `PushChair-v1`, `MoveBucket-v1`, which are migrated from [ManiSkill1](https://github.com/haosulab/ManiSkill), trajectories are generated by the RL and `base_pd_joint_vel_arm_pd_joint_vel` controller.
:::
::: -->

### Meta Information (JSON)

Each JSON file contains:

- env_info (`Dict`): environment information, which can be used to initialize the environment
- env_id: environment id
- max_episode_steps
- env_kwargs: keyword arguments to initialize the environment. **Essential to reproduce the trajectory.**
- episodes (`List[Dict]`): episode information
- `env_info` (Dict): environment information, which can be used to initialize the environment
- `env_id` (str): environment id
- `max_episode_steps` (int)
- `env_kwargs` (Dict): keyword arguments to initialize the environment. **Essential to recreate the environment.**
- `episodes` (List[Dict]): episode information

The episode information (the element of `episodes`) includes:

- episode_id: a unique id to index the episode
- reset_kwargs: keyword arguments to reset the environment. **Essential to reproduce the trajectory.**
- control_mode: control mode used for the episode.
- elapsed_steps: trajectory length
- info: information at the end of the episode.
- `episode_id` (int): a unique id to index the episode
- `reset_kwargs` (Dict): keyword arguments to reset the environment. **Essential to reproduce the trajectory.**
- `control_mode` (str): control mode used for the episode.
- `elapsed_steps` (int): trajectory length
- `info` (Dict): information at the end of the episode.

To reproduce the environment for the trajectory:
With just the metadata, you can recreate the environment the same way it was configured when the trajectories were collected:

```python
import json
import gymnasium as gym  # ManiSkill environments follow the Gymnasium API (assumed here)
import mani_skill2.envs  # registers ManiSkill environments

meta = json.load(open("trajectory.json"))  # the metadata JSON described above
env = gym.make(meta["env_info"]["env_id"], **meta["env_info"]["env_kwargs"])
episode = meta["episodes"][0]  # pick the first episode
env.reset(**episode["reset_kwargs"])
```

@@ -69,7 +68,7 @@ Each trajectory is an `h5py.Group`, which contains:
- env_init_state: [D], `np.float32`. The initial environment state. It is used for soft-body environments, since their states (particle positions) can use too much space.
- obs (optional): observations. If the observation is a `dict`, the value will be stored in `obs/{key}`. The convention is applied recursively for nested dicts.
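
A sketch of reading these fields with h5py (the `traj_0` group name is an assumption for illustration):

```python
import h5py

with h5py.File("trajectory.h5", "r") as f:
    traj = f["traj_0"]  # one h5py.Group per trajectory (group name assumed)
    env_init_state = traj["env_init_state"][:]  # [D] float32 initial state (soft-body envs)
    # recorded observations, if any, live under traj["obs/..."] following the dict keys
```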

## Usage
## Replaying/Converting Demonstration Data

To replay the demonstrations (without changing the observation mode and control mode):

@@ -85,7 +84,7 @@ python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/
:::{note}
The script requires `trajectory.h5` and `trajectory.json` to be both under the same directory.
:::

The raw demonstration files contain all the necessary information (e.g. initial states, actions, seeds) to reproduce a trajectory. Observations are not included since they can lead to large file sizes without postprocessing. In addition, actions in these files do not cover all control modes. Therefore, you need to convert our raw files into your desired observation and control modes. We provide a utility script that works as follows:
The raw demonstration files contain all the necessary information (e.g. initial states, actions, seeds) to reproduce a trajectory. Observations are not included since they can lead to large file sizes without postprocessing. In addition, actions in these files do not cover all control modes. Therefore, you need to convert the raw files into your desired observation and control modes. We provide a utility script that works as follows:

```bash
# Replay demonstrations with control_mode=pd_joint_delta_pos
# NOTE: flags below are assumed from the replay script's CLI; the path is a placeholder
python -m mani_skill2.trajectory.replay_trajectory --traj-path <path/to/trajectory.h5> \
  --save-traj --target-control-mode pd_joint_delta_pos --obs-mode none
```
7 changes: 7 additions & 0 deletions docs/source/datasets/index.md
@@ -0,0 +1,7 @@
# Datasets
```{toctree}
:titlesonly:
:glob:
*
```
1 change: 1 addition & 0 deletions docs/source/datasets/teleoperation.md
@@ -0,0 +1 @@
# Teleoperation