Add Camera Benchmark Tool and Allow Correct Unprojection of distance_to_camera depth image #976

Merged

merged 54 commits on Sep 25, 2024

Changes from 47 commits

Commits
7391a79
add method to convert distance_to_camera data to distance_to_image_pl…
glvov-bdai Sep 10, 2024
e3d2889
first draft
glvov-bdai Sep 10, 2024
bc8b72a
small fixes
glvov-bdai Sep 10, 2024
6e5e743
small fixes to meet checklists: fix docstring, and add to contributors
glvov-bdai Sep 10, 2024
bcd2acf
Merge branch 'main' of https://github.com/glvov-bdai/IsaacLab into fe…
glvov-bdai Sep 10, 2024
0c79677
change docstring to be consistent
glvov-bdai Sep 10, 2024
247add6
formatting
glvov-bdai Sep 11, 2024
95b744e
add comment
glvov-bdai Sep 11, 2024
44875e2
add change to log
glvov-bdai Sep 11, 2024
9c66b6c
include changelog and docstrings
glvov-bdai Sep 11, 2024
86483cf
spelling
glvov-bdai Sep 11, 2024
21819e1
some more details
glvov-bdai Sep 11, 2024
134ba01
pull main into fork
glvov-bdai Sep 11, 2024
ba62753
Update docs/source/how-to/estimate_how_many_cameras_can_run.rst
glvov-bdai Sep 17, 2024
1995022
Update source/extensions/omni.isaac.lab/docs/CHANGELOG.rst
glvov-bdai Sep 17, 2024
08624d9
Update source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
glvov-bdai Sep 17, 2024
216ad85
Update source/standalone/tutorials/04_sensors/benchmark_cameras.py
glvov-bdai Sep 17, 2024
63d47a6
Update source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
glvov-bdai Sep 17, 2024
8c3ccb3
first restructuring
glvov-bdai Sep 17, 2024
9fd0aa4
remove reps from doc string
glvov-bdai Sep 17, 2024
b3d9103
shorted conversion method name
glvov-bdai Sep 17, 2024
c4b627e
tiny docstring tweak
glvov-bdai Sep 17, 2024
2ad7835
add warning about more than one camera type at once
glvov-bdai Sep 17, 2024
c0094b7
add whitespace for list to render correctly
glvov-bdai Sep 17, 2024
4f7db7f
Merge branch 'main' into feature/tiled_camera_examples
glvov-bdai Sep 17, 2024
1187c6c
Allow preserving last single dim
glvov-bdai Sep 17, 2024
6ebf5ff
Update docs/source/how-to/estimate_how_many_cameras_can_run.rst
glvov-bdai Sep 18, 2024
24a184a
Update source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
glvov-bdai Sep 18, 2024
f3932df
Update source/extensions/omni.isaac.lab/docs/CHANGELOG.rst
glvov-bdai Sep 18, 2024
cc9224a
Update docs/source/how-to/estimate_how_many_cameras_can_run.rst
glvov-bdai Sep 18, 2024
ebc2baf
Update source/standalone/tutorials/04_sensors/benchmark_cameras.py
glvov-bdai Sep 18, 2024
1c1b41c
Update docs/source/how-to/estimate_how_many_cameras_can_run.rst
glvov-bdai Sep 18, 2024
69b5b79
Update docs/source/how-to/estimate_how_many_cameras_can_run.rst
glvov-bdai Sep 18, 2024
773ef65
changelog clarity
glvov-bdai Sep 18, 2024
18aa427
remove list of replicator types from benchmark
glvov-bdai Sep 18, 2024
3065c80
change raycaster default
glvov-bdai Sep 18, 2024
3c8e43f
remove viz
glvov-bdai Sep 18, 2024
a3c410a
add injection into scene
glvov-bdai Sep 18, 2024
a44f15d
load scene
glvov-bdai Sep 19, 2024
c47ebb5
fix autotune; update docs ;)
garylvov Sep 20, 2024
97ab5a6
CI
glvov-bdai Sep 20, 2024
f001216
Merge branch 'main' into feature/tiled_camera_examples
glvov-bdai Sep 20, 2024
9bef60c
Merge branch 'main' into feature/tiled_camera_examples
glvov-bdai Sep 20, 2024
f51a1cc
Update benchmark_cameras.py for 1.2
garylvov Sep 21, 2024
3649952
Merge branch 'isaac-sim:main' into feature/tiled_camera_examples
glvov-bdai Sep 23, 2024
712e3bd
formatting
glvov-bdai Sep 23, 2024
e98514f
Merge branch 'isaac-sim:main' into feature/tiled_camera_examples
glvov-bdai Sep 23, 2024
17c74f3
Update source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
glvov-bdai Sep 24, 2024
4f68566
Update source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
glvov-bdai Sep 24, 2024
3e749e6
Merge branch 'main' into feature/tiled_camera_examples
glvov-bdai Sep 24, 2024
fbfb191
formatting
glvov-bdai Sep 25, 2024
5ec0967
Merge branch 'main' into feature/tiled_camera_examples
glvov-bdai Sep 25, 2024
0e62b8b
get rid of docs warning
glvov-bdai Sep 25, 2024
eb0de81
Merge branch 'feature/tiled_camera_examples' of https://github.com/gl…
glvov-bdai Sep 25, 2024
1 change: 1 addition & 0 deletions CONTRIBUTORS.md
@@ -40,6 +40,7 @@ Guidelines for modifications:
* Calvin Yu
* Chenyu Yang
* David Yang
* Gary Lvov
* HoJin Jeon
* Jean Tampon
* Jia Lin Yuan
119 changes: 119 additions & 0 deletions docs/source/how-to/estimate_how_many_cameras_can_run.rst
@@ -0,0 +1,119 @@
.. _how-to-estimate-how-cameras-can-run:


Find How Many/What Cameras You Should Train With
================================================

.. currentmodule:: omni.isaac.lab

Currently in Isaac Lab, there are several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster Cameras. These camera types differ in functionality and performance. The ``benchmark_cameras.py``
script can be used to understand the differences between camera types, as well as to characterize their relative
performance under different parameters such as camera quantity, image dimensions, and data types.

This utility is provided so that one can easily find the camera type and parameters that are the most performant
while meeting the requirements of the user's scenario. It also helps estimate
the maximum number of cameras one can realistically run, assuming that one wants to maximize the number
of environments while minimizing step time.

This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. In addition,
if you install ``pynvml``, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).

This guide accompanies the ``benchmark_cameras.py`` script in the ``IsaacLab/source/standalone/tutorials/04_sensors``
directory.

.. dropdown:: Code for benchmark_cameras.py
:icon: code

.. literalinclude:: ../../../source/standalone/tutorials/04_sensors/benchmark_cameras.py
:language: python
:linenos:


Possible Parameters
-------------------

First, run

.. code-block:: bash

./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py -h

to see all possible parameters you can vary with this utility.


See the command line parameters related to ``autotune`` for more information about
automatically determining maximum camera count.


Compare Performance in Task Environments and Automatically Determine Task Max Camera Count
------------------------------------------------------------------------------------------

Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.

For example, to see how your system could handle 100 tiled cameras in
the cartpole environment, with 2 cameras per environment (so 50 environments total),
in RGB mode only, run:

.. code-block:: bash

   ./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
   --task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
   --task_num_cameras_per_env 2 \
   --tiled_camera_data_types rgb

If you have ``pynvml`` installed (``./isaaclab.sh -p -m pip install pynvml``), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:

.. code-block:: bash

   ./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
   --task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
   --task_num_cameras_per_env 2 \
   --tiled_camera_data_types rgb --autotune \
   --autotune_max_percentage_util 100 80 50 50
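
Under the hood, the autotuner relies on ``pynvml`` to read GPU utilization. The snippet below is a
minimal sketch of the kind of query involved (an illustrative approximation, not the script's actual
implementation; the 50% thresholds simply mirror the example above):

.. code-block:: python

   import pynvml

   pynvml.nvmlInit()
   handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

   # Percentage of time the GPU was busy over the last sampling interval
   gpu_util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu

   # Percentage of GPU memory currently in use
   mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
   mem_util = 100.0 * mem.used / mem.total

   # Keep increasing the camera count only while both stay under budget
   within_budget = gpu_util < 50.0 and mem_util < 50.0

   pynvml.nvmlShutdown()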

Autotuning may lead to the program crashing, which means that it tried to run too many cameras at once.
However, the maximum-percentage-utilization parameters are meant to prevent this from happening.

The output of the benchmark does not include the overhead of training a network, so consider
decreasing the maximum utilization percentages to account for this overhead. The final output camera
count is for all cameras combined, so to get the total number of environments, divide the output camera count
by the number of cameras per environment (for example, 100 cameras at 2 cameras per environment
corresponds to 50 environments).


Compare Camera Type and Performance (Without a Specified Task)
--------------------------------------------------------------

This tool can also assess performance without a task environment.
For example, to view 100 random objects with 2 standard cameras, one could run:

.. code-block:: bash

./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100

If your system cannot handle the requested cameras, the process will be killed for performance reasons.
It is recommended to monitor CPU/RAM utilization and GPU utilization while running this script to get
an idea of how many resources rendering the desired cameras requires. On Ubuntu, you can use tools like ``htop`` and ``nvtop``
to monitor resources live while running this script; on Windows, you can use the Task Manager.

If your system has a hard time handling the desired cameras, you can try the following:

- Switch to headless mode (supply ``--headless``)
- Ensure you are using the GPU pipeline, not the CPU pipeline
- If you aren't using Tiled Cameras, switch to Tiled Cameras
- Decrease camera resolution
- Decrease the number of data types requested for each camera
- Decrease the number of cameras
- Decrease the number of objects in the scene

If your system is able to handle the requested number of cameras, the timing statistics will be printed to the terminal.
After the simulation stops, it can be closed with CTRL+C.
11 changes: 11 additions & 0 deletions docs/source/how-to/index.rst
@@ -46,6 +46,17 @@ This guide explains how to save the camera output in Isaac Lab.

save_camera_output

Estimate How Many Cameras Can Run On Your Machine
-------------------------------------------------

This guide demonstrates how to estimate the number of cameras that can run on your machine under the desired parameters.

.. toctree::
:maxdepth: 1

estimate_how_many_cameras_can_run


Drawing Markers
---------------

12 changes: 12 additions & 0 deletions source/extensions/omni.isaac.lab/docs/CHANGELOG.rst
@@ -2,6 +2,18 @@ Changelog
---------


0.24.14 (2024-09-20)
~~~~~~~~~~~~~~~~~~~~

Reviewer comment (Contributor): FYI: You need to update BOTH extension.toml and CHANGELOG.rst

Added
^^^^^

* Added :meth:`convert_perspective_depth_to_orthogonal_depth`. :meth:`unproject_depth` assumes
that the input depth image is orthogonal. The new :meth:`convert_perspective_depth_to_orthogonal_depth`
can be used to convert a perspective depth image into an orthogonal depth image, so that the point cloud
can be unprojected correctly with :meth:`unproject_depth`.
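
For reference, the conversion implemented by the new method can be read off its implementation in
``math.py`` below: for a pixel :math:`(u, v)` with focal lengths :math:`f_x, f_y` and principal
point :math:`(c_x, c_y)`,

.. math::

   d_{\text{orthogonal}}(u, v) = \frac{d_{\text{perspective}}(u, v)}{\sqrt{1 + \left(\frac{u - c_x}{f_x}\right)^2 + \left(\frac{v - c_y}{f_y}\right)^2}}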


0.24.13 (2024-09-08)
~~~~~~~~~~~~~~~~~~~~

106 changes: 105 additions & 1 deletion source/extensions/omni.isaac.lab/omni/isaac/lab/utils/math.py
@@ -988,7 +988,12 @@ def transform_points(

@torch.jit.script
def unproject_depth(depth: torch.Tensor, intrinsics: torch.Tensor) -> torch.Tensor:
r"""Unproject depth image into a pointcloud.
r"""Unproject depth image into a pointcloud. This method assumes that depth
Reviewer comment (Contributor): Please follow google doc-style. The first line here is a one-line summary. Everything else moves to a new para.

is provided orthogonally relative to the image plane, as opposed to absolutely relative to the camera's
principal point (perspective depth). To unproject a perspective depth image, use
:meth:`convert_perspective_depth_to_orthogonal_depth` to convert
to an orthogonal depth image prior to calling this method, as otherwise the
created point cloud will be distorted, especially around the edges.

This function converts depth images into points given the calibration matrix of the camera.

@@ -1059,6 +1064,105 @@ def unproject_depth(depth: torch.Tensor, intrinsics: torch.Tensor) -> torch.Tensor:
return points_xyz


@torch.jit.script
def convert_perspective_depth_to_orthogonal_depth(
perspective_depth: torch.Tensor, intrinsics: torch.Tensor
) -> torch.Tensor:
r"""Provided depth image(s) where depth is provided as the distance to the principal
Reviewer comment (Contributor): r""" is only used when you have math equations. In all other cases, please resort to double ticks.

point of the camera (perspective depth), this function converts it so that depth
is provided as the distance to the camera's image plane (orthogonal depth).

This is helpful because `unproject_depth` assumes that depth is expressed in
the orthogonal depth format.

If `perspective_depth` is a batch of depth images and `intrinsics` is a single intrinsic matrix,
the same calibration matrix is applied to all depth images in the batch.

The function assumes that the width and height are both greater than 1.

Args:
perspective_depth: The depth measurement obtained with the distance_to_camera replicator.
Reviewer comment (Contributor): Since it is a math util, any mention of replicator does not make sense.

Shape is (H, W) or (H, W, 1) or (N, H, W) or (N, H, W, 1).
intrinsics: A tensor providing camera's calibration matrix. Shape is (3, 3) or (N, 3, 3).

Returns:
The depth image as if obtained by the distance_to_image_plane replicator. Shape
matches the input shape of depth.

Raises:
ValueError: When depth is not of shape (H, W) or (H, W, 1) or (N, H, W) or (N, H, W, 1).
ValueError: When intrinsics is not of shape (3, 3) or (N, 3, 3).
"""

# Clone inputs to avoid in-place modifications
perspective_depth_batch = perspective_depth.clone()
intrinsics_batch = intrinsics.clone()

# Check if inputs are batched
is_batched = perspective_depth_batch.dim() == 4 or (
perspective_depth_batch.dim() == 3 and perspective_depth_batch.shape[-1] != 1
)

# Track whether the last dimension was singleton
add_last_dim = False
if perspective_depth_batch.dim() == 4 and perspective_depth_batch.shape[-1] == 1:
add_last_dim = True
perspective_depth_batch = perspective_depth_batch.squeeze(dim=3) # (N, H, W, 1) -> (N, H, W)
if perspective_depth_batch.dim() == 3 and perspective_depth_batch.shape[-1] == 1:
add_last_dim = True
perspective_depth_batch = perspective_depth_batch.squeeze(dim=2) # (H, W, 1) -> (H, W)

if perspective_depth_batch.dim() == 2:
perspective_depth_batch = perspective_depth_batch[None] # (H, W) -> (1, H, W)

if intrinsics_batch.dim() == 2:
intrinsics_batch = intrinsics_batch[None] # (3, 3) -> (1, 3, 3)

if is_batched and intrinsics_batch.shape[0] == 1:
intrinsics_batch = intrinsics_batch.expand(perspective_depth_batch.shape[0], -1, -1) # (1, 3, 3) -> (N, 3, 3)

# Validate input shapes
if perspective_depth_batch.dim() != 3:
raise ValueError(f"Expected perspective_depth to have 2, 3, or 4 dimensions; got {perspective_depth.shape}.")
if intrinsics_batch.dim() != 3:
raise ValueError(f"Expected intrinsics to have shape (3, 3) or (N, 3, 3); got {intrinsics.shape}.")

# Image dimensions
im_height, im_width = perspective_depth_batch.shape[1:]

# Get the intrinsics parameters
fx = intrinsics_batch[:, 0, 0].view(-1, 1, 1)
fy = intrinsics_batch[:, 1, 1].view(-1, 1, 1)
cx = intrinsics_batch[:, 0, 2].view(-1, 1, 1)
cy = intrinsics_batch[:, 1, 2].view(-1, 1, 1)

# Create meshgrid of pixel coordinates
u_grid = torch.arange(im_width, device=perspective_depth.device, dtype=perspective_depth.dtype)
v_grid = torch.arange(im_height, device=perspective_depth.device, dtype=perspective_depth.dtype)
u_grid, v_grid = torch.meshgrid(u_grid, v_grid, indexing="xy")

# Expand the grids for batch processing
u_grid = u_grid.unsqueeze(0).expand(perspective_depth_batch.shape[0], -1, -1)
v_grid = v_grid.unsqueeze(0).expand(perspective_depth_batch.shape[0], -1, -1)

# Compute the squared terms for efficiency
x_term = ((u_grid - cx) / fx) ** 2
y_term = ((v_grid - cy) / fy) ** 2

# Calculate the orthogonal (normal) depth
normal_depth = perspective_depth_batch / torch.sqrt(1 + x_term + y_term)

# Restore the last dimension if it was present in the input
if add_last_dim:
normal_depth = normal_depth.unsqueeze(-1)

# Return to original shape if input was not batched
if not is_batched:
normal_depth = normal_depth.squeeze(0)

return normal_depth
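
For illustration, a minimal usage sketch combining the new helper with :meth:`unproject_depth` might
look like the following (the image resolution and intrinsics values here are hypothetical):

import torch

import omni.isaac.lab.utils.math as math_utils

# Hypothetical perspective ("distance_to_camera") depth image and pinhole intrinsics.
perspective_depth = torch.full((480, 640), 2.0)  # (H, W), constant 2 m depth
intrinsics = torch.tensor([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])

# Convert to orthogonal ("distance_to_image_plane") depth before unprojecting.
orthogonal_depth = math_utils.convert_perspective_depth_to_orthogonal_depth(perspective_depth, intrinsics)

# Unprojecting the converted depth yields an undistorted point cloud (one 3D point per pixel).
points_xyz = math_utils.unproject_depth(orthogonal_depth, intrinsics)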


@torch.jit.script
def project_points(points: torch.Tensor, intrinsics: torch.Tensor) -> torch.Tensor:
r"""Projects 3D points into 2D image plane.
18 changes: 18 additions & 0 deletions source/extensions/omni.isaac.lab/test/utils/test_math.py
@@ -376,6 +376,24 @@ def iter_old_quat_rotate_inverse(q: torch.Tensor, v: torch.Tens
iter_old_quat_rotate_inverse(q_rand, v_rand),
)

def test_depth_perspective_conversion(self):
Reviewer comment (Contributor): Please always add a docstring for a test so the description also is visible when you run the test.

# Create a sample perspective depth image (N, H, W)
perspective_depth = torch.tensor([[[10.0, 0.0, 100.0], [0.0, 3000.0, 0.0], [100.0, 0.0, 100.0]]])

# Create sample intrinsic matrix (3, 3)
intrinsics = torch.tensor([[500.0, 0.0, 5.0], [0.0, 500.0, 5.0], [0.0, 0.0, 1.0]])

# Convert perspective depth to orthogonal depth
orthogonal_depth = math_utils.convert_perspective_depth_to_orthogonal_depth(perspective_depth, intrinsics)

# Manually compute expected orthogonal depth based on the formula for comparison
expected_orthogonal_depth = torch.tensor(
[[[9.9990, 0.0000, 99.9932], [0.0000, 2999.8079, 0.0000], [99.9932, 0.0000, 99.9964]]]
)

# Assert that the output is close to the expected result
torch.testing.assert_close(orthogonal_depth, expected_orthogonal_depth)


if __name__ == "__main__":
run_tests()