Measure Anything: Real-time, Multi-stage Vision-based Dimensional Measurement using Segment Anything
Measure Anything is an interactive/automated dimensional measurement tool that leverages the Segment Anything Model 2 (SAM 2) to segment objects of interest and provide real-time diameter, length, and volume measurements. Our streamlined pipeline comprises five stages: 1) segmentation, 2) binary mask processing, 3) skeleton construction, 4) line segment and depth identification, and 5) 2D-3D transformation and measurement. We envision that this pipeline can be easily adapted to other fully automated or minimally human-assisted, vision-based measurement tasks.
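For intuition, stages 2 and 3 can be reproduced in isolation with scikit-image (a dependency installed during setup below). This is a toy illustration on a synthetic mask, not the repository's code:

import numpy as np
from skimage.morphology import remove_small_objects, skeletonize

# Synthetic binary mask standing in for a SAM 2 segmentation (stage 1).
mask = np.zeros((64, 64), dtype=bool)
mask[8:56, 30:34] = True  # a thin vertical "stem"
mask[3, 3] = True         # a speckle artifact

# 2) binary mask processing: remove small artifacts.
clean = remove_small_objects(mask, min_size=32)

# 3) skeleton construction via skeletonization.
skeleton = skeletonize(clean)
print(skeleton.sum(), "skeleton pixels")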
conda create --name <environment> python=3.12
conda activate <environment>
This repository was tested with the ZED SDK 4.2.1 and CUDA 12 on Ubuntu 22.04. To set up the ZED SDK, follow these steps:
- Install dependencies
pip install cython numpy==1.26.4 opencv-python==4.9.0.80 pyopengl
- Download the ZED SDK from the official website
- Run the ZED SDK Installer
cd path/to/download/folder
sudo apt install zstd
chmod +x ZED_SDK_Ubuntu22_cuda12.1_v4.2.1.zstd.run
./ZED_SDK_Ubuntu22_cuda12.1_v4.2.1.zstd.run
- To install the Python API, answer Y to the following prompt when running the installer:
Do you want to install the Python API (recommended) [Y/n] ?
Alternatively, you can install the Python API separately by running the following script:
cd "/usr/local/zed/"
python3 get_python_api.py
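To check that the Python API imports correctly and can open a recording, a minimal sketch is shown below (the .svo path is a placeholder):

import pyzed.sl as sl

# Open an .svo recording instead of a live camera.
init_params = sl.InitParameters()
init_params.set_from_svo_file("path/to/svo/file.svo")

zed = sl.Camera()
status = zed.open(init_params)
print(status)  # sl.ERROR_CODE.SUCCESS if the SDK and API are set up correctly

image, depth = sl.Mat(), sl.Mat()
runtime = sl.RuntimeParameters()
if zed.grab(runtime) == sl.ERROR_CODE.SUCCESS:
    zed.retrieve_image(image, sl.VIEW.LEFT)        # left RGB frame
    zed.retrieve_measure(depth, sl.MEASURE.DEPTH)  # per-pixel depth map
zed.close()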
- Install PyTorch by following the instructions on the official website
- For example, to install PyTorch with CUDA 12.1:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
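To quickly verify that the installed PyTorch build can see the GPU, a one-line sanity check:

python3 -c "import torch; print(torch.cuda.is_available())"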
We use the SAM 2 API provided by Ultralytics. The SAM 2 checkpoint will be downloaded automatically the first time you run the demo script.
pip install ultralytics
pip install scikit-image scikit-learn pillow plyfile
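To sanity-check the Ultralytics installation (and trigger the automatic checkpoint download) outside the demo, a minimal snippet; the checkpoint name follows Ultralytics' SAM 2 naming:

from ultralytics import SAM

# First instantiation downloads the SAM 2 checkpoint automatically.
model = SAM("sam2_b.pt")
model.info()  # prints a model summary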
The interactive demo requires .svo files collected using a ZED stereo camera. Example .svo files can be found here. Run the demo and follow the on-screen instructions.
python interactive_demo.py --input_svo path/to/svo/file.svo --stride 10 --thin_and_long
- --thin_and_long is a flag that selects the skeleton construction method. When set, the skeleton is constructed via skeletonization (recommended for rod-like geometries).
- --stride (int) is an optional parameter that determines the distance between consecutive measurements. The default value is 10.
- Red line segments indicate valid measurements.
- Blue line segments indicate invalid measurements due to unavailable depth data.
- The calculated stem diameters are saved as a numpy file at ./output/{svo_file_name}/{frame}/diameters.npy, ordered from the bottommost to the topmost line measurement (see the snippet after this list).
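The saved measurements can be loaded directly with NumPy. The path below follows the pattern above, with placeholder names filled in for illustration:

import numpy as np

# Diameters are ordered bottom-to-top along the skeleton.
diameters = np.load("./output/example_svo/0/diameters.npy")
print(diameters.shape, diameters.mean())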
The keypoint detection weights can be specified using the --weights parameter. This enables automated point prompting for segmentation. If the automated segmentations are inaccurate, users can manually intervene to refine the results. The keypoints can be configured as positive prompts, or as a combination of positive and negative prompts, depending on the specific requirements of the video.
Run the demo and follow the on-screen instructions to interact with it.
python interactive_demo.py --input_svo path/to/svo/file.svo --weights path/to/checkpoint --stride 10 --thin_and_long
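In SAM's point-prompt convention (also used by the Ultralytics API), label 1 marks a positive (foreground) point and label 0 a negative (background) point. A sketch with placeholder coordinates and image path:

from ultralytics import SAM

model = SAM("sam2_b.pt")
# Two positive prompts (label 1) on the object, one negative (label 0) on the background.
results = model(
    "path/to/image.jpg",
    points=[[420, 310], [460, 520], [100, 80]],
    labels=[1, 1, 0],
)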
Interactive Automated Demo Examples
Measure Anything can be used to provide geometric priors for computing optimized grasping points according to a stability model. In our experiments, we use a simple stability model based on form closure and perpendicular distance from the center of mass (CoM); however, this model can be swapped for a SOTA deep learning model.
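For intuition, one ingredient of such a heuristic, the perpendicular distance from the CoM to the grasp axis, might look like the sketch below. This is an illustrative stand-in with hypothetical names, not the repository's stability model (which also accounts for form closure):

import numpy as np

def com_torque_arm(p1, p2, com):
    """Perpendicular distance from the center of mass to the grasp axis.

    p1, p2: 3D contact points defining the grasp axis; com: 3D center of mass.
    Under this simple heuristic, a smaller distance means a shorter gravity
    torque arm and hence a more stable grasp.
    """
    p1, p2, com = (np.asarray(v, dtype=float) for v in (p1, p2, com))
    axis = p2 - p1
    return np.linalg.norm(np.cross(com - p1, axis)) / np.linalg.norm(axis)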
Check interactive_demo_clubs_3d.py to run an interactive demonstration on the Clubs-3D dataset.
python interactive_demo_clubs_3d.py --input_image path/to/input/image --depth_image path/to/depth/image --sensor <sensor_name>
Robotic Grasping Demo
To prepare the Clubs-3D files for the demo, download the data from the Clubs-3D dataset and run the commands below. Use the resulting registered depth images as the depth input in the demo above.
cd clubs_dataset_python/
python register_depth_images.py --scene_folder path/to/scene/folder
python generate_point_clouds.py --scene_folder path/to/scene/folder --use_registered_depth
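The generated point clouds can be inspected with plyfile (installed earlier in the setup); the filename below is a placeholder for whatever generate_point_clouds.py writes out:

from plyfile import PlyData

# Load a generated point cloud and print its vertex count.
cloud = PlyData.read("path/to/point_cloud.ply")
vertices = cloud["vertex"].data  # numpy structured array with x, y, z fields
print(len(vertices), "points")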