📦 Datasets

3D Point Cloud

Training Data

  • ULIP-ShapeNet Triplets: follow ULIP to download and prepare the data.
  • ULIP2-Objaverse Triplets: follow ULIP to download and prepare the data.
  • OpenShape Triplets: follow OpenShape to download and prepare the data.

Downstream Data

  • ModelNet40: we prepare two versions: one following ULIP (used when training models on ULIP-ShapeNet Triplets and ULIP2-Objaverse Triplets) and one following OpenShape (used when training models on OpenShape Triplets).
  • ScanObjectNN: follow OpenShape to download and prepare the data.
  • Objaverse LVIS: follow OpenShape to download and prepare the data.

We provide metadata for 3D point cloud datasets in vitlens/src/open_clip/modal_3d/data.

Depth

Training Data

  • SUN-RGBD: We use the SUN-RGBD (train split) for training. Download the data from this website through this link.

Downstream Data

  • SUN-RGBD: We use the SUN-RGBD (test split) for testing. Download the data from this website through this link.
  • NYUv2: We use the NYUv2 (test split) for testing. Download the data from this website through this link. Use the NYU data in the downloaded dataset.

Note that we follow ImageBind to convert depth to disparity for model input. Please refer to this piece of code. We also provide a copy of processed data here.
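The conversion rests on the standard stereo relation disparity = baseline × focal_length / depth. A minimal sketch of the idea, assuming an illustrative baseline and clipping range (these constants are placeholders, not the values used by ImageBind; refer to the linked code for the exact processing):

```python
import numpy as np

def depth_to_disparity(depth_m, focal_px, baseline_m=0.075, max_disparity=10.0):
    """Convert a metric depth map to disparity via baseline * focal / depth.

    baseline_m and max_disparity are illustrative assumptions here;
    use the constants from the referenced ImageBind code for real training.
    """
    depth = np.asarray(depth_m, dtype=np.float32)
    # Guard against division by zero for invalid (zero) depth pixels.
    disparity = baseline_m * focal_px / np.clip(depth, 1e-6, None)
    # Clip to a fixed range so the model sees bounded inputs.
    return np.clip(disparity, 0.0, max_disparity)
```

Nearer surfaces (smaller depth) map to larger disparity, which is why the clipping bound matters for very close pixels.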

**Disclaimer**

Users of this data are required to adhere to the usage and distribution policies outlined by the original dataset providers/hosts. Any usage or distribution of this data should comply with the terms and conditions set forth by the original dataset providers/hosts. The creators of this open-source project shall not be held responsible for any misuse or violation of the original dataset providers'/hosts' terms and conditions. Users are advised to review and comply with the terms of use and licensing agreements provided by the original dataset providers/hosts before utilizing this data for any purpose. See https://rgbd.cs.princeton.edu/.

We provide metadata for RGBD/Depth datasets in vitlens/src/open_clip/modal_depth/data.

Audio

Training Data

  • AudioSet: We use the training splits of AudioSet for training. We download the data according to the metadata provided on the official website. Since some videos are no longer available, we could not obtain all the videos listed. We list the videos used in our experiments in vitlens/src/open_clip/modal_audio/data/audioset_*.json.
  • VGGSound: In our later experiments, we combine VGGSound (train split) and AudioSet for training. We download the data according to the metadata provided on this page. As with AudioSet, some videos are no longer available, so we could not obtain all the videos listed. We list the videos used in our experiments in vitlens/src/open_clip/modal_audio/data/vggsound_*.json.
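Because the obtainable subset differs from the official lists, one typically filters the metadata down to the clips actually on disk before training. A minimal sketch, assuming (hypothetically) that each JSON file holds a list of records keyed by "video_id" and that audio is stored as `<video_id>.flac` — check the real schema of the files in vitlens/src/open_clip/modal_audio/data before adapting this:

```python
import json
import pathlib

def filter_available_clips(meta_json_path, audio_dir, suffix=".flac"):
    """Keep only metadata entries whose audio file was actually downloaded.

    Hypothetical schema: a JSON list of dicts with a "video_id" key, and
    audio files named <video_id><suffix>; adjust to the real file layout.
    """
    with open(meta_json_path) as f:
        entries = json.load(f)
    # Set of clip IDs present on disk, derived from file stems.
    available = {p.stem for p in pathlib.Path(audio_dir).glob(f"*{suffix}")}
    return [e for e in entries if e["video_id"] in available]
```

The same pattern applies to both the audioset_*.json and vggsound_*.json lists.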

Downstream Data

  • AudioSet: We use the val split for testing. We list the videos used in our experiments in audioset_val.json.
  • VGGSound: We use the val split for testing. We list the videos used in our experiments in vggsound_audio-only_val.json. The audio data could be obtained from this link provided by ONE-PEACE.
  • ESC: We use all the ESC data for testing. We download the data through this link provided by ONE-PEACE.
  • Clotho: We use the eval/val split for testing. We download the data through this link provided by ONE-PEACE.
  • AudioCaps: We use the test split provided in this link. We download the data through this link provided by ONE-PEACE and use the data listed in the downloaded split.

We provide metadata for audio datasets in vitlens/src/open_clip/modal_audio/data.

Tactile

Training Data

  • Touch-and-Go: We use the train split of the Touch-and-Go dataset for training. We download the data following the official website.

Downstream Data

  • Touch-and-Go: We use the test-material, test-hard/soft, and test-rough/smooth splits for testing. We download the data following the official website.

We provide metadata for tactile datasets in vitlens/src/open_clip/modal_tactile/data.

EEG

Training Data

  • ImageNet EEG: We use the train split of the ImageNet EEG dataset for training. We follow this website to download the EEG data, and download the corresponding images following this page.

Downstream Data

  • ImageNet EEG: We use the val/test splits of ImageNet EEG for testing. The data are obtained in the same way as the training split.

For the splits used in our experiments, please refer to datasets.py.