This repository contains the official codebase for Structure from Silence: Learning Scene Structure from Ambient Sound. [Project Page]
To set up the environment, simply run:

```bash
conda env create -f environment.yml
conda activate SFS
```
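Optionally, you can sanity-check the freshly activated environment. That `environment.yml` ships PyTorch with CUDA support is an assumption here, not something the file is guaranteed to contain:

```python
# Quick sanity check after `conda activate SFS`.
# Assumes environment.yml installs PyTorch (an assumption, not verified here).
import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```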
We collected **Static recording** data in 46 classrooms across 12 buildings on the University of Michigan's campus, amounting to approximately 200 minutes of audio. Inside each classroom, we selected 16–30 positions and recorded 10 seconds of audio at each one. The camera and microphone were pointed toward the nearest wall during recording so that the distance is well defined. Our data can be downloaded from Here. You can download the processed dataset by running:

```bash
cd Dataset/SFS-Static
sh ./download_static.sh
```
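To sanity-check the download, here is a minimal sketch for loading one clip. The `*.wav` layout under `Dataset/SFS-Static` is an assumption about the processed dataset; adjust the glob pattern to whatever `download_static.sh` actually produces.

```python
import glob

import torchaudio  # assumed to be available in the SFS environment

# Hypothetical layout; adjust the pattern to what download_static.sh produces.
wav_paths = sorted(glob.glob("Dataset/SFS-Static/**/*.wav", recursive=True))
print(f"found {len(wav_paths)} clips")

if wav_paths:
    waveform, sr = torchaudio.load(wav_paths[0])  # (channels, samples)
    print(f"shape={tuple(waveform.shape)}, sr={sr}, "
          f"duration={waveform.shape[1] / sr:.1f}s")  # should be ~10 s per clip
```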
We collected **Motion recording** data consisting of approximately 90 minutes of video recorded in motion (222 videos in total). During recording, the microphone and RGB-D camera move toward or away from a wall. Our data can be downloaded from Here. You can download the processed dataset by running:

```bash
cd Dataset/SFS-Motion
sh ./download_motion.sh
```
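Similarly, a quick way to verify the motion data is to total up the audio durations. Again, the file layout below is an assumption about the processed dataset, not a documented guarantee:

```python
import glob

import torchaudio  # assumed to be available in the SFS environment

# Hypothetical layout; adjust the pattern to what download_motion.sh produces.
paths = sorted(glob.glob("Dataset/SFS-Motion/**/*.wav", recursive=True))

total_seconds = 0.0
for path in paths:
    info = torchaudio.info(path)  # reads metadata only, no full decode
    total_seconds += info.num_frames / info.sample_rate

# The collection totals roughly 90 minutes across 222 videos.
print(f"{len(paths)} recordings, {total_seconds / 60:.1f} minutes of audio")
```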
We release several models trained with our proposed methods on our collected dataset. We hope they will benefit the research community.
| Model | Dataset | URL |
|---|---|---|
| static-obstacle-detection | Static Recording | url |
| static-relative-depth-order | Motion Recording | url |
| motion-obstacle-detection | Motion Recording | url |
| motion-relative-depth-order | Motion Recording | url |
| AV-order-pretext | Motion Recording | url |
To download all the checkpoints, simply run:

```bash
sh ./scripts/download_models.sh
```
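If you want to poke at a downloaded checkpoint before wiring it into a script, plain PyTorch is enough. The `'state_dict'` key is a common convention and only an assumption here; inspect the printed keys first. The path below is the one used by the linear-probing command later in this README:

```python
import torch

# Path taken from the linear-probing command below; other checkpoints are analogous.
ckpt = torch.load("pretrained-models/motion-avorder.pth.tar", map_location="cpu")

# .pth.tar files are usually plain dicts; inspect the keys before assuming a layout.
print(list(ckpt.keys()))

# Common convention (an assumption here): weights live under 'state_dict'.
state_dict = ckpt.get("state_dict", ckpt)
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```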
We provide training and evaluation scripts under `scripts`. Please check each bash file and run `chmod +x xx.sh` before executing it.
- To train the obstacle detection model on Static recordings, run:

  ```bash
  ./scripts/training/train-static-obstacle.sh
  ```

- To train the relative depth order model on Static recordings, run:

  ```bash
  ./scripts/training/train-static-relative-depth.sh
  ```

- To train the obstacle detection model on Motion recordings, run:

  ```bash
  ./scripts/training/train-motion-obstacle.sh
  ```

- To train the relative depth order model on Motion recordings, run:

  ```bash
  ./scripts/training/train-motion-relative-depth.sh
  ```

- To train the AV-Order model on Motion recordings, run:

  ```bash
  ./scripts/training/train-motion-avorder.sh
  ```
- To perform a linear probing experiment on a downstream task, here is an example for the relative depth order task:

  ```bash
  CUDA_VISIBLE_DEVICES=0 python main_lincls.py --exp='Audio-LinCLS-relative-depth' \
      --epochs=40 --setting='av_lincls_RD' --input='audio' \
      --batch_size=256 --num_workers=8 --save_step=1 --valid_step=1 --lr=0.01 \
      --optim='SGD' --repeat=50 --schedule='cos' --aug_wave --freeze \
      --resume='pretrained-models/motion-avorder.pth.tar'
  ```
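For context, `--freeze` together with `main_lincls.py` corresponds to standard linear probing: the pretrained encoder stays frozen and only a linear classifier is trained on its features. Below is a minimal, generic PyTorch sketch of that procedure; `encoder`, `feat_dim`, and `loader` are placeholders, not classes from this repo:

```python
import torch
import torch.nn as nn

def linear_probe(encoder: nn.Module, feat_dim: int, num_classes: int, loader):
    """Train a linear head on frozen features (generic sketch, not repo code)."""
    # Freeze the pretrained encoder: no gradients, eval-mode statistics.
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad = False

    head = nn.Linear(feat_dim, num_classes)
    optim = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    for audio, labels in loader:
        with torch.no_grad():  # features come from the frozen encoder
            feats = encoder(audio)
        loss = loss_fn(head(feats), labels)
        optim.zero_grad()
        loss.backward()
        optim.step()
    return head
```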
All test scripts should be run under the parent path, and you can change the checkpoint in each bash file.

- To evaluate our `static-obstacle-detection` model, simply run `./scripts/test/test-static-obstacle.sh`.
- To evaluate our `static-relative-depth-order` model, simply run `./scripts/test/test-static-relative-depth.sh`.
- To evaluate our `motion-obstacle-detection` model, simply run `./scripts/test/test-motion-obstacle.sh`.
- To evaluate our `motion-relative-depth-order` model, simply run `./scripts/test/test-motion-relative-depth.sh`.
If you find our project useful, please consider citing:

```bibtex
@inproceedings{chen2021structure,
  title={Structure from Silence: Learning Scene Structure from Ambient Sound},
  author={Ziyang Chen and Xixi Hu and Andrew Owens},
  booktitle={5th Annual Conference on Robot Learning},
  year={2021},
  url={https://openreview.net/forum?id=ht3aHpc1hUt}
}
```
This work was funded in part by DARPA Semafor and Cisco Systems. The views, opinions and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.