linukc/BeyondBareQueries

Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph

Linok Sergey · Tatiana Zemskova · Svetlana Ladanova · Roman Titkov · Dmitry Yudin
Maxim Monastyrny · Aleksei Valenkov

Getting Started

System Requirements

Mapping requires 10 GB+ of VRAM; running the local LLM and vLLM requires 16 GB+ of VRAM.
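As a rough sanity check against the 10 GB and 16 GB figures above, the total memory reported by `nvidia-smi` can be compared with both thresholds. The sketch below is not part of the repository. The helper names are hypothetical, and the thresholds are approximated as 10,000 and 16,000 MiB, because cards usually report slightly less than their nominal size.

```python
from __future__ import annotations

import subprocess

# Approximate thresholds (assumption: nominal GB treated as 1000 MiB so that
# cards reporting slightly under their advertised size still pass).
MAPPING_MIN_MIB = 10_000
LLM_MIN_MIB = 16_000


def parse_total_memory_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def check_vram(total_mib: int) -> dict[str, bool]:
    """Report which stages a GPU with the given total memory can likely run."""
    return {
        "mapping": total_mib >= MAPPING_MIN_MIB,
        "llm": total_mib >= LLM_MIN_MIB,
    }


def query_gpus() -> list[int]:
    """Ask nvidia-smi for the total memory of every visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_total_memory_mib(result.stdout)
```

On a machine with an NVIDIA driver installed, `[check_vram(total) for total in query_gpus()]` gives one result per GPU. Total memory is only an upper bound; memory already in use by other processes is not taken into account.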

Data Preparation

Replica

Download the Replica RGB-D scan dataset using the download script in Nice-SLAM. The dataset contains trajectories rendered from the mesh models provided by the original Replica dataset.

ScanNet

For ScanNet, please follow the instructions in ScanNet.

Environment Setup

Build the Docker image and create a container:

./docker/build.sh
./docker/start.sh <path_to_data_folder>
./docker/into.sh

Install the bbq library (run this once per container):

pip install -e .

Run BBQ

Mapping

First, build the 3D scene representation. Check the config before running, then call the script inside the container:

python3 main.py --config_path=examples/configs/replica/room0.yaml #Replica
python3 main.py --config_path=examples/configs/scannet/scene0011_00.yaml #ScanNet

To visualize the construction process:

python3 main.py --config_path=examples/configs/replica/room0.yaml --save_path=output
python3 visualize/show_construction.py --animation_folder=output
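When several scenes need to be mapped, the commands above can be generated from a small wrapper. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it uses only the `--config_path` and `--save_path` flags shown above. It also checks that each config file exists before launching, as a lightweight form of "check the config before running".

```python
import subprocess
from pathlib import Path


def build_mapping_command(config_path, save_path=None):
    """Return the argv list for one mapping run of main.py."""
    command = ["python3", "main.py", f"--config_path={config_path}"]
    if save_path is not None:
        # --save_path is only needed when the construction is visualized later.
        command.append(f"--save_path={save_path}")
    return command


def run_mapping(config_paths, save_path=None):
    """Run mapping for each config in turn, stopping at the first failure."""
    for config_path in config_paths:
        if not Path(config_path).is_file():
            raise FileNotFoundError(f"Config not found: {config_path}")
        subprocess.run(build_mapping_command(config_path, save_path), check=True)
```

For example, `run_mapping(["examples/configs/replica/room0.yaml", "examples/configs/scannet/scene0011_00.yaml"])` would run the two mapping commands listed above one after the other, assuming it is called from the repository root inside the container.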

Object Grounding

Llama3-8B

Set up Meta-Llama-3-8B-Instruct according to the docs.

# Llama3-8B
python3 query.py --scene_file=examples/scenes/replica/room0.json --model_path=<your_path>/Meta-Llama-3-8B-Instruct #Replica
python3 query.py --scene_file=examples/scenes/scannet/scene0011_00.json --model_path=<your_path>/Meta-Llama-3-8B-Instruct #ScanNet
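The two query commands differ only in the dataset and scene name, so they can be assembled programmatically. The sketch below is an illustration rather than repository code: the function names are hypothetical, and the `examples/scenes/<dataset>/<scene>.json` layout is an assumption inferred from the two paths shown above.

```python
from pathlib import PurePosixPath


def scene_file_for(dataset, scene):
    """Build the scene file path, assuming examples/scenes/<dataset>/<scene>.json."""
    return str(PurePosixPath("examples/scenes") / dataset / f"{scene}.json")


def build_query_command(dataset, scene, model_dir):
    """Return the argv list for one grounding run of query.py."""
    return [
        "python3",
        "query.py",
        f"--scene_file={scene_file_for(dataset, scene)}",
        # model_dir is the local Meta-Llama-3-8B-Instruct directory
        # (<your_path>/Meta-Llama-3-8B-Instruct in the commands above).
        f"--model_path={model_dir}",
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the repository root inside the container.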

Acknowledgement

Our work builds on the codebase of the following paper: ConceptGraphs.

Citation

If you find this work helpful, please consider citing it as:

@misc{linok2024barequeriesopenvocabularyobject,
      title={Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph}, 
      author={Sergey Linok and Tatiana Zemskova and Svetlana Ladanova and Roman Titkov and Dmitry Yudin and Maxim Monastyrny and Aleksei Valenkov},
      year={2024},
      eprint={2406.07113},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2406.07113}, 
}

Contact

Please create an issue on this repository for questions, comments, and bug reports. For other inquiries, send an email to Linok Sergey.
