
LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors

Paper | Project Page | Video

Official implementation of LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors

Hanyang Yu, Xiaoxiao Long and Ping Tan.

Abstract: We aim to address sparse-view reconstruction of a 3D scene by leveraging priors from large-scale vision models. While recent advancements such as 3D Gaussian Splatting (3DGS) have demonstrated remarkable success in 3D reconstruction, these methods typically necessitate hundreds of input images that densely capture the underlying scene, making them time-consuming and impractical for real-world applications. However, sparse-view reconstruction is inherently ill-posed and under-constrained, often resulting in inferior and incomplete outcomes. This is due to issues such as failed initialization, overfitting to input images, and a lack of detail. To mitigate these challenges, we introduce LM-Gaussian, a method capable of generating high-quality reconstructions from a limited number of images. Specifically, we propose a robust initialization module that leverages stereo priors to aid in the recovery of camera poses and the reliable initialization of point clouds. Additionally, a diffusion-based refinement is iteratively applied to incorporate image diffusion priors into the Gaussian optimization process to preserve intricate scene details. Finally, we utilize video diffusion priors to further enhance the rendered images for realistic visual effects. Overall, our approach significantly reduces the data acquisition requirements compared to previous 3DGS methods. We validate the effectiveness of our framework through experiments on various public datasets, demonstrating its potential for high-quality 360-degree scene reconstruction.

Method

Our method takes unposed sparse images as inputs. For example, we select 8 images from the Horse Scene to cover a 360-degree view. Initially, we utilize a Background-Aware Depth-guided Initialization Module to generate dense point clouds and camera poses (see Section IV-B). These variables act as the initialization for the Gaussian kernels. Subsequently, in the Multi-modal Regularized Gaussian Reconstruction Module (see Section IV-C), we collectively optimize the Gaussian network through depth, normal, and virtual-view regularizations. After this stage, we train a Gaussian Repair model capable of enhancing Gaussian-rendered new view images. These improved images serve as guides for the training network, iteratively restoring Gaussian details (see Section IV-D). Finally, we employ a scene enhancement module to further enhance the rendered images for realistic visual effects (see Section IV-E).

TODO List

  • Support 2D-GS
  • Support Scaffold-GS
  • Add incremental test-pose alignment module
  • Support controlnet-tile-sdxl-1.0

🚀 Setup

CUDA

LM-Gaussian is tested with CUDA 11.8.

Cloning the Repository

  1. Clone LM-Gaussian and download relevant models:
    git clone https://github.com/hanyangyu1021/LMGaussian.git --recursive
  2. Create the environment: LM-Gaussian is tested on Python 3.10.12. The dependencies are listed in requirements.txt; you can create a conda environment with
    conda env create --file environment.yml
        

Get Monocular Depth/Normal Maps

Put your unposed sparse images in the './data/{dataset_name}/train/images/' folder.

Download the Marigold depth and normal checkpoints to ./Marigold/checkpoint/marigold-depth-lcm-v1-0/ and ./Marigold/checkpoint/marigold-normals-lcm-v0-1/, then run:

python Marigold/getmonodepthnormal.py -s data/horse16
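
The script above wraps Marigold to produce the monocular priors. For reference, here is a minimal sketch of generating a depth map with the diffusers Marigold pipeline; it assumes diffusers >= 0.28 and the checkpoint path above, and the image filename is a hypothetical placeholder:

import torch
from PIL import Image
from diffusers import MarigoldDepthPipeline

# Load the locally downloaded Marigold depth checkpoint (path from above).
pipe = MarigoldDepthPipeline.from_pretrained(
    "Marigold/checkpoint/marigold-depth-lcm-v1-0", torch_dtype=torch.float16
).to("cuda")
image = Image.open("data/horse16/train/images/000001.jpg")  # hypothetical filename
depth = pipe(image)  # affine-invariant depth prediction
vis = pipe.image_processor.visualize_depth(depth.prediction)
vis[0].save("depth_vis.png")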

Dense Initialization

Download the DUSt3R checkpoint "DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth" and place it at './dust3r/checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth'.

We provide a simple example: the Horse scene from the Tanks and Temples (TNT) dataset, located in ./data. Run:
python dust3r/coarse_initialization.py -s data/horse16
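
Internally, this stage builds on DUSt3R's pairwise point-map predictions followed by global alignment to recover camera poses and a dense point cloud. A minimal sketch of that workflow using DUSt3R's own API (the device and optimizer settings here are illustrative):

from dust3r.model import load_model
from dust3r.utils.image import load_images
from dust3r.image_pairs import make_pairs
from dust3r.inference import inference
from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

device = "cuda"
model = load_model("dust3r/checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth", device)
images = load_images("data/horse16/train/images", size=512)
# Predict point maps for every image pair, then align them in one frame.
pairs = make_pairs(images, scene_graph="complete", prefilter=None, symmetrize=True)
output = inference(pairs, model, device, batch_size=1)
scene = global_aligner(output, device=device, mode=GlobalAlignerMode.PointCloudOptimizer)
scene.compute_global_alignment(init="mst", niter=300, schedule="cosine", lr=0.01)
poses = scene.get_im_poses()  # recovered camera-to-world poses
pts3d = scene.get_pts3d()     # dense per-view point clouds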

Multi-modal Regularized Reconstruction

python stage1_360.py -s data/horse16 --save outputs/horse16
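
Conceptually, this stage augments the photometric 3DGS loss with depth and normal terms against the Marigold priors. The following PyTorch sketch is purely illustrative; the weights and the normalized-depth comparison are assumptions, not the repo's exact implementation:

import torch.nn.functional as F

def multimodal_loss(render_rgb, gt_rgb, render_depth, mono_depth,
                    render_normal, mono_normal,
                    w_depth=0.1, w_normal=0.1):  # illustrative weights
    # Photometric term, as in standard 3DGS (the repo may add D-SSIM).
    rgb_loss = F.l1_loss(render_rgb, gt_rgb)
    # Monocular depth is only valid up to an affine transform,
    # so compare normalized depths.
    rd = (render_depth - render_depth.mean()) / (render_depth.std() + 1e-6)
    md = (mono_depth - mono_depth.mean()) / (mono_depth.std() + 1e-6)
    depth_loss = F.l1_loss(rd, md)
    # Penalize angular deviation from the monocular normals, shape (3, H, W).
    normal_loss = (1 - F.cosine_similarity(render_normal, mono_normal, dim=0)).mean()
    return rgb_loss + w_depth * depth_loss + w_normal * normal_loss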

Train Repair model

To set up the model, download the required checkpoints to the ./models folder, and download the clip-vit-large-patch14 model to ./openai.

python train_repairmodel.py --exp_name outputs/controlnet_finetune/horse16 --prompt "any prompt describing the scene" --resolution 1 --gs_dir outputs/horse16 --data_dir data/horse16 --bg_white
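
The repair model is a ControlNet fine-tuned on the scene so that it can clean up artifacts in Gaussian-rendered novel views. As an illustration of how such a model could be applied at inference time with diffusers (the base model and file paths below are assumptions, not necessarily what this repo uses):

import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

# Hypothetical: load the fine-tuned ControlNet from the step above
# on top of a Stable Diffusion base.
controlnet = ControlNetModel.from_pretrained(
    "outputs/controlnet_finetune/horse16", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")
degraded = Image.open("render.png")  # artifact-laden Gaussian rendering
repaired = pipe(prompt="any prompt describing the scene",
                image=degraded, control_image=degraded,
                strength=0.5).images[0]
repaired.save("repaired.png")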

Iterative refinement

Use stage2_360.py for 360-degree captures and stage2_forward.py for forward-facing captures:

python stage2_360.py -s data/horse16 --exp_name outputs/controlnet_finetune/horse16 --prompt "any prompt describing the scene" --bg_white --start_checkpoint "outputs/horse16/chkpnt12000.pth"
python stage2_forward.py -s data/barn3 --exp_name outputs/controlnet_finetune/barn3 --prompt "Houses, playground, outdoor" --bg_white --start_checkpoint "outputs/barn3/chkpnt6000.pth"

Render video & Scene enhancement

python render_interpolate.py -s data/horse16 --start_checkpoint outputs/horse16/chkpnt30000.pth
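
render_interpolate.py renders a smooth camera path through the training poses. For intuition, here is a minimal, self-contained sketch of such trajectory generation (slerp for rotations, piecewise-linear interpolation for translations; not the repo's exact code):

import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(poses, n_frames=300):
    """poses: ordered list of 4x4 camera-to-world key-pose matrices."""
    key_times = np.linspace(0, 1, len(poses))
    slerp = Slerp(key_times, Rotation.from_matrix([p[:3, :3] for p in poses]))
    trans = np.stack([p[:3, 3] for p in poses])
    out = []
    for t in np.linspace(0, 1, n_frames):
        pose = np.eye(4)
        pose[:3, :3] = slerp(t).as_matrix()  # smoothly rotated orientation
        # Piecewise-linear translation between the surrounding key poses.
        pose[:3, 3] = [np.interp(t, key_times, trans[:, i]) for i in range(3)]
        out.append(pose)
    return out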

Download the zeroscope_v2_XL checkpoints to ./models/zeroscope_v2_XL/, then run:

python scene_enhance.py --model_path ./models/zeroscope_v2_XL --input_path outputs/horse16/30000_render_video.mp4
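
Scene enhancement passes the rendered clip through the zeroscope_v2_XL video diffusion model. For reference, a sketch of a comparable video-to-video pass with diffusers (assumes the checkpoint folder is in diffusers format and imageio-ffmpeg is installed; the strength value is illustrative):

import torch
import imageio
from PIL import Image
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "./models/zeroscope_v2_XL", torch_dtype=torch.float16)
pipe.enable_model_cpu_offload()
# Read the Gaussian-rendered video and enhance it with the video prior.
frames = [Image.fromarray(f) for f in
          imageio.mimread("outputs/horse16/30000_render_video.mp4", memtest=False)]
enhanced = pipe("any prompt describing the scene",
                video=frames, strength=0.6).frames[0]
export_to_video(enhanced, "outputs/horse16/enhanced.mp4")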

🤗Acknowledgement

This work is built on many amazing research works and open-source projects, thanks a lot to all the authors for sharing!

🌏Citation

If you find our work useful in your research, please consider giving a star :star: and citing the following paper :pencil:.

@misc{yu2024lmgaussianboostsparseview3d,
      title={LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors}, 
      author={Hanyang Yu and Xiaoxiao Long and Ping Tan},
      year={2024},
      eprint={2409.03456},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.03456}, 
}