Our volumetric capture system reconstructs a fully clothed human body (including the back) in real time, using only a single RGB webcam.
- Python 3.7
- PyOpenGL 3.1.5 (requires an X server on Ubuntu)
- PyTorch (tested on 1.4.0)
- ImplicitSegCUDA
- human_inst_seg
- streamer_pytorch
- human_det
We run the demo with two GeForce RTX 2080 Ti GPUs; the memory usage is roughly 3.4 GB on GPU 1 and 9.7 GB on GPU 2.
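If you want to check the footprint on your own hardware, you can query `nvidia-smi` while the demo is running. The snippet below is just a convenience wrapper around that tool, not part of this repo:

```python
# Illustrative helper (not part of this repo): report current per-GPU
# memory usage by querying nvidia-smi (values are reported in MiB).
import subprocess

out = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=index,memory.used,memory.total",
     "--format=csv,noheader,nounits"]
).decode()
for line in out.strip().splitlines():
    idx, used, total = (field.strip() for field in line.split(","))
    print(f"GPU {idx}: {int(used) / 1024:.1f} GB used / {int(total) / 1024:.1f} GB total")
```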
Note: The last four dependencies (ImplicitSegCUDA, human_inst_seg, streamer_pytorch, human_det) are also developed by our team and are under active maintenance. If you run into installation problems specific to those tools, we recommend filing an issue in the corresponding repo. (You don't need to install them manually here, as they are included in requirements.txt.)
First, download the model:
```bash
sh scripts/download_model.sh
```
Then install all the dependencies:
```bash
pip install -r requirements.txt
```
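Before launching the demo, it can save debugging time to verify that PyTorch sees your GPUs. This is only a quick sanity check we suggest, not a script from the repo:

```python
# Quick environment sanity check: print the PyTorch version (we tested
# on 1.4.0) and the GPUs visible to CUDA.
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")
```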
```bash
# if you want to use the input from a webcam:
python RTL/main.py --use_server --ip <YOUR_IP_ADDRESS> --port 5555 --camera -- netG.ckpt_path ./data/PIFu/net_G netC.ckpt_path ./data/PIFu/net_C

# or if you want to use the input from an image folder:
python RTL/main.py --use_server --ip <YOUR_IP_ADDRESS> --port 5555 --image_folder <IMAGE_FOLDER> -- netG.ckpt_path ./data/PIFu/net_G netC.ckpt_path ./data/PIFu/net_C

# or if you want to use the input from a video:
python RTL/main.py --use_server --ip <YOUR_IP_ADDRESS> --port 5555 --videos <VIDEO_PATH> -- netG.ckpt_path ./data/PIFu/net_G netC.ckpt_path ./data/PIFu/net_C
```
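Note the standalone `--` in these commands: everything after it is a list of dotted key/value config overrides (e.g. `netG.ckpt_path ./data/PIFu/net_G`). As a rough sketch of how this pattern is commonly wired up (assuming a yacs-style config; the exact parser in `RTL/main.py` may differ):

```python
# Sketch of the common "argparse flags + config overrides after --" pattern.
# Assumption: a yacs-style CfgNode is used; the names here are illustrative.
import argparse
from yacs.config import CfgNode as CN

cfg = CN()
cfg.netG = CN({"ckpt_path": ""})
cfg.netC = CN({"ckpt_path": ""})

parser = argparse.ArgumentParser()
parser.add_argument("--use_server", action="store_true")
parser.add_argument("--ip", type=str, default="0.0.0.0")
parser.add_argument("--port", type=int, default=5555)
parser.add_argument("--camera", action="store_true")
parser.add_argument("opts", nargs=argparse.REMAINDER,
                    help="dotted config overrides, given after `--`")
args = parser.parse_args()

# Drop the `--` separator if argparse left it in the remainder, then merge
# ["netG.ckpt_path", "./data/PIFu/net_G", ...] into the config tree.
cfg.merge_from_list([tok for tok in args.opts if tok != "--"])
```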
If everything goes well, you should see logs like the following after a few seconds:
```
loading networkG from ./data/PIFu/net_G ...
loading networkC from ./data/PIFu/net_C ...
initialize data streamer ...
Using cache found in /home/rui/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Using cache found in /home/rui/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
 * Serving Flask app "main" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://<YOUR_IP_ADDRESS>:5555/ (Press CTRL+C to quit)
```
Open the page http://<YOUR_IP_ADDRESS>:5555/ in a web browser on any device (desktop/iPad/iPhone). You should see the MonoPort VR Demo page on that device, and at the same time a window should pop up on your desktop showing the reconstructed normal and texture images.
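For the curious: the logs above come from Flask, which serves the demo page and streams the rendered frames to the browser. A minimal sketch of that general pattern (multipart MJPEG streaming; an illustration of the technique, not the repo's actual server code):

```python
# Minimal MJPEG-over-HTTP sketch (illustrative only): the browser keeps one
# request open and receives a multipart stream of JPEG frames.
import cv2
from flask import Flask, Response

app = Flask(__name__)

def frames():
    cap = cv2.VideoCapture(0)  # stand-in source; the demo streams reconstructions
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ok, jpg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpg.tobytes() + b"\r\n")

@app.route("/stream")
def stream():
    return Response(frames(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5555)
```

On the page side, a plain `<img src="/stream">` tag is enough for the browser to display such a stream.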
MonoPort is based on Monocular Real-Time Volumetric Performance Capture (ECCV 2020), authored by Ruilong Li* (@liruilong940607), Yuliang Xiu* (@yuliangxiu), Shunsuke Saito (@shunsukesaito), Zeng Huang (@ImaginationZ), and Kyle Olszewski (@kyleolsz). Hao Li is the corresponding author.
```bibtex
@inproceedings{li2020monoport,
  title={Monocular Real-Time Volumetric Performance Capture},
  author={Li, Ruilong and Xiu, Yuliang and Saito, Shunsuke and Huang, Zeng and Olszewski, Kyle and Li, Hao},
  booktitle={European Conference on Computer Vision},
  pages={49--67},
  year={2020},
  organization={Springer}
}

@incollection{li2020monoportRTL,
  title={Volumetric human teleportation},
  author={Li, Ruilong and Olszewski, Kyle and Xiu, Yuliang and Saito, Shunsuke and Huang, Zeng and Li, Hao},
  booktitle={ACM SIGGRAPH 2020 Real-Time Live},
  pages={1--1},
  year={2020}
}
```
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization (ICCV 2019)
Shunsuke Saito*, Zeng Huang*, Ryota Natsume*, Shigeo Morishima, Angjoo Kanazawa, Hao Li
The original work on Pixel-Aligned Implicit Functions for geometry and texture reconstruction, unifying single-view and multi-view methods.
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization (CVPR 2020)
Shunsuke Saito, Tomas Simon, Jason Saragih, Hanbyul Joo
They further improve the reconstruction quality by leveraging a multi-level approach!
ARCH: Animatable Reconstruction of Clothed Humans (CVPR 2020)
Zeng Huang, Yuanlu Xu, Christoph Lassner, Hao Li, Tony Tung
Learning PIFu in canonical space for animatable avatar generation!
Robust 3D Self-portraits in Seconds (CVPR 2020)
Zhe Li, Tao Yu, Chuanyu Pan, Zerong Zheng, Yebin Liu
They extend PIFu to RGB-D and introduce "PIFusion", which uses PIFu reconstruction for non-rigid fusion.
Real-time VR PhD Defense
Dr. Zeng Huang defended his PhD virtually using our system. (Media coverage in Chinese)