This is an official PyTorch implementation of PS-Transformer [1] for estimating surface normals from images captured under different known directional lightings.
[1] S. Ikehata, "PS-Transformer: Learning Sparse Photometric Stereo Network using Self-Attention Mechanism," BMVC 2021.
(Top) Surface normal estimation result from 10 images; (bottom) ground truth.
- Python3
- torch
- tensorboard
- cv2
- scipy
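The dependencies can be installed with pip, for example (assuming the package names torch, tensorboard, opencv-python, and scipy; pick the torch build matching your CUDA version):

pip install torch tensorboard opencv-python scipy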
Tested on:
- Ubuntu 20.04 / Windows 10, Python 3.7.5, PyTorch 1.6.0, CUDA 10.2
- GPU: Nvidia Quadro RTX 8000 (48GB)
For testing the network on the DiLiGenT benchmark by Boxin Shi [2], please download the DiLiGenT dataset (DiLiGenT.zip) and extract it at [USER_PATH].
Then, please run main.py with the DiLiGenT path as an argument:
python main.py --diligent [USER_PATH]/DiLiGenT/pmsData
You can change the number of test images (default: 10) as follows:
python main.py --diligent [USER_PATH]/DiLiGenT/pmsData --n_testimg 5
Please note that the lighting directions are randomly chosen, so the results differ from run to run.
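If you need a reproducible image subset, one option is to fix the random seeds near the top of main.py. A minimal sketch, assuming the sampling goes through the standard Python, NumPy, or PyTorch generators:

```python
import random
import numpy as np
import torch

# Fixing the common RNG sources makes the randomly drawn lighting
# subset identical across runs (assumption: main.py samples via one
# of these generators).
SEED = 0
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
```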
The pretrained model (our "full" configuration) is available at https://www.dropbox.com/s/64i4srb2vue9zrn/pretrained.zip?dl=0. Please extract it to "PS-Transformer-BMVC2021/pretrained".
If the program runs properly, you will get the average angular error (in degrees) for each dataset.
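For reference, this metric is the per-pixel angle between the estimated and ground-truth normals, averaged over the evaluation mask. A minimal sketch (not the repository's exact evaluation code):

```python
import numpy as np

def mean_angular_error(n_est, n_gt, mask=None):
    """Mean angular error in degrees between two (H, W, 3) normal maps."""
    # Normalize both maps in case the inputs are not exactly unit-length.
    n_est = n_est / (np.linalg.norm(n_est, axis=-1, keepdims=True) + 1e-8)
    n_gt = n_gt / (np.linalg.norm(n_gt, axis=-1, keepdims=True) + 1e-8)
    # Clip the dot product to the valid input range of arccos.
    cos = np.clip((n_est * n_gt).sum(axis=-1), -1.0, 1.0)
    err = np.degrees(np.arccos(cos))
    return err[mask].mean() if mask is not None else err.mean()
```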
You can use TensorBoard to visualize your output. The log file will be saved at
[LOGFILE] = 'Tensorboard/[SESSION_NAME (default: eval)]'
Then, please run TensorBoard as
tensorboard --logdir [LOGFILE]
As is commonly known, the "bear" dataset in DiLiGenT has a known problem, so the first 20 images in bearPNG are skipped.
If you want to run this code on other datasets, please arrange your own data in the same manner as DiLiGenT. The required files are listed below (a loading sketch follows the list):
- images (.png format by default, but you can easily change the code for other formats)
- lights (light_directions.txt, light_intensities.txt)
- normals (normal.txt; if no ground-truth surface normal is available, you can simply set all values to zero)
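As a rough illustration of the expected layout, here is a hypothetical loader for one scene directory. The `%03d.png` naming and the one-value-set-per-line layout of the text files are assumptions based on the DiLiGenT format, so adapt them to your data:

```python
import os
import cv2
import numpy as np

def load_scene(scene_dir):
    """Load a DiLiGenT-style scene: images, lights, and (optional) normals."""
    # One light per line: three whitespace-separated floats.
    dirs = np.loadtxt(os.path.join(scene_dir, 'light_directions.txt'))   # (N, 3)
    ints = np.loadtxt(os.path.join(scene_dir, 'light_intensities.txt'))  # (N, 3)
    # normal.txt holds one normal per pixel (all zeros if no ground truth).
    normals = np.loadtxt(os.path.join(scene_dir, 'normal.txt'))
    imgs = []
    for i in range(dirs.shape[0]):
        # Assumed naming convention: 001.png, 002.png, ...
        img = cv2.imread(os.path.join(scene_dir, '%03d.png' % (i + 1)),
                         cv2.IMREAD_UNCHANGED)
        imgs.append(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    return np.stack(imgs), dirs, ints, normals
```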
The training script is NOT supported yet (it will be available soon!). However, the training dataset is already available; please send a request to [email protected].
This project is licensed under the MIT License; see the LICENSE.md file for details.
[2] Boxin Shi, Zhipeng Mo, Zhe Wu, Dinglong Duan, Sai-Kit Yeung, and Ping Tan, "A Benchmark Dataset and Evaluation for Non-Lambertian and Uncalibrated Photometric Stereo," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
Honestly, the major reason this work targeted the "sparse" setup is simply that the model is large and I did not have sufficient GPU resources to train it on "dense" image sets (though testing on dense images with the model trained on sparse images is possible, as shown in the paper). I am confident that this model would also benefit the dense photometric stereo task, and any ideas for reducing the training cost are very much appreciated!