
Kitti depth shows worse result #14

Closed
hyBlue opened this issue Aug 25, 2020 · 3 comments

Comments


hyBlue commented Aug 25, 2020

Hi, I think your work is really interesting, so I'm trying to reproduce your results, both those in the paper and those of the pretrained model you provided, but the model I've trained shows worse results. I want to figure out how to train properly.
(All results below are based on an ImageNet-pretrained ResNet-18.)

In the code, the number of iterations for every stage is set to 200,000 by default, but this differs from the paper:

Firstly, we only train optical flow network in an unsupervised manner via image reconstruction loss. After 20 epochs, we freeze optical flow network and train the depth network for another 20 epochs. Finally, we jointly train both networks for 10 epochs.

so I followed the paper: stage 1 for 20 epochs, stage 2 for 20 epochs, and joint training for 10 epochs. But the result was bad:

abs_rel,     sq_rel,        rms,    log_rms,         a1,         a2,         a3
0.1248,     0.8381,      4.806,      0.195,      0.852,      0.956,      0.983
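For reference, these columns are the standard KITTI depth-evaluation metrics. A minimal sketch of how they are usually computed from ground-truth and predicted depth maps (not the repo's actual evaluation code):

```python
import numpy as np

def depth_metrics(gt, pred):
    """Standard KITTI depth metrics for positive depth arrays of equal shape."""
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()          # fraction within 25% of ground truth
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rms = np.sqrt(np.mean((gt - pred) ** 2))
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rms, log_rms, a1, a2, a3

# Toy example with a slightly off prediction:
gt = np.array([2.0, 4.0, 8.0])
pred = np.array([2.2, 3.8, 8.5])
print(depth_metrics(gt, pred))
```

Lower is better for the first four columns, higher for a1–a3.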

so I tried to find the best checkpoint from each stage and used it to initialize the next stage's training.

That gave a better result:

abs_rel,     sq_rel,        rms,    log_rms,         a1,         a2,         a3
0.1194,     0.7705,      4.896,      0.192,      0.856,      0.957,      0.983

However, this is still far from the reported result, so I changed the weight of pt_loss in stage 3

w_pt_depth: 0.0

from zero to 0.3, and the result became:

abs_rel,     sq_rel,        rms,    log_rms,         a1,         a2,         a3
0.1183,     0.7463,      4.758,      0.191,      0.861,      0.958,      0.983
For flow:
epe,    epe_noc,    epe_occ,   epe_move, epe_static, move_err_rate, static_err_rate,   err_rate 
6.8279,     4.3253,    16.3770,     7.3393,     6.0999,     0.2155,     0.1894,     0.2034

but it is still worse than your result.
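For the flow numbers: epe is the mean endpoint error between predicted and ground-truth flow, and err_rate follows the KITTI outlier convention (a pixel counts as an error when its endpoint error exceeds 3 px and 5% of the flow magnitude). A minimal sketch, not the repo's evaluation code:

```python
import numpy as np

def flow_epe(gt, pred, tau_px=3.0, tau_rel=0.05):
    """gt, pred: (H, W, 2) flow fields. Returns (mean EPE, outlier rate)."""
    err = np.linalg.norm(gt - pred, axis=-1)   # per-pixel endpoint error
    mag = np.linalg.norm(gt, axis=-1)          # ground-truth flow magnitude
    epe = err.mean()
    err_rate = np.mean((err > tau_px) & (err > tau_rel * mag))
    return epe, err_rate

# Toy example: constant 10 px flow, prediction off by 1 px everywhere.
gt = np.zeros((4, 4, 2))
gt[..., 0] = 10.0
pred = gt.copy()
pred[..., 0] += 1.0
print(flow_epe(gt, pred))  # EPE 1.0; no pixel exceeds the 3 px threshold
```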

Therefore, I want to ask you two questions:

  1. What is the proper number of steps for each stage to get the best result?
  2. Why did you set the weight of pt_loss to zero in stage 3, when it arguably carries the most reliable supervision signal among the losses?
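For context on question 2, the stage-3 objective is presumably a weighted sum of the individual losses, with the w_pt_depth config key scaling pt_loss. A hedged sketch (the function and argument names are assumptions, not the repo's actual variables):

```python
def stage3_loss(l_flow, l_depth, l_pt,
                w_flow=1.0, w_depth=1.0, w_pt_depth=0.0):
    """Hypothetical stage-3 objective: weighted sum of the component losses.

    With w_pt_depth = 0.0 (the repo's default), the point-cloud
    consistency term is disabled entirely; setting it to 0.3 is the
    change tried above.
    """
    return w_flow * l_flow + w_depth * l_depth + w_pt_depth * l_pt

print(stage3_loss(1.0, 1.0, 10.0))                   # pt term disabled
print(stage3_loss(1.0, 1.0, 10.0, w_pt_depth=0.3))   # pt term re-enabled
```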

I'm really looking forward to your feedback. Thanks!

@thuzhaowang (Collaborator)

Hi @hyBlue, thanks for your interest.

  1. Use the default parameter settings in the config file (i.e., 200k iterations or more) for each stage.
  2. The third stage mostly benefits the optical flow (especially in occluded regions); the improvement for depth prediction is minor. Pt_loss occasionally makes the optical flow training unstable, so we disabled it without further tuning. But you could tune the loss weights for the third stage and see whether it helps.
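The staged schedule discussed in this thread (train flow, freeze it and train depth, then fine-tune jointly) can be sketched as follows. The network and parameter objects are illustrative stand-ins; in PyTorch you would toggle `p.requires_grad` on `module.parameters()` instead:

```python
class Param:
    """Stand-in for a learnable parameter with a requires_grad flag."""
    def __init__(self):
        self.requires_grad = True

class Net:
    """Stand-in for a network module holding a few parameters."""
    def __init__(self, n=3):
        self.params = [Param() for _ in range(n)]

def set_trainable(net, flag):
    """Freeze (flag=False) or unfreeze (flag=True) all of a net's parameters."""
    for p in net.params:
        p.requires_grad = flag

flow_net, depth_net = Net(), Net()

# Stage 1: train the flow network alone
# (20 epochs per the paper, 200k iterations per the default config).
set_trainable(flow_net, True)
set_trainable(depth_net, False)

# Stage 2: freeze flow, train the depth network.
set_trainable(flow_net, False)
set_trainable(depth_net, True)

# Stage 3: joint fine-tuning of both networks.
set_trainable(flow_net, True)
set_trainable(depth_net, True)
```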


harishkool commented Mar 16, 2021

@hyBlue I am getting all loss values zeros in the stage 2 training (#26). Have you faced this issue?


hyBlue commented Mar 16, 2021

@harishkool
Nope. I think your training has converged to a trivial solution. You'd better try different hyperparameters for your custom dataset. You can also check the overlap between consecutive frames and compare it with that of KITTI. When there is barely any motion between frames, it is very easy for the loss to become zero.
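One quick way to sanity-check the "too little motion" hypothesis: measure the raw photometric difference between consecutive frames. If it is near zero, the identity warp already minimizes the reconstruction loss and training can collapse to the trivial solution. A toy sketch with synthetic frames:

```python
import numpy as np

def mean_frame_diff(frame_a, frame_b):
    """Mean absolute intensity difference between two frames."""
    return np.mean(np.abs(frame_a.astype(np.float64)
                          - frame_b.astype(np.float64)))

static = np.tile(np.arange(8.0), (8, 1))  # horizontal intensity ramp
moving = np.roll(static, 2, axis=1)       # the same scene shifted 2 px

print(mean_frame_diff(static, static))  # 0.0: no motion, no supervision signal
print(mean_frame_diff(static, moving))  # > 0: motion provides a training signal
```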

@hyBlue hyBlue closed this as completed Mar 16, 2021