Question about scale consistency #20

Open
pleasegostraight opened this issue Oct 26, 2020 · 3 comments


pleasegostraight commented Oct 26, 2020

Hi, I have a question about scale consistency.
As I understand it, solving scale consistency means making depth(t) and depth(t-1) consistent with each other. But in the paper and in issue #8 (comment), the alignment seems to make the predicted depth and the pose consistent instead. How does that make the depth scale-consistent over time?

pleasegostraight changed the title from "1. The translation length from 8-Point algorithm is always a unit (up-to-scale). So for the training stage, we align the predicted depth to the pose (triangulation depth) to make the depth and pose consistent for calculating the loss. For the inference stage of visual odometry, we need the scale consistency over the whole sequence thus we align the scale of pose translation to the predicted depth." to "Problem about scale consistency" on Oct 26, 2020
pleasegostraight (author) commented Oct 26, 2020

Also, the 8-point algorithm does not give scale-consistent results, so the pose is not scale-consistent either. Aligning the predicted depth to a pose that is not scale-consistent does not seem to solve the problem of depth consistency, does it?

pleasegostraight changed the title from "Problem about scale consistency" to "Question about scale consistency" on Oct 26, 2020

scott89 commented Nov 12, 2020

I have the same question regarding temporal scale consistency. It seems that the only reasonable explanation for the predicted depth being scale-consistent is training with the depth reprojection loss L_pd. However, it is still hard to see how L_pd penalizes inconsistency between the scale-transformed depths D_a and D_b, which are designed to be scale-invariant.


SenZHANG-GitHub commented Apr 14, 2021

The writing of the paper is a bit confusing. It would be easier to follow if the authors separated the two kinds of consistency and made each one clear.

First, if we only consider the scale from two-view triangulation, which assumes a unit translation for every frame pair, then it can only ensure consistency between depth and pose at each time step separately and cannot account for temporal consistency. Since the actual translation between two frames varies along the trajectory and the pose is aligned with the depth from triangulation, the unit-translation assumption in triangulation leads to temporal inconsistency.
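
To make the first point concrete, here is a minimal sketch, not code from this repository. It assumes an idealized pinhole camera in normalized coordinates that translates sideways, and the function name `triangulated_depth` and the baseline values are hypothetical. It shows that when triangulation assumes a unit baseline, the recovered depth of the same point is scaled by 1 / (true baseline), so the scale changes from frame pair to frame pair.

```python
def triangulated_depth(true_depth, true_baseline, assumed_baseline=1.0):
    """Depth recovered by triangulation for a sideways-translating pinhole camera.

    Toy model in normalized image coordinates (an assumption for illustration).
    """
    # The disparity seen in the images is determined by the true geometry.
    disparity = true_baseline / true_depth
    # Triangulation only knows the assumed (unit) baseline, not the true one.
    return assumed_baseline / disparity


true_depth = 10.0            # the same static point, observed in every frame pair
baselines = [0.5, 1.0, 2.0]  # hypothetical true camera motion for three frame pairs

recovered = [triangulated_depth(true_depth, b) for b in baselines]
scales = [z / true_depth for z in recovered]

for b, z, s in zip(baselines, recovered, scales):
    print(f"true baseline {b}: recovered depth {z:.2f}, scale factor {s:.2f}")
```

Each frame pair is internally consistent (depth and pose share one scale), but the scale factor differs across pairs whenever the camera speed changes, which is the temporal inconsistency described above.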

Second, temporal consistency is embodied in L_pd in Equation (6). This depth reprojection loss is borrowed from Bian et al.'s NeurIPS 2019 paper "Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video", though the form of the loss is modified slightly. This loss enforces local temporal consistency, and since the clips overlap, we assume this local consistency propagates to global temporal consistency.
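
As a rough illustration of why such a loss still constrains scale, here is a sketch of the normalized depth difference used in Bian et al., |D_a - D_b| / (D_a + D_b), averaged over pixels. It is not the modified form in Equation (6) of this paper, and it assumes the two depth maps are already warped into the same view and flattened into lists (warping and interpolation are omitted). The function name `depth_consistency_loss` and the sample values are hypothetical.

```python
def depth_consistency_loss(projected_depth, sampled_depth):
    """Mean of |a - b| / (a + b) over corresponding (positive) depth values."""
    diffs = [abs(a - b) / (a + b) for a, b in zip(projected_depth, sampled_depth)]
    return sum(diffs) / len(diffs)


depth_a = [2.0, 4.0, 8.0]               # depth from frame a, projected into frame b
same_scale = list(depth_a)              # frame b predicts the same scale
other_scale = [1.5 * d for d in depth_a]  # frame b predicts a 1.5x larger scale

loss_same = depth_consistency_loss(depth_a, same_scale)
loss_other = depth_consistency_loss(depth_a, other_scale)

# Multiplying BOTH maps by the same factor leaves the loss unchanged.
loss_other_rescaled = depth_consistency_loss(
    [10.0 * d for d in depth_a], [10.0 * d for d in other_scale]
)

print(loss_same, loss_other, loss_other_rescaled)
```

In this form the normalization makes the loss indifferent to a global rescaling of both depth maps, but not to a relative scale difference between the two frames, so minimizing it pushes adjacent frames toward a shared scale.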
