Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schur Complement for Bundle Adjustment #407

Open
MGBzH opened this issue Nov 21, 2024 · 5 comments
Open

Schur Complement for Bundle Adjustment #407

MGBzH opened this issue Nov 21, 2024 · 5 comments

Comments

@MGBzH
Copy link

MGBzH commented Nov 21, 2024

Hi, first of all thanks for your amazing work!

I'm using symforce to solve classical bundle adjustment with 6 DoF poses, 3D points and projection factors. The optimization works as expected, however is the Schur Complement trick used automatically or do I need to do something to enable it to speed up the optimisation?

Thank you

@aaron-skydio
Copy link
Member

The default configuration uses METIS to compute the ordering used for elimination. The resulting orderings have the same asymptotic complexity as the Schur complement trick (linear in the number of 3D points in the problem), so it should be pretty fast and it'll scale as you expect with the numbers of poses and points, same as if you were explicitly using the Schur complement trick

For really big problems there are other tricks you can use and lots of linear solvers to choose from; the linear solver is a template parameter that you can change to any solver that implements the interface SymForce expects for linear solvers; for example, you can use any of Eigen's sparse linear solvers with the wrapper in eigen_sparse_solver.h, or you could plug in a solver from another library by implementing a similar adapter.

@MGBzH
Copy link
Author

MGBzH commented Nov 22, 2024

Thank you for your answer.
I am working on a relatively small BA problem (35 poses, 2300 points and around 19k edges).
To understand how code generation works I've generated a reprojection residual. I've modified the bundle_adjustment_in_the_large.py example by creating a reprojection residual between a sf.Pose3 and a sf.V3 but using a pinhole camera model with sf.LinearCameraCal. Those intrinsics are known and fixed so I did not include them in the which_args parameter list of Codegen.
I have then built a graph with the generated factor, following the implementation of bundle_adjustment_in_the_large.cc but using my own data and optimized using sym::Optimizerd with default params and fixed the first keyframe of the graph.
I did a similar approch but using g2o (with analytical jacobians and the Schur complement trick) and I get to the same optimum as with symforce.
However while symforce converges in fewer (6) iterations than g2o (10), the time per iteration of g2o is lower: around 0.02-0.04s against 0.1 for symforce, making the overall optimisation take 0.2s for g2o against 0.6 for symforce.

Have you been able to compare against g2o in your experiments and is this difference something you have observed? Or on the contrary have you seen that symforce was faster, suggesting an error on my side?

Thank you

@aaron-skydio
Copy link
Member

It would help if you could post more detailed timing for what part of the iteration time is slower; there are very different reasons depending on that. SymForce should print a table on exit of timing for different parts of the optimization problem; I haven't used g2o much so I'm not sure what timing information they have available

@MGBzH
Copy link
Author

MGBzH commented Dec 2, 2024

Thanks for your answer, here are the detailed timings for symforce on my BA graph

SymForce TicToc Results:
   Name                                                          :     Count      | Total Time (s) | Mean Time (s)  |  Max Time (s)  |  Min Time (s) 
-----------------------------------------------------------------+----------------+----------------+----------------+----------------+----------------
Optimizer<BundleAdjustmentOptimizer>::Optimize                   :       1        |    0.58763     |    0.58763     |    0.58763     |    0.58763    
Optimizer<BundleAdjustmentOptimizer>::IterateToConvergence       :       1        |    0.58437     |    0.58437     |    0.58437     |    0.58437    
LM<BundleAdjustmentOptimizer>::Iterate()                         :       6        |    0.58357     |    0.097262    |    0.21112     |    0.072172   
LM<BundleAdjustmentOptimizer>: SparseFactorize                   :       6        |    0.35627     |    0.059378    |    0.061376    |    0.057262   
LM<BundleAdjustmentOptimizer>: EvaluateFirst                     :       1        |    0.091917    |    0.091917    |    0.091917    |    0.091917   
Linearizer<BundleAdjustmentOptimizer>::Relinearize::First()      :       1        |    0.091861    |    0.091861    |    0.091861    |    0.091861   
Linearizer<BundleAdjustmentOptimizer>::Relinearize::NonFirst()   :       6        |    0.078893    |    0.013149    |    0.013604    |    0.012917   
LM<BundleAdjustmentOptimizer>: AnalyzePattern                    :       1        |    0.041681    |    0.041681    |    0.041681    |    0.041681   
LM<BundleAdjustmentOptimizer>: SparseSolve                       :       6        |   0.0088856    |   0.0014809    |   0.0015008    |   0.0014434   
LM<BundleAdjustmentOptimizer>: Update                            :       6        |   0.0035052    |   0.00058419   |    0.00285     |   0.00012218  
LM<BundleAdjustmentOptimizer>::ResetState                        :       1        |   0.0020572    |   0.0020572    |   0.0020572    |   0.0020572   
LM<BundleAdjustmentOptimizer>: DampHessian                       :       6        |   0.00079118   |   0.00013186   |   0.00013784   |   0.00012834  
Optimizer<BundleAdjustmentOptimizer>::CopyValuesAndLinearization :       1        |    0.000788    |    0.000788    |    0.000788    |    0.000788   
LM<BundleAdjustmentOptimizer>: ResetHessianDiagonal              :       6        |   0.00050138   |   8.3563e-05   |   8.5391e-05   |   8.166e-05   
LM<BundleAdjustmentOptimizer>: IterationStats                    :       6        |   0.00038089   |   6.3482e-05   |   0.00015885   |   4.3121e-05  
LM<BundleAdjustmentOptimizer>: IterationStats - Print            :       6        |   0.0003704    |   6.1734e-05   |   0.00015399   |   4.1811e-05  
LM<BundleAdjustmentOptimizer>: LinearErrorFromValues             :       6        |   5.859e-05    |   9.765e-06    |   1.201e-05    |    8.63e-06   
LM<BundleAdjustmentOptimizer>: accept_update bookkeeping         :       6        |   3.577e-05    |   5.9617e-06   |    9.45e-06    |    5.06e-06   
LM<BundleAdjustmentOptimizer>: FirstIterationStats               :       1        |   2.847e-05    |   2.847e-05    |   2.847e-05    |   2.847e-05   
LM<BundleAdjustmentOptimizer>: StateStep                         :       6        |    1.78e-06    |   2.9667e-07   |    1.34e-06    |     8e-08

with the following cost changes:

LM<BundleAdjustmentOptimizer> [iter    0] lambda: 1.000e+00, error prev/linear/new: 1.155e+06/1.054e+04/1.745e+05, rel reduction: 8.48895e-01, gain ratio: 8.56718e-01
LM<BundleAdjustmentOptimizer> [iter    1] lambda: 6.369e-01, error prev/linear/new: 1.745e+05/1.442e+04/2.998e+04, rel reduction: 8.28198e-01, gain ratio: 9.02807e-01
LM<BundleAdjustmentOptimizer> [iter    2] lambda: 3.039e-01, error prev/linear/new: 2.998e+04/2.339e+04/2.709e+04, rel reduction: 9.62094e-02, gain ratio: 4.37668e-01
LM<BundleAdjustmentOptimizer> [iter    3] lambda: 3.045e-01, error prev/linear/new: 2.709e+04/2.328e+04/2.338e+04, rel reduction: 1.36873e-01, gain ratio: 9.73054e-01
LM<BundleAdjustmentOptimizer> [iter    4] lambda: 1.015e-01, error prev/linear/new: 2.338e+04/2.328e+04/2.329e+04, rel reduction: 3.85861e-03, gain ratio: 8.35394e-01
LM<BundleAdjustmentOptimizer> [iter    5] lambda: 7.086e-02, error prev/linear/new: 2.329e+04/2.328e+04/2.329e+04, rel reduction: 2.46218e-04, gain ratio: 5.07957e-01
LM<BundleAdjustmentOptimizer> Optimization finished with status: SUCCESS

Sadly g2o does not provide detailed timings like symforce, only time per iteration:

iteration= 0     chi2= 349229.953631     time= 0.041154          cumTime= 0.041154       edges= 18559    schur= 1        lambda= 1462.611335     levenbergIter= 1
iteration= 1     chi2= 59875.579867      time= 0.0209294         cumTime= 0.0620834      edges= 18559    schur= 1        lambda= 693.552774      levenbergIter= 1
iteration= 2     chi2= 49773.811349      time= 0.0204836         cumTime= 0.082567       edges= 18559    schur= 1        lambda= 355.477057      levenbergIter= 1
iteration= 3     chi2= 48134.734576      time= 0.020891          cumTime= 0.103458       edges= 18559    schur= 1        lambda= 236.984705      levenbergIter= 1
iteration= 4     chi2= 47323.525489      time= 0.022424          cumTime= 0.125882       edges= 18559    schur= 1        lambda= 78.994902       levenbergIter= 1
iteration= 5     chi2= 46952.319544      time= 0.021809          cumTime= 0.147691       edges= 18559    schur= 1        lambda= 52.663268       levenbergIter= 1
iteration= 6     chi2= 46717.745904      time= 0.0212841         cumTime= 0.168975       edges= 18559    schur= 1        lambda= 17.554423       levenbergIter= 1
iteration= 7     chi2= 46621.837960      time= 0.0209105         cumTime= 0.189886       edges= 18559    schur= 1        lambda= 9.669294        levenbergIter= 1
iteration= 8     chi2= 46579.484844      time= 0.0216752         cumTime= 0.211561       edges= 18559    schur= 1        lambda= 3.223098        levenbergIter= 1
iteration= 9     chi2= 46572.518880      time= 0.020375          cumTime= 0.231936       edges= 18559    schur= 1        lambda= 1.074366        levenbergIter= 1

Symforce converges to an error of 23290 while g2o converges to 46572, because g2o reports the sum of the squared residuals without multiplying by 1/2 so they both converge to very close points.

I've also run experiments on datasets from bundle adjustment in the large using the example code provided here for symforce and here for g2o. I deactivated the outlier camera filtering in the example of symforce and used only small problems to ensure good initialisation of camera poses.
On problem problem-21-11315-pre.txt:
Symforce:

[info] LM<sym::Optimize> [iter    0] lambda: 1.000e+00, error prev/linear/new: 4.413e+06/3.324e+04/3.606e+04, rel reduction: 9.91828e-01, gain ratio: 9.99355e-01
[info] LM<sym::Optimize> [iter    1] lambda: 3.333e-01, error prev/linear/new: 3.606e+04/3.184e+04/3.193e+04, rel reduction: 1.14602e-01, gain ratio: 9.78897e-01
[info] LM<sym::Optimize> [iter    2] lambda: 1.111e-01, error prev/linear/new: 3.193e+04/3.092e+04/3.106e+04, rel reduction: 2.72915e-02, gain ratio: 8.66084e-01
[info] LM<sym::Optimize> [iter    3] lambda: 6.750e-02, error prev/linear/new: 3.106e+04/3.059e+04/3.065e+04, rel reduction: 1.30267e-02, gain ratio: 8.58549e-01
[info] LM<sym::Optimize> [iter    4] lambda: 4.261e-02, error prev/linear/new: 3.065e+04/3.045e+04/3.048e+04, rel reduction: 5.76214e-03, gain ratio: 8.59555e-01
[info] LM<sym::Optimize> [iter    5] lambda: 2.676e-02, error prev/linear/new: 3.048e+04/3.040e+04/3.040e+04, rel reduction: 2.42377e-03, gain ratio: 9.11241e-01
[info] LM<sym::Optimize> [iter    6] lambda: 1.187e-02, error prev/linear/new: 3.040e+04/3.038e+04/3.038e+04, rel reduction: 7.09399e-04, gain ratio: 9.60908e-01
[info] LM<sym::Optimize> [iter    7] lambda: 3.958e-03, error prev/linear/new: 3.038e+04/3.038e+04/3.038e+04, rel reduction: 1.19584e-04, gain ratio: 1.00111e+00
[info] LM<sym::Optimize> [iter    8] lambda: 1.319e-03, error prev/linear/new: 3.038e+04/3.038e+04/3.038e+04, rel reduction: 1.45385e-05, gain ratio: 9.71796e-01
[info] LM<sym::Optimize> [iter    9] lambda: 4.397e-04, error prev/linear/new: 3.038e+04/3.038e+04/3.038e+04, rel reduction: 1.30698e-06, gain ratio: 9.96947e-01
[info] LM<sym::Optimize> [iter   10] lambda: 1.466e-04, error prev/linear/new: 3.038e+04/3.038e+04/3.038e+04, rel reduction: 1.34243e-08, gain ratio: 1.00926e+00
[info] LM<sym::Optimize> Optimization finished with status: SUCCESS
[info] Finished in 12 iterations

SymForce TicToc Results:
   Name                                              :     Count      | Total Time (s) | Mean Time (s)  |  Max Time (s)  |  Min Time (s) 
-----------------------------------------------------+----------------+----------------+----------------+----------------+----------------
Optimizer<sym::Optimize>::Optimize                   :       1        |     2.2704     |     2.2704     |     2.2704     |     2.2704    
Optimizer<sym::Optimize>::IterateToConvergence       :       1        |     2.2639     |     2.2639     |     2.2639     |     2.2639    
LM<sym::Optimize>::Iterate()                         :       11       |     2.2628     |    0.20571     |    0.77213     |    0.14439    
LM<sym::Optimize>: SparseFactorize                   :       11       |     1.1795     |    0.10723     |    0.11094     |    0.10191    
LM<sym::Optimize>: EvaluateFirst                     :       1        |    0.41074     |    0.41074     |    0.41074     |    0.41074    
Linearizer<sym::Optimize>::Relinearize::First()      :       1        |    0.41073     |    0.41073     |    0.41073     |    0.41073    
Linearizer<sym::Optimize>::Relinearize::NonFirst()   :       11       |     0.3899     |    0.035445    |    0.03725     |    0.035094   
LM<sym::Optimize>: AnalyzePattern                    :       1        |    0.20313     |    0.20313     |    0.20313     |    0.20313    
LM<sym::Optimize>: SparseSolve                       :       11       |    0.061015    |   0.0055468    |   0.0059079    |   0.0052565   
LM<sym::Optimize>::ResetState                        :       1        |    0.005575    |    0.005575    |    0.005575    |    0.005575   
LM<sym::Optimize>: DampHessian                       :       11       |   0.0054902    |   0.00049911   |   0.00053406   |   0.00047392  
LM<sym::Optimize>: Update                            :       11       |    0.004895    |    0.000445    |   0.0027416    |   0.00020451  
LM<sym::Optimize>: ResetHessianDiagonal              :       11       |   0.0031256    |   0.00028414   |   0.0003161    |   0.00027928  
Optimizer<sym::Optimize>::CopyValuesAndLinearization :       1        |   0.00099664   |   0.00099664   |   0.00099664   |   0.00099664  
LM<sym::Optimize>: IterationStats                    :       11       |   0.00062766   |   5.706e-05    |   0.00015624   |   4.105e-05   
LM<sym::Optimize>: IterationStats - Print            :       11       |   0.00061179   |   5.5617e-05   |   0.00015188   |   3.989e-05   
LM<sym::Optimize>: LinearErrorFromValues             :       11       |   0.00052006   |   4.7278e-05   |   5.634e-05    |   4.5121e-05  
LM<sym::Optimize>: accept_update bookkeeping         :       11       |   0.00027735   |   2.5214e-05   |    2.68e-05    |    2.39e-05   
LM<sym::Optimize>: FirstIterationStats               :       1        |   8.476e-05    |   8.476e-05    |   8.476e-05    |   8.476e-05   
LM<sym::Optimize>: StateStep                         :       11       |    2.09e-06    |    1.9e-07     |    1.2e-06     |     8e-08 

G2o:

iteration= 0     chi2= 259573.622860     time= 0.150437  cumTime= 0.150437       edges= 36455    schur= 1        lambda= 296108.136588   levenbergIter= 1
iteration= 1     chi2= 192655.302782     time= 0.118389  cumTime= 0.268827       edges= 36455    schur= 1        lambda= 98702.712196    levenbergIter= 1
iteration= 2     chi2= 161749.478828     time= 0.115093  cumTime= 0.38392        edges= 36455    schur= 1        lambda= 32900.904065    levenbergIter= 1
iteration= 3     chi2= 137525.241796     time= 0.119697  cumTime= 0.503617       edges= 36455    schur= 1        lambda= 10966.968022    levenbergIter= 1
iteration= 4     chi2= 116382.786417     time= 0.109891  cumTime= 0.613508       edges= 36455    schur= 1        lambda= 3655.656007     levenbergIter= 1
iteration= 5     chi2= 97905.364697      time= 0.122131  cumTime= 0.735638       edges= 36455    schur= 1        lambda= 1218.552002     levenbergIter= 1
iteration= 6     chi2= 84400.509395      time= 0.117372  cumTime= 0.853011       edges= 36455    schur= 1        lambda= 406.184001      levenbergIter= 1
iteration= 7     chi2= 76247.892534      time= 0.107551  cumTime= 0.960561       edges= 36455    schur= 1        lambda= 135.394667      levenbergIter= 1
iteration= 8     chi2= 72139.884638      time= 0.120393  cumTime= 1.08095        edges= 36455    schur= 1        lambda= 45.131556       levenbergIter= 1
iteration= 9     chi2= 70791.269665      time= 0.116394  cumTime= 1.19735        edges= 36455    schur= 1        lambda= 15.043852       levenbergIter= 1
iteration= 10    chi2= 69739.254369      time= 0.0965692         cumTime= 1.29392        edges= 36455    schur= 1        lambda= 5.014617        levenbergIter= 1
iteration= 11    chi2= 68323.973708      time= 0.0933185         cumTime= 1.38724        edges= 36455    schur= 1        lambda= 1.671539        levenbergIter= 1
iteration= 12    chi2= 66512.610298      time= 0.0926209         cumTime= 1.47986        edges= 36455    schur= 1        lambda= 0.557180        levenbergIter= 1
iteration= 13    chi2= 64398.037038      time= 0.0923828         cumTime= 1.57224        edges= 36455    schur= 1        lambda= 0.185727        levenbergIter= 1
iteration= 14    chi2= 62541.759271      time= 0.0935335         cumTime= 1.66577        edges= 36455    schur= 1        lambda= 0.061909        levenbergIter= 1
iteration= 15    chi2= 61634.396853      time= 0.0922514         cumTime= 1.75802        edges= 36455    schur= 1        lambda= 0.041273        levenbergIter= 1
iteration= 16    chi2= 61024.263916      time= 0.0927281         cumTime= 1.85075        edges= 36455    schur= 1        lambda= 0.025962        levenbergIter= 1
iteration= 17    chi2= 60820.848997      time= 0.0925179         cumTime= 1.94327        edges= 36455    schur= 1        lambda= 0.012472        levenbergIter= 1
iteration= 18    chi2= 60766.794064      time= 0.0922666         cumTime= 2.03554        edges= 36455    schur= 1        lambda= 0.004157        levenbergIter= 1
iteration= 19    chi2= 60758.333420      time= 0.0923991         cumTime= 2.12794        edges= 36455    schur= 1        lambda= 0.001386        levenbergIter= 1

On problem 39-18060-pre.txt:
Symforce:

[info] LM<sym::Optimize> [iter    0] lambda: 1.000e+00, error prev/linear/new: 8.409e+06/7.178e+04/7.707e+04, rel reduction: 9.90834e-01, gain ratio: 9.99365e-01
[info] LM<sym::Optimize> [iter    1] lambda: 3.333e-01, error prev/linear/new: 7.707e+04/6.610e+04/6.618e+04, rel reduction: 1.41344e-01, gain ratio: 9.92939e-01
[info] LM<sym::Optimize> [iter    2] lambda: 1.111e-01, error prev/linear/new: 6.618e+04/6.282e+04/6.299e+04, rel reduction: 4.82298e-02, gain ratio: 9.51647e-01
[info] LM<sym::Optimize> [iter    3] lambda: 3.704e-02, error prev/linear/new: 6.299e+04/6.159e+04/4.817e+128, rel reduction: -7.64812e+123, gain ratio: -3.45717e+125
[info] LM<sym::Optimize> [iter    4] lambda: 7.407e-02, error prev/linear/new: 6.299e+04/6.187e+04/2.300e+127, rel reduction: -3.65125e+122, gain ratio: -2.05215e+124
[info] LM<sym::Optimize> [iter    5] lambda: 2.963e-01, error prev/linear/new: 6.299e+04/6.234e+04/6.659e+04, rel reduction: -5.72002e-02, gain ratio: -5.58225e+00
[info] LM<sym::Optimize> [iter    6] lambda: 2.370e+00, error prev/linear/new: 6.299e+04/6.271e+04/6.278e+04, rel reduction: 3.21710e-03, gain ratio: 7.26956e-01
[info] LM<sym::Optimize> [iter    7] lambda: 2.149e+00, error prev/linear/new: 6.278e+04/6.262e+04/6.263e+04, rel reduction: 2.49568e-03, gain ratio: 9.72698e-01
[info] LM<sym::Optimize> [iter    8] lambda: 7.162e-01, error prev/linear/new: 6.263e+04/6.243e+04/6.243e+04, rel reduction: 3.13925e-03, gain ratio: 9.83589e-01
[info] LM<sym::Optimize> [iter    9] lambda: 2.387e-01, error prev/linear/new: 6.243e+04/6.209e+04/1.337e+120, rel reduction: -2.14124e+115, gain ratio: -3.91084e+117
[info] LM<sym::Optimize> [iter   10] lambda: 4.775e-01, error prev/linear/new: 6.243e+04/6.222e+04/6.223e+04, rel reduction: 3.22158e-03, gain ratio: 9.61881e-01
[info] LM<sym::Optimize> [iter   11] lambda: 1.592e-01, error prev/linear/new: 6.223e+04/6.189e+04/2.788e+126, rel reduction: -4.48069e+121, gain ratio: -8.30995e+123
[info] LM<sym::Optimize> [iter   12] lambda: 3.183e-01, error prev/linear/new: 6.223e+04/6.202e+04/4.647e+123, rel reduction: -7.46736e+118, gain ratio: -2.20406e+121
[info] LM<sym::Optimize> [iter   13] lambda: 1.273e+00, error prev/linear/new: 6.223e+04/6.216e+04/6.220e+04, rel reduction: 4.36058e-04, gain ratio: 3.78051e-01
[info] LM<sym::Optimize> [iter   14] lambda: 1.292e+00, error prev/linear/new: 6.220e+04/6.210e+04/8.991e+122, rel reduction: -1.44546e+118, gain ratio: -8.98504e+120
[info] LM<sym::Optimize> [iter   15] lambda: 2.584e+00, error prev/linear/new: 6.220e+04/6.213e+04/1.021e+121, rel reduction: -1.64072e+116, gain ratio: -1.38597e+119
[info] LM<sym::Optimize> [iter   16] lambda: 1.033e+01, error prev/linear/new: 6.220e+04/6.215e+04/1.278e+118, rel reduction: -2.05440e+113, gain ratio: -2.44828e+116
[info] LM<sym::Optimize> [iter   17] lambda: 8.267e+01, error prev/linear/new: 6.220e+04/6.216e+04/1.351e+115, rel reduction: -2.17163e+110, gain ratio: -2.95943e+113
[info] LM<sym::Optimize> [iter   18] lambda: 1.323e+03, error prev/linear/new: 6.220e+04/6.216e+04/6.223e+04, rel reduction: -5.09143e-04, gain ratio: -7.07684e-01
[info] LM<sym::Optimize> [iter   19] lambda: 4.233e+04, error prev/linear/new: 6.220e+04/6.216e+04/6.216e+04, rel reduction: 7.17706e-04, gain ratio: 9.98971e-01
[info] LM<sym::Optimize> [iter   20] lambda: 1.411e+04, error prev/linear/new: 6.216e+04/6.216e+04/6.216e+04, rel reduction: 8.42919e-07, gain ratio: 9.98889e-01
[info] LM<sym::Optimize> Optimization finished with status: SUCCESS
[info] Finished in 22 iterations

SymForce TicToc Results:
   Name                                              :     Count      | Total Time (s) | Mean Time (s)  |  Max Time (s)  |  Min Time (s) 
-----------------------------------------------------+----------------+----------------+----------------+----------------+----------------
Optimizer<sym::Optimize>::Optimize                   :       1        |     8.6397     |     8.6397     |     8.6397     |     8.6397    
Optimizer<sym::Optimize>::IterateToConvergence       :       1        |     8.6291     |     8.6291     |     8.6291     |     8.6291    
LM<sym::Optimize>::Iterate()                         :       21       |     8.6274     |    0.41083     |     1.4774     |    0.34525    
LM<sym::Optimize>: SparseFactorize                   :       21       |     5.8408     |    0.27813     |    0.29005     |    0.26338    
Linearizer<sym::Optimize>::Relinearize::NonFirst()   :       21       |     1.3261     |    0.063145    |    0.069488    |    0.062319   
LM<sym::Optimize>: EvaluateFirst                     :       1        |    0.74185     |    0.74185     |    0.74185     |    0.74185    
Linearizer<sym::Optimize>::Relinearize::First()      :       1        |    0.74184     |    0.74184     |    0.74184     |    0.74184    
LM<sym::Optimize>: AnalyzePattern                    :       1        |    0.38182     |    0.38182     |    0.38182     |    0.38182    
LM<sym::Optimize>: SparseSolve                       :       21       |    0.28493     |    0.013568    |    0.014506    |    0.013143   
LM<sym::Optimize>: DampHessian                       :       21       |    0.016569    |   0.00078901   |   0.00083139   |   0.00077227  
LM<sym::Optimize>: Update                            :       21       |    0.011642    |   0.00055438   |   0.0050041    |   0.00032775  
LM<sym::Optimize>: ResetHessianDiagonal              :       21       |   0.0096815    |   0.00046103   |   0.00052488   |   0.00045333  
LM<sym::Optimize>::ResetState                        :       1        |   0.0089456    |   0.0089456    |   0.0089456    |   0.0089456   
Optimizer<sym::Optimize>::CopyValuesAndLinearization :       1        |   0.0017495    |   0.0017495    |   0.0017495    |   0.0017495   
LM<sym::Optimize>: LinearErrorFromValues             :       21       |   0.0015504    |   7.383e-05    |   7.706e-05    |   7.1192e-05  
LM<sym::Optimize>: IterationStats                    :       21       |   0.0011176    |   5.3219e-05   |   0.00015264   |    4.2e-05    
LM<sym::Optimize>: IterationStats - Print            :       21       |   0.0010895    |   5.1881e-05   |   0.00014845   |   4.083e-05   
LM<sym::Optimize>: accept_update bookkeeping         :       21       |   0.00039723   |   1.8916e-05   |   4.441e-05    |    2.1e-07    
LM<sym::Optimize>: FirstIterationStats               :       1        |    9.37e-05    |    9.37e-05    |    9.37e-05    |    9.37e-05   
LM<sym::Optimize>: StateStep                         :       21       |    3.43e-06    |   1.6333e-07   |    1.53e-06    |     8e-08

G2o:

iteration= 0     chi2= 469843.736152     time= 0.244323  cumTime= 0.244323       edges= 63551    schur= 1        lambda= 317978.794274   levenbergIter= 1
iteration= 1     chi2= 345626.275825     time= 0.170076  cumTime= 0.414399       edges= 63551    schur= 1        lambda= 105992.931425   levenbergIter= 1
iteration= 2     chi2= 297055.769229     time= 0.170647  cumTime= 0.585046       edges= 63551    schur= 1        lambda= 35330.977142    levenbergIter= 1
iteration= 3     chi2= 260403.337709     time= 0.172586  cumTime= 0.757632       edges= 63551    schur= 1        lambda= 11776.992381    levenbergIter= 1
iteration= 4     chi2= 227845.824462     time= 0.170453  cumTime= 0.928085       edges= 63551    schur= 1        lambda= 3925.664127     levenbergIter= 1
iteration= 5     chi2= 199127.178045     time= 0.168919  cumTime= 1.097  edges= 63551    schur= 1        lambda= 1308.554709     levenbergIter= 1
iteration= 6     chi2= 177309.864702     time= 0.172991  cumTime= 1.27   edges= 63551    schur= 1        lambda= 436.184903      levenbergIter= 1
iteration= 7     chi2= 165796.243039     time= 0.170644  cumTime= 1.44064        edges= 63551    schur= 1        lambda= 145.394968      levenbergIter= 1
iteration= 8     chi2= 161109.696408     time= 0.171958  cumTime= 1.6126         edges= 63551    schur= 1        lambda= 48.464989       levenbergIter= 1
iteration= 9     chi2= 158370.104693     time= 0.174957  cumTime= 1.78756        edges= 63551    schur= 1        lambda= 16.154996       levenbergIter= 1
iteration= 10    chi2= 155345.960923     time= 0.175806  cumTime= 1.96336        edges= 63551    schur= 1        lambda= 5.384999        levenbergIter= 1
iteration= 11    chi2= 151295.138116     time= 0.171235  cumTime= 2.1346         edges= 63551    schur= 1        lambda= 1.795000        levenbergIter= 1
iteration= 12    chi2= 144506.937536     time= 0.173147  cumTime= 2.30774        edges= 63551    schur= 1        lambda= 0.598333        levenbergIter= 1
iteration= 13    chi2= 135449.989827     time= 0.172167  cumTime= 2.47991        edges= 63551    schur= 1        lambda= 0.199444        levenbergIter= 1
iteration= 14    chi2= 127967.181726     time= 0.169876  cumTime= 2.64979        edges= 63551    schur= 1        lambda= 0.066481        levenbergIter= 1
iteration= 15    chi2= 124958.687989     time= 0.236957  cumTime= 2.88674        edges= 63551    schur= 1        lambda= 0.044321        levenbergIter= 2
iteration= 16    chi2= 123754.552889     time= 0.23867   cumTime= 3.12541        edges= 63551    schur= 1        lambda= 0.029547        levenbergIter= 2
iteration= 17    chi2= 122779.585830     time= 0.170557  cumTime= 3.29597        edges= 63551    schur= 1        lambda= 0.018805        levenbergIter= 1
iteration= 18    chi2= 122179.321785     time= 0.17036   cumTime= 3.46633        edges= 63551    schur= 1        lambda= 0.012537        levenbergIter= 1
iteration= 19    chi2= 121564.148542     time= 0.169314  cumTime= 3.63564        edges= 63551    schur= 1        lambda= 0.008358        levenbergIter= 1

On problem 21 g2o converges to 60758 in 2.12s while symforce converges to 30380 in 2.27s, the convergence is similar and the speed is similar. However contrary to symforce g2o does not implement the analytic computation of Jacobians and they rely on numerical differentiation (see their edge definition), I would expect a speed up from the implementation.
On problem 39 g2o converges to 121564 in 3.63s while symforce converges to 62160 in 8.63s.
Looking at the detailed timings for symforce it seems that SparseFactorize is taking a large amount of time compared to the whole convergence time of g2o. Is it possible that using a different factorization algorithm could explain this difference? G2o is using CHOLMOD, could that explain the difference?
Thank you again for your time

@aaron-skydio
Copy link
Member

So afaict the difference is that g2o has a specialized schur complement implementation, see here: https://github.com/RainerKuemmerle/g2o/blob/eec325a1da1273e87bc97887d49e70570f28570c/g2o/core/block_solver.hpp

Best I can tell, their use of CHOLMOD is just for doing sparse cholesky, and CHOLMOD's sparse cholesky should have very similar performance to SymForce's (CHOLMOD also has other solvers, but I believe sparse cholesky is the one we're talking about).

In this case though, they're forming the schur complement with a specialized implementation, which will have essentially the same asymptotic complexity as SymForce using METIS and our sparse cholesky, but will be some amount faster in practice (I'm not sure how much exactly). They also support doing this in parallel with OpenMP, not sure if that's enabled in your tests, but that'll also make things faster.

It might be interesting to add an explicit schur complement solver to SymForce - we actually have one already in symforce/opt/sparse_schur_solver.h, it just doesn't quite conform to the API expected by our nonlinear optimizer, so it would be a little bit of effort to plug it in there (we currently use it for other things).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants