Add nominal beluga vs nav2 benchmark #80

Merged: 1 commit into main on Jan 9, 2024
Conversation

@hidmic (Collaborator) commented Oct 29, 2023

Proposed changes

This PR adds a benchmark to compare Beluga AMCL and Nav2 AMCL for a nominal configuration.

Type of change

  • 🐛 Bugfix (change which fixes an issue)
  • 🚀 Feature (change which adds functionality)
  • 📚 Documentation (change which fixes or extends documentation)

Checklist

  • Lint and unit tests (if any) pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

@hidmic (Collaborator, Author) commented Oct 29, 2023

Draft report.pdf. Interestingly enough, Beluga does worse than Nav2.

@glpuga (Collaborator) commented Oct 30, 2023

> Draft report.pdf. Interestingly enough, Beluga does worse than Nav2.

I'm not really surprised, since this is the first time we've measured it. The max errors are very different too.

I wonder if a large error at some point in the interval is skewing the results. Are the medians very different too? How about the APE localization error distributions?

In the meantime I'll use these results in the paper; we can discuss them tomorrow.

@hidmic (Collaborator, Author) commented Oct 30, 2023

> I wonder if a large error at some point in the interval is skewing the results.

I bet it's outliers skewing our naive mean and covariance estimates. I suspect Nav2 AMCL's clustering logic is naturally shielding it from them, at the expense of under-reporting covariance for true multi-modal distributions (a reasonable tradeoff when you think about it). I'll see if I can prove this.
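
For intuition, here is a minimal sketch (not Beluga or Nav2 code, just synthetic numbers) of how a handful of outlier particles can drag a naive mean and inflate the covariance, while a crude dominant-cluster estimate stays put:

```python
import numpy as np

rng = np.random.default_rng(0)

# 990 particles around the "true" pose (1.0, 2.0), plus 10 far-away outliers.
inliers = rng.normal(loc=(1.0, 2.0), scale=0.05, size=(990, 2))
outliers = rng.normal(loc=(8.0, -3.0), scale=0.05, size=(10, 2))
particles = np.vstack([inliers, outliers])

# Naive estimate: plain mean and spread over all particles.
naive_mean = particles.mean(axis=0)
naive_std = particles.std(axis=0)

# Crude stand-in for a clustering step: keep only particles near the
# densest mode (the median is robust to the outliers) before averaging.
keep = np.linalg.norm(particles - np.median(particles, axis=0), axis=1) < 0.5
clustered_mean = particles[keep].mean(axis=0)
clustered_std = particles[keep].std(axis=0)

print("naive mean:    ", naive_mean)      # dragged toward the outliers
print("clustered mean:", clustered_mean)  # stays near (1.0, 2.0)
print("naive std:     ", naive_std)       # inflated by the outliers
print("clustered std: ", clustered_std)
```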

> Are the medians very different too? How about the APE localization error distributions?

I didn't plot distributions for this new report (yet), but the one in #76 shows heavier tails.

@glpuga (Collaborator) commented Oct 30, 2023

> I bet it's outliers skewing our naive mean and covariance estimates. I suspect Nav2 AMCL's clustering logic is naturally shielding it from them, at the expense of under-reporting covariance for true multi-modal distributions (a reasonable tradeoff when you think about it). I'll see if I can prove this.

I think so too.

Also, did you do more than one run? If so, did the trend hold?

Base automatically changed from hidmic/improve-beluga-vs-nav2 to main November 2, 2023 11:42
@hidmic hidmic force-pushed the hidmic/nominal-beluga-vs-nav2 branch from 0780b44 to 18be294 Compare November 2, 2023 11:48
@hidmic (Collaborator, Author) commented Nov 2, 2023

FYI @glpuga @nahueespinosa, you should be able to reproduce this by following the beluga_vs_nav2 README, with the corresponding change in the benchmark script name.

@glpuga (Collaborator) commented Nov 3, 2023

> FYI @glpuga @nahueespinosa, you should be able to reproduce this by following the beluga_vs_nav2 README, with the corresponding change in the benchmark script name.

My run seems to point in the same general direction, but Nav2 crashed when running the beam model.

report_glpuga.pdf

@hidmic (Collaborator, Author) commented Nov 4, 2023

I bet my upstream fix (ros-navigation/navigation2#3929) didn't make it into your Docker image.

@glpuga (Collaborator) commented Nov 4, 2023

I created https://github.com/ekumenlabs/kobuki_ros2_stack/pull/8 to help debug this.

@hidmic (Collaborator, Author) commented Nov 5, 2023

> I created https://github.com/ekumenlabs/kobuki_ros2_stack/pull/8 to help debug this.

Thanks, Gera! I've been replaying things manually out of laziness.

@hidmic (Collaborator, Author) commented Nov 5, 2023

After much fiddling, plotting distributions, and going back and forth between Beluga and Nav2, I think I found the main issue: Ekumen-OS/beluga#266. I also found Ekumen-OS/beluga#267 along the way.

report.pdf now looks the way it should.

@glpuga (Collaborator) commented Nov 5, 2023

> After much fiddling, plotting distributions, and going back and forth between Beluga and Nav2, I think I found the main issue: Ekumen-OS/beluga#266. I also found Ekumen-OS/beluga#267 along the way.
>
> report.pdf now looks the way it should.

That's truly awesome!

I'll update the paper with these results; let's discuss on Tuesday whether to push forward with ERF or aim for RSS instead.

I see that Nav2 now has a much larger error for the first dataset. Is this random, or did anything else change?

Screenshot from 2023-11-05 12-39-49

@hidmic (Collaborator, Author) commented Nov 5, 2023

> I see that Nav2 now has a much larger error for the first dataset. Is this random, or did anything else change?

I saw that too, and I don't have a good answer; nothing changed. I'll run the benchmarks again tonight with a larger number of iterations (we are at 10 iterations per variation; for statistical significance we should do at least 30 or 50, see the rough sketch below).

Edit: hmm, I think I rolled back the changes I made to Nav2 AMCL to disable clustering, but maybe I didn't? I'll nuke and rebuild everything just to be sure.
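
For reference, a back-of-the-envelope sketch of why more iterations help; the per-run spread below is a made-up number, the point is only that the standard error of the mean shrinks with the square root of the iteration count:

```python
import math

per_run_std = 0.04  # hypothetical std of the per-run RMS APE, in meters

for n in (10, 30, 50):
    sem = per_run_std / math.sqrt(n)
    print(f"{n:2d} iterations -> standard error of the mean ~ {sem:.4f} m")
```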

@hidmic (Collaborator, Author) commented Nov 6, 2023

Latest report.pdf, 30 iterations per variation. Results are roughly in line, with a worse max error. I must have left clustering disabled in the previous run. I'd be inclined to implement clustering for Beluga next.

@glpuga (Collaborator) commented Nov 8, 2023

Just out of caution, can we double-check these values? Reasons:

  • Beluga and Nav2 have exactly the same metrics (to all visible digits) for the likelihood field sensor model, except for the max.
  • All Nav2 values are exactly the same as in a previous report, even the max values.

Given the random nature of the filters, I'd expect the values to be very close, but not exactly the same.

Latest report:

Screenshot from 2023-11-08 09-42-15

Previous report:

Screenshot from 2023-11-08 09-42-09

@hidmic (Collaborator, Author) commented Nov 8, 2023

Sure. This is what I get if I work with the output data manually:

dataset                 trajectory          laser_model              rms      mean       std       max
hallway_localization    /beluga_amcl/pose   beam                0.072345  0.058647  0.042363  0.324634
hallway_localization    /beluga_amcl/pose   likelihood_field    0.071710  0.058152  0.041963  0.288981
hallway_localization    /nav2_amcl/pose     beam                0.067830  0.055431  0.039097  0.192166
hallway_localization    /nav2_amcl/pose     likelihood_field    0.071573  0.057687  0.042370  0.193696
hallway_return          /beluga_amcl/pose   beam                0.077700  0.062059  0.046757  0.349237
hallway_return          /beluga_amcl/pose   likelihood_field    0.074153  0.060620  0.042708  0.333891
hallway_return          /nav2_amcl/pose     beam                0.073039  0.058055  0.044323  0.256825
hallway_return          /nav2_amcl/pose     likelihood_field    0.070188  0.058336  0.039031  0.188596

The values are not the same, but they are close. In the report we round to mm precision, as I didn't think anything past that point was anything but noise. I may be mistaken, though.
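
In case it helps reproduce the numbers, this is roughly how such a summary could be assembled with pandas from per-sample APE values; the column names and the toy data below are illustrative, not the actual benchmark output schema:

```python
import numpy as np
import pandas as pd

# Hypothetical long-format frame: one absolute pose error sample per row.
errors = pd.DataFrame({
    "dataset": ["hallway_localization"] * 8,
    "trajectory": ["/beluga_amcl/pose"] * 4 + ["/nav2_amcl/pose"] * 4,
    "laser_model": ["beam", "beam", "likelihood_field", "likelihood_field"] * 2,
    "ape": [0.07, 0.06, 0.05, 0.08, 0.09, 0.04, 0.06, 0.07],
})

# Aggregate per dataset, trajectory, and laser model.
summary = (
    errors.groupby(["dataset", "trajectory", "laser_model"])["ape"]
    .agg(
        rms=lambda e: np.sqrt(np.mean(np.square(e))),
        mean="mean",
        std="std",
        max="max",
    )
)
print(summary.round(3))  # mm precision, as in the report
```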

@hidmic hidmic changed the title [WIP] Add nominal beluga vs nav2 benchmark Add nominal beluga vs nav2 benchmark Jan 9, 2024
@hidmic (Collaborator, Author) commented Jan 9, 2024

We should get this in, along with #81. @glpuga @nahueespinosa comments are welcome. Otherwise, I'll merge at EOD.

@glpuga (Collaborator) commented Jan 9, 2024

> We should get this in, along with #81. @glpuga @nahueespinosa comments are welcome. Otherwise, I'll merge at EOD.

I'm ok to merge with eyes closed.

@hidmic (Collaborator, Author) commented Jan 9, 2024

> I'm ok to merge with eyes closed.

🙈 going in!

@hidmic hidmic merged commit ecd7daf into main Jan 9, 2024
@hidmic hidmic deleted the hidmic/nominal-beluga-vs-nav2 branch January 9, 2024 21:20