Add nominal beluga vs nav2 benchmark #80

Merged: 1 commit into main on Jan 9, 2024
Conversation

@hidmic (Collaborator) commented Oct 29, 2023

Proposed changes

This PR adds a benchmark to compare Beluga AMCL and Nav2 AMCL for a nominal configuration.

Type of change

  • 🐛 Bugfix (change which fixes an issue)
  • 🚀 Feature (change which adds functionality)
  • 📚 Documentation (change which fixes or extends documentation)

Checklist

  • Lint and unit tests (if any) pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

@hidmic (Collaborator, Author) commented Oct 29, 2023

Draft report.pdf. Interestingly enough, Beluga does worse than Nav2.

@glpuga (Collaborator) commented Oct 30, 2023

> Draft report.pdf. Interestingly enough, Beluga does worse than Nav2.

I'm not really surprised, since this is the first time we've measured it. The max errors are very different too.

I wonder if a large error at some point in the interval is skewing the results. Are the medians very different too? How about the APE localization error distributions?

In the meantime I'll use these results in the paper; we can discuss them tomorrow.

@hidmic (Collaborator, Author) commented Oct 30, 2023

> I wonder if a large error at some point in the interval is skewing the results.

I bet it's outliers skewing our naive mean and covariance estimates. I suspect Nav2 AMCL's clustering logic is naturally shielding it from them, at the expense of under-reporting covariance for true multi-modal distributions (a reasonable tradeoff when you think about it). I'll see if I can prove this.
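
For intuition, here is a minimal sketch (not Beluga or Nav2 code, just synthetic numbers) of how a handful of outlier particles can drag a naive mean and inflate the covariance, while a crude dominant-cluster estimate stays put:

```python
import numpy as np

rng = np.random.default_rng(0)

# 990 particles around the "true" pose (1.0, 2.0), plus 10 far-away outliers.
inliers = rng.normal(loc=(1.0, 2.0), scale=0.05, size=(990, 2))
outliers = rng.normal(loc=(8.0, -3.0), scale=0.05, size=(10, 2))
particles = np.vstack([inliers, outliers])

# Naive estimate: plain mean and spread over all particles.
naive_mean = particles.mean(axis=0)
naive_std = particles.std(axis=0)

# Crude stand-in for a clustering step: keep only particles near the
# densest mode (the median is robust to the outliers) before averaging.
keep = np.linalg.norm(particles - np.median(particles, axis=0), axis=1) < 0.5
clustered_mean = particles[keep].mean(axis=0)
clustered_std = particles[keep].std(axis=0)

print("naive mean:    ", naive_mean)      # dragged toward the outliers
print("clustered mean:", clustered_mean)  # stays near (1.0, 2.0)
print("naive std:     ", naive_std)       # inflated by the outliers
print("clustered std: ", clustered_std)
```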

> Are the medians very different too? How about the APE localization error distributions?

I didn't plot distributions for this new report (yet), but the one in #76 shows heavier tails.

@glpuga (Collaborator) commented Oct 30, 2023

> I bet it's outliers skewing our naive mean and covariance estimates. I suspect Nav2 AMCL's clustering logic is naturally shielding it from them, at the expense of under-reporting covariance for true multi-modal distributions (a reasonable tradeoff when you think about it). I'll see if I can prove this.

I think so too.

Also, did you do more than one run? If so, did the trend hold?

Base automatically changed from hidmic/improve-beluga-vs-nav2 to main November 2, 2023 11:42
@hidmic hidmic force-pushed the hidmic/nominal-beluga-vs-nav2 branch from 0780b44 to 18be294 Compare November 2, 2023 11:48
@hidmic (Collaborator, Author) commented Nov 2, 2023

FYI @glpuga @nahueespinosa, you should be able to reproduce this by following the beluga_vs_nav2 README, with the corresponding change in the benchmark script name.

@glpuga (Collaborator) commented Nov 3, 2023

> FYI @glpuga @nahueespinosa, you should be able to reproduce this by following the beluga_vs_nav2 README, with the corresponding change in the benchmark script name.

My run seems to point in the same general direction, but Nav2 crashed when running the beam model.

report_glpuga.pdf

@hidmic (Collaborator, Author) commented Nov 4, 2023

I bet my upstream fix (ros-navigation/navigation2#3929) didn't make it into your Docker image.

@glpuga (Collaborator) commented Nov 4, 2023

I created https://github.com/ekumenlabs/kobuki_ros2_stack/pull/8 to help debug this.

@hidmic (Collaborator, Author) commented Nov 5, 2023

> I created https://github.com/ekumenlabs/kobuki_ros2_stack/pull/8 to help debug this.

Thanks, Gera! I've been replaying things manually out of laziness.

@hidmic (Collaborator, Author) commented Nov 5, 2023

After much fiddling, plotting distributions, and going back and forth between Beluga and Nav2, I think I found the main issue: Ekumen-OS/beluga#266. I also found Ekumen-OS/beluga#267 along the way.

report.pdf now looks the way it should.

@glpuga (Collaborator) commented Nov 5, 2023

> After much fiddling, plotting distributions, and going back and forth between Beluga and Nav2, I think I found the main issue: Ekumen-OS/beluga#266. I also found Ekumen-OS/beluga#267 along the way.
>
> report.pdf now looks the way it should.

That's truly awesome!

I'll update the paper with these results; let's discuss on Tuesday whether to push forward with ERF or aim for RSS instead.

I see that Nav2 now has a much larger error for the first dataset. Is this random, or did anything else change?

Screenshot from 2023-11-05 12-39-49

@hidmic (Collaborator, Author) commented Nov 5, 2023

> I see that Nav2 now has a much larger error for the first dataset. Is this random, or did anything else change?

I saw that too, and I don't have a good answer; nothing changed. I'll run the benchmarks again tonight with a larger number of iterations (we are at 10 iterations per variation; for statistical significance we should do at least 30 or 50, see the rough sketch below).

Edit: hmm, I think I rolled back the changes I made to Nav2 AMCL to disable clustering, but maybe I didn't? I'll nuke and rebuild everything just to be sure.
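
For reference, a back-of-the-envelope sketch of why more iterations help; the per-run spread below is a made-up number, the point is only that the standard error of the mean shrinks with the square root of the iteration count:

```python
import math

per_run_std = 0.04  # hypothetical std of the per-run RMS APE, in meters

for n in (10, 30, 50):
    sem = per_run_std / math.sqrt(n)
    print(f"{n:2d} iterations -> standard error of the mean ~ {sem:.4f} m")
```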

@hidmic (Collaborator, Author) commented Nov 6, 2023

Latest report.pdf, 30 iterations per variation. Results are roughly in line, with a worse max error. I must have left clustering disabled in the previous run. I'd be inclined to implement clustering for Beluga next.

@glpuga (Collaborator) commented Nov 8, 2023

Just out of caution, can we double-check these values? Reasons:

  • Beluga and Nav2 have exactly the same metrics (to all visible digits) for the likelihood field sensor model, except for the max.
  • All Nav2 values are exactly the same as in a previous report, even the max values.

Given the random nature of the filters, I'd expect the values to be very close, but not exactly the same.

Latest report:

Screenshot from 2023-11-08 09-42-15

Previous report:

Screenshot from 2023-11-08 09-42-09

@hidmic (Collaborator, Author) commented Nov 8, 2023

Sure. This is what I get if I work with the output data manually:

dataset                 trajectory          laser_model              rms      mean       std       max
hallway_localization    /beluga_amcl/pose   beam                0.072345  0.058647  0.042363  0.324634
hallway_localization    /beluga_amcl/pose   likelihood_field    0.071710  0.058152  0.041963  0.288981
hallway_localization    /nav2_amcl/pose     beam                0.067830  0.055431  0.039097  0.192166
hallway_localization    /nav2_amcl/pose     likelihood_field    0.071573  0.057687  0.042370  0.193696
hallway_return          /beluga_amcl/pose   beam                0.077700  0.062059  0.046757  0.349237
hallway_return          /beluga_amcl/pose   likelihood_field    0.074153  0.060620  0.042708  0.333891
hallway_return          /nav2_amcl/pose     beam                0.073039  0.058055  0.044323  0.256825
hallway_return          /nav2_amcl/pose     likelihood_field    0.070188  0.058336  0.039031  0.188596

The values are not the same, but they are close. In the report we round to mm precision, as I didn't think anything past that point was anything but noise. I may be mistaken, though.
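
In case it helps reproduce the numbers, this is roughly how such a summary could be assembled with pandas from per-sample APE values; the column names and the toy data below are illustrative, not the actual benchmark output schema:

```python
import numpy as np
import pandas as pd

# Hypothetical long-format frame: one absolute pose error sample per row.
errors = pd.DataFrame({
    "dataset": ["hallway_localization"] * 8,
    "trajectory": ["/beluga_amcl/pose"] * 4 + ["/nav2_amcl/pose"] * 4,
    "laser_model": ["beam", "beam", "likelihood_field", "likelihood_field"] * 2,
    "ape": [0.07, 0.06, 0.05, 0.08, 0.09, 0.04, 0.06, 0.07],
})

# Aggregate per dataset, trajectory, and laser model.
summary = (
    errors.groupby(["dataset", "trajectory", "laser_model"])["ape"]
    .agg(
        rms=lambda e: np.sqrt(np.mean(np.square(e))),
        mean="mean",
        std="std",
        max="max",
    )
)
print(summary.round(3))  # mm precision, as in the report
```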

@hidmic hidmic changed the title [WIP] Add nominal beluga vs nav2 benchmark Add nominal beluga vs nav2 benchmark Jan 9, 2024
@hidmic (Collaborator, Author) commented Jan 9, 2024

We should get this in, along with #81. @glpuga @nahueespinosa comments are welcome. Otherwise, I'll merge at EOD.

@glpuga (Collaborator) commented Jan 9, 2024

> We should get this in, along with #81. @glpuga @nahueespinosa comments are welcome. Otherwise, I'll merge at EOD.

I'm ok to merge with eyes closed.

@hidmic (Collaborator, Author) commented Jan 9, 2024

> I'm ok to merge with eyes closed.

🙈 going in!

@hidmic hidmic merged commit ecd7daf into main Jan 9, 2024
@hidmic hidmic deleted the hidmic/nominal-beluga-vs-nav2 branch January 9, 2024 21:20