
Planar object pose estimation from RGBD #671

Merged: 14 commits merged into lagadic:master on Jan 29, 2020

Conversation

@s-trinh (Contributor) commented Dec 4, 2019

See #668
/cc @jmirabel

If you want to test whether the tag pose estimation is better, the function to use is:

static bool computePoseRGBD(const vpImage<float> &depthMap, const std::vector<vpImagePoint> &corners,
                            const vpCameraParameters &colorIntrinsics, double tagSize, vpHomogeneousMatrix &cMo);
  • depthMap is an NxM image that contains the depth. The depth map must be registered with the color image (same image size; the same (u,v) coordinates in the depth map represent the same physical element as in the color image)
  • corners are the four tag corners detected in the color image
  • colorIntrinsics are the color intrinsic parameters (equal to the depth intrinsic parameters, since the depth map is registered with the color image)
  • tagSize is the tag size; the 3D model corners are placed at ±tagSize/2
  • cMo is the estimated tag pose
  • the function returns true if the computation succeeded

And then you can do:

std::vector<std::vector<vpImagePoint> > tags_corners = detector.getPolygon();
for (size_t i = 0; i < tags_corners.size(); i++) {
  vpHomogeneousMatrix cMo;
  if (vpPose::computePoseRGBD(depthMap, tags_corners[i], cam, tagSize, cMo)) {
    vpDisplay::displayFrame(I_color2, cMo, cam, tagSize/2, vpColor::none, 3);
  }
}

For my tests, I used the RealSense sensor, which can directly provide the depth map registered with the color image. It also keeps the code much simpler.

I don't know how you will get the depth information, so you may need to register the depth map with the color image manually. It should be something like the following (a rough code sketch is given after the list):

  • iterate over the depth map
  • for each depth pixel (u,v), compute (Xd, Yd, Zd) using the depth intrinsics and the depth value
  • transform the point into the color frame using the extrinsics between the depth and the color frames
  • project (Xc, Yc, Zc) onto the image plane to get the corresponding (u,v) coordinates in the color image

If the fov of the depth sensor is larger than that of the color sensor, you should not get too many holes in the registered depth map.
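
For illustration, a minimal sketch of such a registration loop using ViSP's pixel/meter conversion helpers could look like the code below. The function name, the c_M_d extrinsic parameter and the nearest-pixel rounding are illustrative assumptions, not code from this PR:

#include <visp3/core/vpCameraParameters.h>
#include <visp3/core/vpColVector.h>
#include <visp3/core/vpHomogeneousMatrix.h>
#include <visp3/core/vpImage.h>
#include <visp3/core/vpMeterPixelConversion.h>
#include <visp3/core/vpPixelMeterConversion.h>

// Register a raw depth map (in meters) with the color image.
// c_M_d is the extrinsic transformation from the depth frame to the color frame.
vpImage<float> registerDepthToColor(const vpImage<float> &depth,
                                    const vpCameraParameters &depthIntrinsics,
                                    const vpCameraParameters &colorIntrinsics,
                                    const vpHomogeneousMatrix &c_M_d,
                                    unsigned int colorHeight, unsigned int colorWidth)
{
  vpImage<float> registered(colorHeight, colorWidth, 0.f);
  for (unsigned int v = 0; v < depth.getHeight(); v++) {
    for (unsigned int u = 0; u < depth.getWidth(); u++) {
      float Zd = depth[v][u];
      if (Zd <= 0.f)
        continue;
      // (u,v) + depth -> (Xd, Yd, Zd) in the depth frame
      double xd, yd;
      vpPixelMeterConversion::convertPoint(depthIntrinsics, u, v, xd, yd);
      vpColVector Pd(4);
      Pd[0] = xd * Zd; Pd[1] = yd * Zd; Pd[2] = Zd; Pd[3] = 1.0;
      // Transform the 3D point into the color frame
      vpColVector Pc = c_M_d * Pd;
      if (Pc[2] <= 0.0)
        continue;
      // Project (Xc, Yc, Zc) into the color image plane
      double uc, vc;
      vpMeterPixelConversion::convertPoint(colorIntrinsics, Pc[0] / Pc[2], Pc[1] / Pc[2], uc, vc);
      int ui = static_cast<int>(uc + 0.5), vi = static_cast<int>(vc + 0.5);
      if (ui >= 0 && vi >= 0 && ui < static_cast<int>(colorWidth) && vi < static_cast<int>(colorHeight)) {
        // Keep the closest depth when several depth pixels project onto the same color pixel
        if (registered[vi][ui] == 0.f || Pc[2] < registered[vi][ui])
          registered[vi][ui] = static_cast<float>(Pc[2]);
      }
    }
  }
  return registered;
}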

@jmirabel commented Dec 5, 2019

Thank you @s-trinh

@ksyy could you give this a try?

@jmirabel commented Dec 9, 2019

@ksyy tried your changes.

At the moment, we have an issue when the angle between the tag normal and the camera normal is close to 90 degrees. In this case, the extracted depth information does not belong to the tag. This is a registration issue. We will redo the experiments after calibrating the depth extrinsic parameters when we have some time.

Our tests at the moment have not been thorough enough to say whether this method works better than the RGB one. Except for the extreme case above, we couldn't find any obvious pitfalls. I will keep you updated when we go further.

@s-trinh (Contributor, Author) commented Dec 9, 2019

Thanks for the feedback.
Which RGBD sensor are you using?

@ksyy commented Dec 10, 2019

We are using an Orbbec Astra Pro mounted on a TALOS robot.

@jmirabel commented:
The color and depth images are not synchronized. This may be the reason for the depth issue above. We are waiting for a new camera (new in January) which shouldn't suffer from this issue.

Sadly, with my current simulation setup, I use RViz to generate the images and it does not provide a depth image. I opened lucasw/rviz_camera_stream#27 some time ago but did not have time to implement it.

@fspindle (Contributor) commented:
@s-trinh I introduced an example to be able to test vpPose::computePoseRGBD() live with a RealSense D435 camera. There may be a better way to get the depth map as a vector of float. From my quick tests, I couldn't find a situation where the pose estimation returned by vpDetectorAprilTag fails, even with a tag that is very small in the image and the camera at a high angle wrt the tag X or Y axis. Since I don't have a tripod at hand right now, I will test again later and let you know.

@s-trinh (Contributor, Author) commented Dec 19, 2019

This is the code I used. It is better to display the two results in two different images.

code

#include <iostream>

#include <visp3/core/vpConfig.h>

#if defined(VISP_HAVE_REALSENSE2)
#include <visp3/core/vpIoTools.h>
#include <visp3/core/vpImageConvert.h>
#include <visp3/io/vpImageIo.h>
#include <visp3/detection/vpDetectorAprilTag.h>
#include <visp3/gui/vpDisplayX.h>
#include <visp3/sensor/vpRealSense2.h>
#include <visp3/vision/vpPose.h>

int main(int argc, char *argv[])
{
    bool save = false;
    double tagSize = 0.0415;
    int method = 0;
    for (int i = 1; i < argc; i++) {
        if (std::string(argv[i]) == "--save") {
            save = true;
        } else if (i+1 < argc && std::string(argv[i]) == "--tagSize") {
            tagSize = atof(argv[i+1]);
        } else if (i+1 < argc && std::string(argv[i]) == "--method") {
            method = atoi(argv[i+1]);
        }
    }

    vpDetectorAprilTag::vpPoseEstimationMethod poseEstimationMethod = vpDetectorAprilTag::HOMOGRAPHY_VIRTUAL_VS;
    if (method == 1) {
        poseEstimationMethod = vpDetectorAprilTag::DEMENTHON_VIRTUAL_VS;
    } else if (method == 2) {
        poseEstimationMethod = vpDetectorAprilTag::LAGRANGE_VIRTUAL_VS;
    }

    std::cout << "save: " << save << std::endl;
    std::cout << "tagSize: " << tagSize << std::endl;
    std::cout << "poseEstimationMethod: " << poseEstimationMethod << std::endl;

    std::string save_folder = "";
    if (save) {
        save_folder = vpTime::getDateTime("%Y-%m-%d_%H-%M-%S");
        std::cout << "save_folder: " << save_folder << std::endl;
        vpIoTools::makeDirectory(save_folder);
    }

    const int width = 640, height = 480;
    vpImage<vpRGBa> I_color(height, width);
    vpImage<unsigned char> I;
    vpImage<uint16_t> I_depth_raw(height, width);
    vpImage<vpRGBa> I_depth;

    // RealSense
    vpRealSense2 g;
    rs2::config config;
    config.enable_stream(RS2_STREAM_COLOR, width, height, RS2_FORMAT_RGBA8, 30);
    config.enable_stream(RS2_STREAM_DEPTH, width, height, RS2_FORMAT_Z16, 30);
    g.open(config);

    const float depth_scale = g.getDepthScale();

    // Camera intrinsic parameters
    vpCameraParameters cam = g.getCameraParameters(rs2_stream::RS2_STREAM_COLOR);
    std::cout << "Cam:\n" << cam << std::endl;

    rs2::align align_to_color(RS2_STREAM_COLOR);
    g.acquire(reinterpret_cast<unsigned char *>(I_color.bitmap), reinterpret_cast<unsigned char *>(I_depth_raw.bitmap),
              NULL, NULL, &align_to_color);

    vpImage<vpRGBa> I_color2 = I_color;
    vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);
    vpDisplayX d1(I_color, 0, 0, "Pose from Homography");
    vpDisplayX d2(I_color2, I_color.getWidth(), 0, "Pose from RGBD fusion");
    vpDisplayX d3(I_depth, 0, I_color.getHeight(), "Depth");

    // For pose estimation
    std::vector<vpPoint> pose_points;
    pose_points.push_back(vpPoint(-tagSize/2, -tagSize/2, 0));
    pose_points.push_back(vpPoint( tagSize/2, -tagSize/2, 0));
    pose_points.push_back(vpPoint( tagSize/2,  tagSize/2, 0));
    pose_points.push_back(vpPoint(-tagSize/2,  tagSize/2, 0));

    //AprilTag
    vpDetectorAprilTag::vpAprilTagFamily tagFamily = vpDetectorAprilTag::TAG_36h11;
    float quad_decimate = 1.0;
    int nThreads = 1;

    vpDetectorAprilTag detector(tagFamily);
    detector.setAprilTagQuadDecimate(quad_decimate);
    detector.setAprilTagPoseEstimationMethod(poseEstimationMethod);
    detector.setAprilTagNbThreads(nThreads);

    // Image results
    int cpt_frame = 0;

    vpImage<float> depthMap;
    while (true) {
        g.acquire(reinterpret_cast<unsigned char *>(I_color.bitmap), reinterpret_cast<unsigned char *>(I_depth_raw.bitmap),
                  NULL, NULL, &align_to_color);

        I_color2 = I_color;
        vpImageConvert::convert(I_color, I);
        vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);

        vpDisplay::display(I_color);
        vpDisplay::display(I_color2);
        vpDisplay::display(I_depth);

        depthMap.resize(I_depth_raw.getHeight(), I_depth_raw.getWidth());
        for (unsigned int i = 0; i < I_depth_raw.getHeight(); i++) {
            for (unsigned int j = 0; j < I_depth_raw.getWidth(); j++) {
                if (I_depth_raw[i][j]) {
                    float Z = I_depth_raw[i][j] * depth_scale;
                    depthMap[i][j] = Z;
                } else {
                    depthMap[i][j] = 0;
                }
            }
        }

        std::vector<vpHomogeneousMatrix> cMo_vec;
        detector.detect(I, tagSize, cam, cMo_vec);

        for (size_t i = 0; i < cMo_vec.size(); i++) {
            vpDisplay::displayFrame(I_color, cMo_vec[i], cam, tagSize/2, vpColor::none, 3);
        }

        std::vector<std::vector<vpImagePoint> > tags_corners = detector.getPolygon();
        for (size_t i = 0; i < tags_corners.size(); i++) {
            vpHomogeneousMatrix cMo;
            if (vpPose::computePoseRGBD(depthMap, tags_corners[i], cam, tagSize, cMo)) {
                vpDisplay::displayFrame(I_color2, cMo, cam, tagSize/2, vpColor::none, 3);
            }
        }

        vpDisplay::displayText(I_color, 20, 20, "Pose from homography + VVS", vpColor::red);
        vpDisplay::displayText(I_color2, 20, 20, "Pose from RGBD fusion", vpColor::red);


        vpDisplay::flush(I_color);
        vpDisplay::flush(I_color2);
        vpDisplay::flush(I_depth);

        if (vpDisplay::getClick(I_color, false)) {
            break;
        }

        if (save) {
            char buffer[256];
            sprintf(buffer, "image_results_%04d.png", cpt_frame++);
            std::string filename = save_folder + "/" + buffer;
            // Grab the displayed RGBD-fusion result (with overlays) and write it to disk
            vpImage<vpRGBa> I_save;
            vpDisplay::getImage(I_color2, I_save);
            vpImageIo::write(I_save, filename);
        }
    }

    return EXIT_SUCCESS;
}
#else
int main() {
    return 0;
}
#endif

It is not easy to reproduce until you manage to do it once.

One way is to display this image, zoomed, on a monitor. With the sensor placed in a region to the right of the tag (looking at the center of the tag), at some point you can find a pose where the z-axis flips continuously.

Another option is to print a tag and deform it so that it is no longer flat.

- rename vpPose::computePoseRGBD() into vpPose::computePlanarObjectPoseFromRGBD()
- introduce doc
- update tutorial to display the estimated pose with and without using depth
- consider the case where the z_aligned parameter is modified by the user
@fspindle changed the title from "[Test] Pose estimation from RGBD" to "Planar object pose estimation from RGBD" on Jan 17, 2020
@fspindle marked this pull request as ready for review on January 21, 2020
@fspindle (Contributor) commented:
@s-trinh Ready to merge on my side

@s-trinh (Contributor, Author) commented Jan 22, 2020

@fspindle I will have a look.

Did you manage to reproduce the issue?
Do you have any feedback from other people who have encountered the issue and tried this method successfully?

@fspindle (Contributor) commented:
Yes, I was able to reproduce the issue by displaying this image on my screen.

No feedback from other people.

\param[in] colorIntrinsics : Camera parameters used to convert \e corners from pixel to meters.
\param[in] point3d : Vector of 3D points corresponding to the model of the planar object.
\param[out] cMo : Computed pose.
\param[out] confidence : Pose estimation confidence index in range [0, 1]. When values are close to zero
@s-trinh (Contributor, Author) commented Jan 23, 2020

I don't like the word confidence. It looks too simplistic to use the depth filling ratio as a confidence measure.
I don't have recent experience with the method. If you introduced it because you sometimes got badly estimated poses with this method, I would propose renaming the parameter. I don't have a good suggestion though (depth filling ratio? depth data percentage?).

There are also the M-estimator weights that should give some information.

And maybe also the plane fitting error?

@fspindle (Contributor) replied:

Based on my experience, the "depth filling ratio", or better named the normalized number of depth data covering the tag, is useful to detect when the depth sensor is fully or partially occluded, or when there are holes in the depth map.

I agree that this is too simple to be a good confidence indicator.

A good compromise that I found is to compute the confidence index as the product of the normalized number of depth data covering the tag and the normalized M-estimator weights returned by the robust estimation of the tag 3D plane. The results I obtained look good.
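
A minimal sketch of that combination could look like this (the helper name and its parameters are illustrative assumptions, not the actual ViSP implementation):

#include <visp3/core/vpColVector.h>

// Confidence index as described above: depth coverage ratio multiplied by the
// mean M-estimator weight from the robust plane fit. Both factors lie in [0, 1].
double confidenceIndex(unsigned int nbValidDepthSamplesInTag, // depth pixels with valid data inside the tag
                       unsigned int nbPixelsInTag,            // total pixels inside the tag quadrilateral
                       const vpColVector &mEstimatorWeights)  // weights from the robust plane estimation
{
  if (nbPixelsInTag == 0 || mEstimatorWeights.size() == 0)
    return 0.0;
  double fillRatio = static_cast<double>(nbValidDepthSamplesInTag) / nbPixelsInTag;
  double meanWeight = mEstimatorWeights.sum() / mEstimatorWeights.size();
  return fillRatio * meanWeight; // close to 0 when depth is missing or the plane fit is poor
}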

\return true if pose estimation succeed, false otherwise.
*/
bool vpPose::computePlanarObjectPoseFromRGBD(const vpImage<float> &depthMap, const std::vector<vpImagePoint> &corners,
const vpCameraParameters &colorIntrinsics, const std::vector<vpPoint> &point3d, vpHomogeneousMatrix &cMo, double &confidence)
@s-trinh (Contributor, Author) commented:

Are the estimated poses with this method reliable in your tests?

Generally, I don't like having to pass too many parameters. If confidence is necessary to evaluate whether the estimated pose is valid, keep it; otherwise I would suggest using something like double *good_parameter_name = NULL.

@fspindle (Contributor) replied:

It is very difficult to make the z-axis flipping phenomenon appear using a real tag with the RealSense D435. I succeeded by crumpling the tag and observing a very small tag in the image.

As mentioned in a previous discussion, this phenomenon also occurs when displaying this image on my screen and detecting the upper-right tag with the RealSense.

In both cases, the results obtained with this method were reliable.

OK for passing a pointer.

@s-trinh (Contributor, Author) commented Jan 24, 2020

I think that:

  • vpDetectorAprilTag::getTagsId() should have been introduced much earlier
  • and computePose() should have handled different tag sizes

Anyway, for getPoint3D(), a proposal would be:

std::vector<std::vector<vpPoint>> getPoints3D(const std::vector<int>& tagsId, const std::map<int, double>& tagSize);

The size of the output of getPoints3D() should match the size of tagsId.

With something like:

std::map<int, double> tagsSize;
tagsSize[-1] = 0.05; // default tag size
tagsSize[3] = 0.1;   // custom size for the tag with id=3
// if neither -1 nor the tag id is present in the map, use a default tag size of 1

//[...]
detector.detect(I);
std::vector<int> tagsId = detector.getTagsId(); // returns the ids of all the previously detected tags
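
To make the proposal concrete, the snippet could then continue as follows (illustrative only: getPoints3D() is just a proposal at this point, and depthMap and cam are reused from the earlier example):

std::vector<std::vector<vpImagePoint> > corners = detector.getPolygon();
std::vector<std::vector<vpPoint> > points3d = detector.getPoints3D(tagsId, tagsSize); // proposed API
for (size_t i = 0; i < tagsId.size(); i++) {
  vpHomogeneousMatrix cMo;
  double confidence;
  // points3d[i] holds the four 3D corners of tag tagsId[i], using its size from the map
  // (falling back to the default size stored under key -1)
  if (vpPose::computePlanarObjectPoseFromRGBD(depthMap, corners[i], cam, points3d[i], cMo, confidence)) {
    // use cMo, e.g. display the tag frame as in the earlier example
  }
}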

@@ -251,6 +251,7 @@ class VISP_EXPORT vpDetectorAprilTag : public vpDetectorBase
std::vector<vpHomogeneousMatrix> &cMo_vec, std::vector<vpHomogeneousMatrix> *cMo_vec2=NULL,
std::vector<double> *projErrors=NULL, std::vector<double> *projErrors2=NULL);

std::vector<std::vector<vpPoint> > getPoint3D();
@s-trinh (Contributor, Author) commented:

The current getPoint3D() will not work if getPose() is used.
Also, it requires actually computing the tag poses, while the 3D tag coordinates can be computed easily.

I suggest adding a getPoint3D() that is independent of the pose computation machinery in vpDetectorAprilTag.
This would allow using getTagsCorners() introduced in #680 to compute the pose with OpenCV or any other library.

@fspindle (Contributor) replied:

You are right. I will make the changes.

@fspindle fspindle merged commit 3cde0ba into lagadic:master Jan 29, 2020