
Planar object pose estimation from RGBD #671

Merged: 14 commits merged into lagadic:master on Jan 29, 2020

Conversation

@s-trinh (Contributor) commented Dec 4, 2019

See #668
/cc @jmirabel

If you want to test whether the tag pose estimation is better, the function to use is:

static bool computePoseRGBD(const vpImage<float> &depthMap, const std::vector<vpImagePoint> &corners,
                            const vpCameraParameters &colorIntrinsics, double tagSize, vpHomogeneousMatrix &cMo);
  • depthMap is an NxM image that contains the depth. The depth map must be registered with the color image (same image size; the same (u,v) coordinates in the depth map represent the same physical element as in the color image)
  • corners are the four tag corners detected in the color image
  • colorIntrinsics are the color intrinsic parameters (equal to the depth intrinsic parameters, since the depth map is registered with the color image)
  • tagSize is the tag size; the 3D model corners are placed at ±tagSize/2
  • cMo is the estimated tag pose
  • the function returns true if the computation succeeded

And then you can do:

std::vector<std::vector<vpImagePoint> > tags_corners = detector.getPolygon();
for (size_t i = 0; i < tags_corners.size(); i++) {
  vpHomogeneousMatrix cMo;
  if (vpPose::computePoseRGBD(depthMap, tags_corners[i], cam, tagSize, cMo)) {
    vpDisplay::displayFrame(I_color2, cMo, cam, tagSize/2, vpColor::none, 3);
  }
}

For my tests, I used the RealSense sensor, which can directly provide the depth map registered with the color image. It also keeps the code much simpler.

I don't know how you will get the depth information, so you may need to register the depth map with the color image manually. It should be something like the following (a rough code sketch is given after the list):

  • iterate over the depth map
  • for each depth pixel (u,v), compute (Xd, Yd, Zd) using the depth intrinsics and the depth value
  • transform the point into the color frame using the extrinsics between the depth and the color frames
  • project (Xc, Yc, Zc) onto the image plane to get the corresponding (u,v) coordinates in the color image

If the fov of the depth sensor is larger than that of the color sensor, you should not get too many holes in the registered depth map.
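
For illustration, a minimal sketch of such a registration loop using ViSP's pixel/meter conversion helpers could look like the code below. The function name, the c_M_d extrinsic parameter and the nearest-pixel rounding are illustrative assumptions, not code from this PR:

#include <visp3/core/vpCameraParameters.h>
#include <visp3/core/vpColVector.h>
#include <visp3/core/vpHomogeneousMatrix.h>
#include <visp3/core/vpImage.h>
#include <visp3/core/vpMeterPixelConversion.h>
#include <visp3/core/vpPixelMeterConversion.h>

// Register a raw depth map (in meters) with the color image.
// c_M_d is the extrinsic transformation from the depth frame to the color frame.
vpImage<float> registerDepthToColor(const vpImage<float> &depth,
                                    const vpCameraParameters &depthIntrinsics,
                                    const vpCameraParameters &colorIntrinsics,
                                    const vpHomogeneousMatrix &c_M_d,
                                    unsigned int colorHeight, unsigned int colorWidth)
{
  vpImage<float> registered(colorHeight, colorWidth, 0.f);
  for (unsigned int v = 0; v < depth.getHeight(); v++) {
    for (unsigned int u = 0; u < depth.getWidth(); u++) {
      float Zd = depth[v][u];
      if (Zd <= 0.f)
        continue;
      // (u,v) + depth -> (Xd, Yd, Zd) in the depth frame
      double xd, yd;
      vpPixelMeterConversion::convertPoint(depthIntrinsics, u, v, xd, yd);
      vpColVector Pd(4);
      Pd[0] = xd * Zd; Pd[1] = yd * Zd; Pd[2] = Zd; Pd[3] = 1.0;
      // Transform the 3D point into the color frame
      vpColVector Pc = c_M_d * Pd;
      if (Pc[2] <= 0.0)
        continue;
      // Project (Xc, Yc, Zc) into the color image plane
      double uc, vc;
      vpMeterPixelConversion::convertPoint(colorIntrinsics, Pc[0] / Pc[2], Pc[1] / Pc[2], uc, vc);
      int ui = static_cast<int>(uc + 0.5), vi = static_cast<int>(vc + 0.5);
      if (ui >= 0 && vi >= 0 && ui < static_cast<int>(colorWidth) && vi < static_cast<int>(colorHeight)) {
        // Keep the closest depth when several depth pixels project onto the same color pixel
        if (registered[vi][ui] == 0.f || Pc[2] < registered[vi][ui])
          registered[vi][ui] = static_cast<float>(Pc[2]);
      }
    }
  }
  return registered;
}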

@jmirabel commented Dec 5, 2019

Thank you @s-trinh

@ksyy could you give this a try?

@jmirabel commented Dec 9, 2019

@ksyy tried your changes.

At the moment, we have an issue when the angle between the tag normal and the camera normal is close to 90 degrees. In this case, the extracted depth information does not belong to the tag. This is a registration issue. We will redo the experiments after calibrating the depth extrinsic parameters when we have some time.

Our tests at the moment have not been thorough enough to say whether this method works better than the RGB one. Except for the extreme case above, we couldn't find any obvious pitfalls. I will keep you updated when we go further.

@s-trinh (Contributor, Author) commented Dec 9, 2019

Thanks for the feedback.
Which RGBD sensor are you using?

@ksyy commented Dec 10, 2019

We are using an Orbbec Astra Pro mounted on a TALOS robot.

@jmirabel commented:
The color and depth images are not synchronized. This may be the reason for the depth issue above. We are waiting for a new camera (new in January) which shouldn't suffer from this issue.

Sadly, with my current simulation setup, I use RViz to generate the images and it does not provide a depth image. I opened lucasw/rviz_camera_stream#27 some time ago but did not have time to implement it.

@fspindle (Contributor) commented:
@s-trinh I introduced an example to be able to test vpPose::computePoseRGBD() live with a RealSense D435 camera. There may be a better way to get the depth map as a vector of float. From my quick tests, I couldn't find a situation where the pose estimation returned by vpDetectorAprilTag fails, even with a tag that is very small in the image and the camera at a high angle wrt the tag X or Y axis. Since I don't have a tripod at hand right now, I will test again later and let you know.

@s-trinh (Contributor, Author) commented Dec 19, 2019

This is the code I used. It is better to display the two results in two different images.

code

#include <iostream>

#include <visp3/core/vpConfig.h>

#if defined(VISP_HAVE_REALSENSE2)
#include <visp3/core/vpIoTools.h>
#include <visp3/core/vpImageConvert.h>
#include <visp3/io/vpImageIo.h>
#include <visp3/detection/vpDetectorAprilTag.h>
#include <visp3/gui/vpDisplayX.h>
#include <visp3/sensor/vpRealSense2.h>
#include <visp3/vision/vpPose.h>

int main(int argc, char *argv[])
{
    bool save = false;
    double tagSize = 0.0415;
    int method = 0;
    for (int i = 1; i < argc; i++) {
        if (std::string(argv[i]) == "--save") {
            save = true;
        } else if (i+1 < argc && std::string(argv[i]) == "--tagSize") {
            tagSize = atof(argv[i+1]);
        } else if (i+1 < argc && std::string(argv[i]) == "--method") {
            method = atoi(argv[i+1]);
        }
    }

    vpDetectorAprilTag::vpPoseEstimationMethod poseEstimationMethod = vpDetectorAprilTag::HOMOGRAPHY_VIRTUAL_VS;
    if (method == 1) {
        poseEstimationMethod = vpDetectorAprilTag::DEMENTHON_VIRTUAL_VS;
    } else if (method == 2) {
        poseEstimationMethod = vpDetectorAprilTag::LAGRANGE_VIRTUAL_VS;
    }

    std::cout << "save: " << save << std::endl;
    std::cout << "tagSize: " << tagSize << std::endl;
    std::cout << "poseEstimationMethod: " << poseEstimationMethod << std::endl;

    std::string save_folder = "";
    if (save) {
        save_folder = vpTime::getDateTime("%Y-%m-%d_%H-%M-%S");
        std::cout << "save_folder: " << save_folder << std::endl;
        vpIoTools::makeDirectory(save_folder);
    }

    const int width = 640, height = 480;
    vpImage<vpRGBa> I_color(height, width);
    vpImage<unsigned char> I;
    vpImage<uint16_t> I_depth_raw(height, width);
    vpImage<vpRGBa> I_depth;

    // RealSense
    vpRealSense2 g;
    rs2::config config;
    config.enable_stream(RS2_STREAM_COLOR, width, height, RS2_FORMAT_RGBA8, 30);
    config.enable_stream(RS2_STREAM_DEPTH, width, height, RS2_FORMAT_Z16, 30);
    g.open(config);

    const float depth_scale = g.getDepthScale();

    // Camera intrinsic parameters
    vpCameraParameters cam = g.getCameraParameters(rs2_stream::RS2_STREAM_COLOR);
    std::cout << "Cam:\n" << cam << std::endl;

    rs2::align align_to_color(RS2_STREAM_COLOR);
    g.acquire(reinterpret_cast<unsigned char *>(I_color.bitmap), reinterpret_cast<unsigned char *>(I_depth_raw.bitmap),
              NULL, NULL, &align_to_color);

    vpImage<vpRGBa> I_color2 = I_color;
    vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);
    vpDisplayX d1(I_color, 0, 0, "Pose from Homography");
    vpDisplayX d2(I_color2, I_color.getWidth(), 0, "Pose from RGBD fusion");
    vpDisplayX d3(I_depth, 0, I_color.getHeight(), "Depth");

    // For pose estimation
    std::vector<vpPoint> pose_points;
    pose_points.push_back(vpPoint(-tagSize/2, -tagSize/2, 0));
    pose_points.push_back(vpPoint( tagSize/2, -tagSize/2, 0));
    pose_points.push_back(vpPoint( tagSize/2,  tagSize/2, 0));
    pose_points.push_back(vpPoint(-tagSize/2,  tagSize/2, 0));

    //AprilTag
    vpDetectorAprilTag::vpAprilTagFamily tagFamily = vpDetectorAprilTag::TAG_36h11;
    float quad_decimate = 1.0;
    int nThreads = 1;

    vpDetectorAprilTag detector(tagFamily);
    detector.setAprilTagQuadDecimate(quad_decimate);
    detector.setAprilTagPoseEstimationMethod(poseEstimationMethod);
    detector.setAprilTagNbThreads(nThreads);

    // Image results
    int cpt_frame = 0;

    vpImage<float> depthMap;
    while (true) {
        g.acquire(reinterpret_cast<unsigned char *>(I_color.bitmap), reinterpret_cast<unsigned char *>(I_depth_raw.bitmap),
                  NULL, NULL, &align_to_color);

        I_color2 = I_color;
        vpImageConvert::convert(I_color, I);
        vpImageConvert::createDepthHistogram(I_depth_raw, I_depth);

        vpDisplay::display(I_color);
        vpDisplay::display(I_color2);
        vpDisplay::display(I_depth);

        depthMap.resize(I_depth_raw.getHeight(), I_depth_raw.getWidth());
        for (unsigned int i = 0; i < I_depth_raw.getHeight(); i++) {
            for (unsigned int j = 0; j < I_depth_raw.getWidth(); j++) {
                if (I_depth_raw[i][j]) {
                    float Z = I_depth_raw[i][j] * depth_scale;
                    depthMap[i][j] = Z;
                } else {
                    depthMap[i][j] = 0;
                }
            }
        }

        std::vector<vpHomogeneousMatrix> cMo_vec;
        detector.detect(I, tagSize, cam, cMo_vec);

        for (size_t i = 0; i < cMo_vec.size(); i++) {
            vpDisplay::displayFrame(I_color, cMo_vec[i], cam, tagSize/2, vpColor::none, 3);
        }

        std::vector<std::vector<vpImagePoint> > tags_corners = detector.getPolygon();
        for (size_t i = 0; i < tags_corners.size(); i++) {
            vpHomogeneousMatrix cMo;
            if (vpPose::computePoseRGBD(depthMap, tags_corners[i], cam, tagSize, cMo)) {
                vpDisplay::displayFrame(I_color2, cMo, cam, tagSize/2, vpColor::none, 3);
            }
        }

        vpDisplay::displayText(I_color, 20, 20, "Pose from homography + VVS", vpColor::red);
        vpDisplay::displayText(I_color2, 20, 20, "Pose from RGBD fusion", vpColor::red);


        vpDisplay::flush(I_color);
        vpDisplay::flush(I_color2);
        vpDisplay::flush(I_depth);

        if (vpDisplay::getClick(I_color, false)) {
            break;
        }

        if (save) {
            char buffer[256];
            sprintf(buffer, "image_results_%04d.png", cpt_frame++);
            std::string filename = save_folder + "/" + buffer;
            // Grab the displayed RGBD-fusion result (with overlays) and write it to disk
            vpImage<vpRGBa> I_save;
            vpDisplay::getImage(I_color2, I_save);
            vpImageIo::write(I_save, filename);
        }
    }

    return EXIT_SUCCESS;
}
#else
int main() {
    return 0;
}
#endif

It is not easy to reproduce until you manage to do it once.

One way is to display this image, zoomed, on a monitor. With the sensor placed in a region to the right of the tag (looking at the center of the tag), at some point you can find a pose where the z-axis flips continuously.

Another option is to print a tag and deform it so that it is no longer flat.

- rename vpPose::computePoseRGBD() into vpPose::computePlanarObjectPoseFromRGBD()
- introduce doc
- update tutorial to display the estimated pose with and without using depth
- consider the case where the z_aligned parameter is modified by the user
@fspindle changed the title from "[Test] Pose estimation from RGBD" to "Planar object pose estimation from RGBD" on Jan 17, 2020
@fspindle marked this pull request as ready for review on January 21, 2020
@fspindle (Contributor) commented:
@s-trinh Ready to merge on my side

@s-trinh (Contributor, Author) commented Jan 22, 2020

@fspindle I will have a look.

Did you manage to reproduce the issue?
Do you have any feedback from other people who have encountered the issue and tried this method successfully?

@fspindle (Contributor) commented:
Yes, I was able to reproduce the issue by displaying this image on my screen.

No feedback from other people.

\param[in] colorIntrinsics : Camera parameters used to convert \e corners from pixel to meters.
\param[in] point3d : Vector of 3D points corresponding to the model of the planar object.
\param[out] cMo : Computed pose.
\param[out] confidence : Pose estimation confidence index in range [0, 1]. When values are close to zero
@s-trinh (Contributor, Author) commented Jan 23, 2020

I don't like the word confidence. It looks too simplistic to use the depth filling ratio as a confidence measure.
I don't have recent experience with the method. If you introduced it because you sometimes got badly estimated poses with this method, I would propose renaming the parameter. I don't have a good suggestion though (depth filling ratio? depth data percentage?).

There are also the M-estimator weights that should give some information.

And maybe also the plane fitting error?

@fspindle (Contributor) replied:

Based on my experience, the "depth filling ratio", or better named the normalized number of depth data covering the tag, is useful to detect when the depth sensor is fully or partially occluded, or when there are holes in the depth map.

I agree that this is too simple to be a good confidence indicator.

A good compromise that I found is to compute the confidence index as the product of the normalized number of depth data covering the tag and the normalized M-estimator weights returned by the robust estimation of the tag 3D plane. The results I obtained look good.
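
A minimal sketch of that combination could look like this (the helper name and its parameters are illustrative assumptions, not the actual ViSP implementation):

#include <visp3/core/vpColVector.h>

// Confidence index as described above: depth coverage ratio multiplied by the
// mean M-estimator weight from the robust plane fit. Both factors lie in [0, 1].
double confidenceIndex(unsigned int nbValidDepthSamplesInTag, // depth pixels with valid data inside the tag
                       unsigned int nbPixelsInTag,            // total pixels inside the tag quadrilateral
                       const vpColVector &mEstimatorWeights)  // weights from the robust plane estimation
{
  if (nbPixelsInTag == 0 || mEstimatorWeights.size() == 0)
    return 0.0;
  double fillRatio = static_cast<double>(nbValidDepthSamplesInTag) / nbPixelsInTag;
  double meanWeight = mEstimatorWeights.sum() / mEstimatorWeights.size();
  return fillRatio * meanWeight; // close to 0 when depth is missing or the plane fit is poor
}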

\return true if pose estimation succeed, false otherwise.
*/
bool vpPose::computePlanarObjectPoseFromRGBD(const vpImage<float> &depthMap, const std::vector<vpImagePoint> &corners,
const vpCameraParameters &colorIntrinsics, const std::vector<vpPoint> &point3d, vpHomogeneousMatrix &cMo, double &confidence)
@s-trinh (Contributor, Author) commented:

Are the estimated poses with this method reliable in your tests?

Generally, I don't like having to pass too many parameters. If confidence is necessary to evaluate whether the estimated pose is valid, keep it; otherwise I would suggest using something like double *good_parameter_name = NULL.

@fspindle (Contributor) replied:

It is very difficult to make the z-axis flipping phenomenon appear using a real tag with the RealSense D435. I succeeded by crumpling the tag and observing a very small tag in the image.

As mentioned in a previous discussion, this phenomenon also occurs when displaying this image on my screen and detecting the upper-right tag with the RealSense.

In both cases, the results obtained with this method were reliable.

OK for passing a pointer.

@s-trinh (Contributor, Author) commented Jan 24, 2020

I think that:

  • vpDetectorAprilTag::getTagsId() should have been introduced much earlier
  • and computePose() should have handled different tag sizes

Anyway, for getPoint3D(), a proposal would be:

std::vector<std::vector<vpPoint>> getPoints3D(const std::vector<int>& tagsId, const std::map<int, double>& tagSize);

The size of the output of getPoints3D() should match the size of tagsId.

With something like:

std::map<int, double> tagsSize;
tagsSize[-1] = 0.05; // default tag size
tagsSize[3] = 0.1;   // custom size for the tag with id=3
// if neither -1 nor the tag id is present in the map, use a default tag size of 1

//[...]
detector.detect(I);
std::vector<int> tagsId = detector.getTagsId(); // returns the ids of all the previously detected tags
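
To make the proposal concrete, the snippet could then continue as follows (illustrative only: getPoints3D() is just a proposal at this point, and depthMap and cam are reused from the earlier example):

std::vector<std::vector<vpImagePoint> > corners = detector.getPolygon();
std::vector<std::vector<vpPoint> > points3d = detector.getPoints3D(tagsId, tagsSize); // proposed API
for (size_t i = 0; i < tagsId.size(); i++) {
  vpHomogeneousMatrix cMo;
  double confidence;
  // points3d[i] holds the four 3D corners of tag tagsId[i], using its size from the map
  // (falling back to the default size stored under key -1)
  if (vpPose::computePlanarObjectPoseFromRGBD(depthMap, corners[i], cam, points3d[i], cMo, confidence)) {
    // use cMo, e.g. display the tag frame as in the earlier example
  }
}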

@@ -251,6 +251,7 @@ class VISP_EXPORT vpDetectorAprilTag : public vpDetectorBase
std::vector<vpHomogeneousMatrix> &cMo_vec, std::vector<vpHomogeneousMatrix> *cMo_vec2=NULL,
std::vector<double> *projErrors=NULL, std::vector<double> *projErrors2=NULL);

std::vector<std::vector<vpPoint> > getPoint3D();
@s-trinh (Contributor, Author) commented:

The current getPoint3D() will not work if getPose() is used.
Also, it requires actually computing the tag poses, while the 3D tag coordinates can be computed easily.

I suggest adding a getPoint3D() that is independent of the pose computation machinery in vpDetectorAprilTag.
This would allow using getTagsCorners() introduced in #680 to compute the pose with OpenCV or any other library.

@fspindle (Contributor) replied:

You are right. I will make the changes.

@fspindle fspindle merged commit 3cde0ba into lagadic:master Jan 29, 2020