
Using software-device alignment with rs-camera and non-realsense camera #10363

Closed
ilyak93 opened this issue Apr 1, 2022 · 4 comments

@ilyak93 commented Apr 1, 2022


Required Info
Camera Model: D465
Operating System & Version: Windows 10
Platform: PC
SDK Version: 2.34.0-0~realsense0.2250
Language: C++

Main goal: align a RealSense camera with a non-RealSense thermal camera (thermal images injected as RS2_FORMAT_Z16).
The alignment code was checked with RealSense RGB-depth images and works fine.
I did the calibration of the cameras with MATLAB:

left_images = string(ls("G:\Vista_project\cccalib2\rs_ther_rs_best\"));
left_images = left_images(3:end);
left_images = arrayfun(@(s) append("G:\Vista_project\cccalib2\rs_ther_rs_best\", s), left_images);

right_images = string(ls("G:\Vista_project\cccalib2\rs_ther_thermal_best\"));
right_images = right_images(3:end);
right_images = arrayfun(@(s) append("G:\Vista_project\cccalib2\rs_ther_thermal_best\", s), right_images);


[imagePoints,boardSize] = detectCheckerboardPoints(left_images, right_images);

squareSize = 20;
worldPoints = generateCheckerboardPoints(boardSize,squareSize);

imageSize = [720, 1280];
params = estimateCameraParameters(imagePoints,worldPoints,'ImageSize',imageSize);
showReprojectionErrors(params);
figure;
showExtrinsics(params);

and I got a pretty good re-projection error of 0.51, and the extrinsics visualization suggests that the relative location of the cameras is correct:

image

The RealSense is a little bit above the thermal camera, as it is in the physical setup, so the visualization does seem to be reliable.

Also the reprojection itself (red line) looks good:
image

image

(The green markers are the detected points; they were detected on the RGB image, and on the thermal image they are shown at the same locations as in the RGB, so ignore them there, but the reprojections (red) look close to perfect.)

The resolutions of the cameras are different, as the realsense is 720 x 1280 and the thermal camera is 512 x 640, so I tried two approaches:

  1. Pad the 512 x 640 to 720 x 1280 with zeroes, so the 512 x 640 starts in the left upper corner.
  2. Upsample the 512 x 640 to 720 x 1280 using a simple resize from OpenCV, cv2.resize(tc1_np, size, interpolation=cv2.INTER_AREA).

Running the alignment code with either option did not give correct results (a sketch of both pre-processing options is below).
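
To be concrete, here is roughly what the two pre-processing options look like (an OpenCV C++ sketch with placeholder names; my actual pre-processing was done in Python):

#include <opencv2/opencv.hpp>

// Minimal sketch of the two pre-processing options, assuming the thermal
// frame is a 512 x 640 CV_16UC1 image in `thermal_raw`.
cv::Mat pad_or_resize(const cv::Mat& thermal_raw, bool pad)
{
    const cv::Size target(1280, 720); // RealSense color resolution (W x H)
    cv::Mat out;
    if (pad) {
        // Approach 1: keep the original pixels in the upper-left corner and
        // fill the rest with zeros (the thermal intrinsics stay unchanged).
        out = cv::Mat::zeros(target, thermal_raw.type());
        thermal_raw.copyTo(out(cv::Rect(0, 0, thermal_raw.cols, thermal_raw.rows)));
    } else {
        // Approach 2: upsample to the color resolution. Note that scaling the
        // image also scales fx, ppx by the horizontal factor and fy, ppy by the
        // vertical factor (which differ here), so intrinsics calibrated on the
        // original 512 x 640 frames would need the same scaling.
        cv::resize(thermal_raw, out, target, 0, 0, cv::INTER_LINEAR);
    }
    return out;
}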

I also wondered whether to use the intrinsics MATLAB calculated for the RealSense or the original ones from the stream (rs2 API); I tried both and didn't see a significant difference (although I think the MATLAB intrinsics were slightly better).

I did hope that the second approach would work, since RealSense seems to use something similar. I don't know the inner implementation, but I do know that before the alignment both images are 720 x 1280, as in this approach. What I don't know is how the calibration parameters are calculated; the most logical option is that they are calculated for both images while both are 720 x 1280, and then register_extrinsics_to from the thermal to the RealSense stream should do the job.

So my questions I guess are:

  1. How should I deal with the different resolutions? (I think I will also try lowering the RealSense resolution, so that the upsampling/downsampling will be less harmful to the calibration because the initial dimensions will be closer.) Should the calibration be done between images of the same resolution, i.e. does RealSense do it that way (it would make sense, because then the extrinsics are the transform from one stream to the other)?
  2. Can it serve as an indicator of a good calibration that the MATLAB-calculated intrinsics are the same as, or close to, the stream intrinsics?
    To make sure I'm using the intrinsics and extrinsics correctly, here is their structure from MATLAB:
    image

Camera1 is the RealSense and Camera2 is the thermal camera; here are their intrinsics:

image
image

As far as I understand, rotation and translation are the parameters for register_extrinsics_to.
Then the structure of the intrinsics matrix:
image
is the transpose of this one (my guess, correct me if I'm wrong):
image
where f_x of the matrix is the fx parameter for creating an rs2_intrinsics for the sensor, likewise f_y, and c_x is ppx and c_y is ppy.

The coefficients, as much as I can tell from the docs, are for Brown-Conrady distortion:
image

I deduced this from the MATLAB documentation and the naming convention (correct me if I'm wrong):
image
so k1, k2, k3 are the radial distortion coefficients and p1, p2 are the tangential coefficients.

(I think it is correct:
image
)
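
To make the mapping I'm assuming explicit, here is a small sketch of how I read the MATLAB values into an rs2_intrinsics (the helper name and arguments are mine, not an SDK API):

#include <librealsense2/rs.hpp>

// Minimal sketch, assuming MATLAB's IntrinsicMatrix is the transpose of the
// usual K and the distortion order is k1, k2 (radial), p1, p2 (tangential),
// k3 (radial).
rs2_intrinsics intrinsics_from_matlab(int width, int height,
                                      float fx, float fy,  // K'(1,1), K'(2,2)
                                      float cx, float cy,  // K'(3,1), K'(3,2)
                                      float k1, float k2, float p1, float p2, float k3)
{
    rs2_intrinsics intr{};
    intr.width = width;
    intr.height = height;
    intr.fx = fx;
    intr.fy = fy;
    intr.ppx = cx;   // MATLAB pixel coordinates are 1-based; cx - 1 / cy - 1
    intr.ppy = cy;   // may be needed for the SDK's 0-based convention
    intr.model = RS2_DISTORTION_BROWN_CONRADY;
    intr.coeffs[0] = k1;  // radial
    intr.coeffs[1] = k2;  // radial
    intr.coeffs[2] = p1;  // tangential
    intr.coeffs[3] = p2;  // tangential
    intr.coeffs[4] = k3;  // radial
    return intr;
}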

P.S.:
I tried using the transposed rotation matrix; I think the result was closer than with the matrix as-is, but I can't tell whether this indicates something or is just a random effect.
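
For reference, here is a sketch of how I understand the two conventions involved (the helper is mine, not an SDK function), based on librealsense's documentation of rs2_extrinsics and MATLAB's row-vector convention; note in particular that the SDK expects the translation in meters, while MATLAB reports it in the checkerboard units (millimeters here, squareSize = 20):

#include <cstring>
#include <librealsense2/rs.hpp>

// Illustration only. librealsense stores rotation[9] as a COLUMN-major 3x3
// matrix and translation[3] in METERS, mapping point_to = R * point_from + t
// (column vectors). MATLAB's stereo calibration uses the row-vector
// convention x2 = x1 * R + t, so the SDK rotation should be R_matlab
// transposed. Because a transpose and a row-major -> column-major
// reinterpretation cancel out, copying MATLAB's R row by row into rotation[]
// already yields the transposed matrix; the easy-to-miss part is the
// translation, which needs scaling from millimeters to meters.
rs2_extrinsics extrinsics_from_matlab(const float R_rowmajor[9], const float t_mm[3])
{
    rs2_extrinsics e{};
    std::memcpy(e.rotation, R_rowmajor, sizeof(e.rotation)); // row-major copy == R^T, column-major
    for (int i = 0; i < 3; ++i)
        e.translation[i] = t_mm[i] * 0.001f;                 // mm -> m
    return e;
}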

The whole alignment code:

//
// Created by Ily on 2/24/2022.
//

#include <iostream>
#include <opencv2/opencv.hpp>
#include <librealsense2/rs.hpp>
#include <librealsense2/hpp/rs_internal.hpp>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>
#include "dirent-1.23.2/include/dirent.h"
#include "shlwapi.h"
#pragma comment(lib, "Shlwapi.lib")

int W = 1280;
int H = 720;

using namespace std;

wstring widen(string text)
{
    locale loc("");
    vector<wchar_t> buffer(text.size());
    use_facet< std::ctype<wchar_t> > (loc).widen(text.data(), text.data() + text.size(), &buffer[0]);
    return wstring(&buffer[0], buffer.size());
}

bool compare(const string& first, const string& second)
{
    wstring lhs = widen(first);
    wstring rhs = widen(second);
    bool ordered = StrCmpLogicalW(lhs.c_str(), rhs.c_str()) < 0;
    return ordered;
}




int main() {

    std::ifstream infile("G:/Vista_project/cccalib2/intrinsics_extrinsics_rs_thermal_upsample.txt");

    std::string line;
    vector<string> thermal;
    vector<string> realsense;
    vector<string> thermal_to_rs;
    while (getline(infile, line)) {
        vector<string> splitted;
        char *token = strtok(&line[0], " "); // tokenize in place; &line[0] is writable, unlike c_str()
        while (token != NULL) {
            splitted.push_back(string(token));
            token = strtok(NULL, " ");
        }
        if(splitted[0].compare(string("thermal")) == 0){
            thermal = splitted;
        } else if(splitted[0].compare(string("realsense")) == 0) {
            realsense = splitted;
        } else {
            thermal_to_rs = splitted;
        }
    }

    rs2::software_device dev;

    auto rs_sensor = dev.add_sensor("rs");
    auto thermal_sensor = dev.add_sensor("thermal");

    rs2_intrinsics rs_intrinsics{W, H, stof(realsense[3]), stof(realsense[4]),
                                    stof(realsense[5]), stof(realsense[6]),
                                 RS2_DISTORTION_BROWN_CONRADY,
                                    {stof(realsense[7]), stof(realsense[8]),
                                     stof(realsense[9]), stof(realsense[10]),
                                     stof(realsense[11])}};
    rs2_intrinsics thermal_intrinsics{W, H, stof(thermal[3]), stof(thermal[4]),
                                    stof(thermal[5]), stof(thermal[6]),
                                    RS2_DISTORTION_BROWN_CONRADY,
                                    {stof(thermal[7]), stof(thermal[8]),
                                     stof(thermal[9]), stof(thermal[10]),
                                     stof(thermal[11])}};

    auto thermal_stream = thermal_sensor.add_video_stream(
            {RS2_STREAM_DEPTH, 0, 0, W, H, 30, 2, RS2_FORMAT_Z16,
             thermal_intrinsics});
    auto rs_stream = rs_sensor.add_video_stream(
            {RS2_STREAM_COLOR, 0, 1, W, H, 30, 3, RS2_FORMAT_RGB8,
             rs_intrinsics});

    thermal_sensor.add_read_only_option(RS2_OPTION_DEPTH_UNITS, 0.001f);
    thermal_sensor.add_read_only_option(RS2_OPTION_STEREO_BASELINE, 0.001f);
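    // Note: DEPTH_UNITS = 0.001f above means each unit of the injected Z16
    // thermal frame is treated as 1 mm when the align block de-projects it.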
    thermal_stream.register_extrinsics_to(rs_stream,
                                        {
        {
            float(stod(thermal_to_rs[1])),float(stod(thermal_to_rs[2])),
            float(stod(thermal_to_rs[3])),float(stod(thermal_to_rs[4])),
            float(stod(thermal_to_rs[5])),float(stod(thermal_to_rs[6])),
            float(stod(thermal_to_rs[7])),float(stod(thermal_to_rs[8])),
            float(stod(thermal_to_rs[9]))
            },
            {
            float(stod(thermal_to_rs[11])), float(stod(thermal_to_rs[12])),
            float(stod(thermal_to_rs[13])) // fixed: the third translation component is at index 13
            }
                                        });

    dev.create_matcher(RS2_MATCHER_DEFAULT);
    rs2::syncer sync;

    thermal_sensor.open(thermal_stream);
    rs_sensor.open(rs_stream);

    thermal_sensor.start(sync);
    rs_sensor.start(sync);

    rs2::align align(RS2_STREAM_COLOR);

    rs2::frameset fs;
    rs2::frame thermal_frame;
    rs2::frame rs_frame;

    cv::Mat thermal_image;
    cv::Mat rs_image;

    int idx = 0;

    //inject the images

    struct dirent *ent_rs;
    struct dirent *ent_thermal;
    DIR *rs_dir = opendir ("G:/Vista_project/cccalib2/rs_ther_rs_best/");
    DIR *thermal_dir  = opendir ("G:/Vista_project/cccalib2/rs_ther_thermal_best/");
    assert(rs_dir != NULL);
    assert(thermal_dir != NULL);

    list<string> thermal_images_pathes;
    list<string> realsense_images_pathes;
    while((ent_rs = readdir (rs_dir)) != NULL) {
        if ((ent_thermal = readdir(thermal_dir)) == NULL) continue;
        if (string(ent_rs->d_name).compare("..") == 0 ||
            string(ent_rs->d_name).compare(".") == 0)
            continue;
        string rs_image_path = string("G:/Vista_project/cccalib2/rs_ther_rs_best/") + string(ent_rs->d_name);
        string thermal_image_path = string("G:/Vista_project/cccalib2/rs_ther_thermal_best/") + string(ent_thermal->d_name);
        thermal_images_pathes.push_back(thermal_image_path);
        realsense_images_pathes.push_back(rs_image_path);
    }
    thermal_images_pathes.sort(compare);
    realsense_images_pathes.sort(compare);
    int images_num = thermal_images_pathes.size();
    for(int k = 0; k < images_num; ++k){
        string rs_image_path =  realsense_images_pathes.front();
        realsense_images_pathes.pop_front();
        string thermal_image_path = thermal_images_pathes.front();
        thermal_images_pathes.pop_front();
        rs_image = cv::imread(rs_image_path.c_str(), cv::IMREAD_COLOR);
        thermal_image = cv::imread(thermal_image_path.c_str(),cv::IMREAD_UNCHANGED);
        rs_sensor.on_video_frame({(void*) rs_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     3 * 1280, 3, // Stride and Bytes-per-pixel
                                     double(idx * 30), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     rs_stream});
        thermal_sensor.on_video_frame({(void*) thermal_image.data, // Frame pixels from capture API
                                     [](void*) {}, // Custom deleter (if required)
                                     2 * 1280, 2, // Stride and Bytes-per-pixel
                                     double(idx * 30), RS2_TIMESTAMP_DOMAIN_SYSTEM_TIME,
                                     idx, // Timestamp, Frame# for potential sync services
                                     thermal_stream});

        fs = sync.wait_for_frames();

        if (fs.size() == 2) {
            fs = align.process(fs);
            rs2::frame thermal_frame = fs.get_depth_frame();
            rs2::frame rs_frame = fs.get_color_frame();
            cv::Mat aligned_image(720, 1280, CV_16UC1, (void*) (thermal_frame.get_data()), 2 * 1280);
            cv::imwrite("G:/Vista_project/cccalib2/aligned_rs_t/" + to_string(k) + ".png", aligned_image);
        }
        idx++;
    }
    closedir (thermal_dir);
    closedir (rs_dir);
    return 0;
}

Here is the content of the calibration file (with the MATLAB intrinsics for the RealSense, not the stream ones from the API):

thermal 1280 720 392.8641263349174 242.8521537982638 1491.178661268357 1456.508589131852 0.977712227012730 -4.29498270257113 0 0 0 Brown Conrady
realsense 1280 720 721.396343936838 360.474365632246 1018.69038782411 1005.02815567598 -0.0212647 -0.0259618 0.000399448 -0.000477234 0.00108937 Inverse Brown Conrady
rotation_no_transpose 0.999522419764990 0.012358882235717 -0.028322966246906 -0.013900658340889 0.998395010561186 -0.0549014989250613 0.0275989870254250 0.0552689869313010 0.998089993436843 translation 62.2584703635828 -92.8114403720439 -2.38894165251570
rotation 0.999522419764990 -0.013900658340889 0.0275989870254250 0.012358882235717 0.998395010561186 0.0552689869313010 -0.028322966246906 -0.0549014989250613 0.998089993436843 translation 62.2584703635828 -92.8114403720439 -2.38894165251570

The original color image (I've cut my face out of the color image, so there is another upper part):
image

The results (second approach) with the rotation as-is from the MATLAB calibration:
image

With transposed rotation:
image

The second is better: when I overlay one image on the other, the objects are closer to one another, but it still isn't a good alignment result.

First approach results:
Color:
image

Thermal aligned with transposed rotation matrix:
image

Thermal aligned with as is rotation matrix:
image

Those are at least overlapping (pretty close).

In this approach I also tried rs2::align align(RS2_STREAM_DEPTH); instead of rs2::align align(RS2_STREAM_COLOR); and saved the aligned color images, which gives:

image

image

The only thing that comes to mind is to try a better calibration (more images, more positions).
I would be glad if you could revisit what I've done, correct me wherever I'm wrong, or add any other comments.

Does it look to you like the approach works and it's just a matter of more effort on the calibration (though the re-projection looked good and the error was small)?

@MartyG-RealSense (Collaborator) commented Apr 2, 2022

Hi @ilyak93 In answer to your questions:


I would recommend using the last approach in your case that uses rs2::align so that the RealSense SDK's align processing block is used for alignment. The processing block automatically adjusts for differences in resolution and FOV size.

In the case of your change from rs2::align align(RS2_STREAM_COLOR); to rs2::align align(RS2_STREAM_DEPTH); it looks as though you are aligning color to depth instead of the much more common depth to color, a method demonstrated by the SDK's C++ rs-measure example program for measuring the real-world distance between two points on the image.

https://github.com/IntelRealSense/librealsense/blob/master/examples/measure/rs-measure.cpp#L133

When color to depth alignment is performed, if the color FOV size is smaller than the depth FOV size then the color image will be stretched out to match the depth FOV size. This can be useful if you want the aligned image to fill the whole screen instead of having a border around it, as typically happens when aligning depth to a color stream with a smaller FOV.
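
As a rough illustration (not a specific SDK example), the usual depth-to-color flow on a live camera looks something like this, with error handling omitted; the software-device case follows the same pattern once the streams and extrinsics are registered:

#include <librealsense2/rs.hpp>

int main()
{
    rs2::pipeline pipe;
    rs2::config cfg;
    cfg.enable_stream(RS2_STREAM_DEPTH, 1280, 720, RS2_FORMAT_Z16, 30);
    cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_RGB8, 30);
    pipe.start(cfg);

    rs2::align align_to_color(RS2_STREAM_COLOR); // align depth onto the color stream

    for (int i = 0; i < 30; ++i)
    {
        rs2::frameset frames = pipe.wait_for_frames();
        frames = align_to_color.process(frames);           // returns an aligned frameset
        rs2::depth_frame depth = frames.get_depth_frame(); // now in color resolution
        rs2::video_frame color = frames.get_color_frame();
        // ... use depth.get_distance(x, y) against color pixel (x, y) ...
    }
    return 0;
}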

The link below has links to scripting references for alignment in the MATLAB wrapper.

https://support.intelrealsense.com/hc/en-us/community/posts/4415502202771/comments/4415513801235


I would say a cautious 'yes' about whether it is a good sign that the intrinsics calculated by MATLAB closely match the stream intrinsics. I do not know enough about MATLAB calibration to give an absolute answer to this question though.


As the information at the top of this case states that you are using SDK 2.34.0, I would recommend considering a newer SDK version if possible. 2.34.0 had a problem with continuously generating timing errors for some RealSense users.

@ilyak93 (Author) commented Apr 2, 2022

I would recommend using the last approach in your case that uses rs2::align so that the RealSense SDK's align processing block is used for alignment. The processing block automatically adjusts for differences in resolution and FOV size.

Can you elaborate on that a little more?
I thought this adjustment is done according to the extrinsics and intrinsics obtained from the calibration, so what do you mean by "automatically adjusts"?

For now I've noticed that, surprisingly, the first approach gave a much better alignment. By adjusting ppx and ppy a little I managed to move the thermal image onto the RealSense image even more, and the alignment seems almost perfect, excluding the distortion of straight lines, small objects, and objects around the borders of the image. I think the calibration may not be exact enough, even though it looks good by the MATLAB tests and criteria; that's why the ppx and ppy of the thermal image needed small hand corrections, as I said.
I think maybe I can play with the distortion coefficients to achieve the same result.

I also tried undistorting in MATLAB, where I did the calibration, so it is the same environment and the calibration output is passed directly to the MATLAB undistort function, i.e.:

[J,~] = undistortImage(imread(right_images(i)), params.CameraParameters2);

But the result wasn't good; I think that however good the calibration is, it isn't ideal, otherwise I don't know how else I could change the result. So maybe tuning it by hand and looking at the results on a few dozen images is the best I can do, although it isn't a quantitative approach (I can't really measure it except by eye).
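
For comparison, the same undistortion could be reproduced outside MATLAB with OpenCV (a sketch with placeholder parameter names, assuming the k1, k2, p1, p2, k3 coefficient order):

#include <opencv2/opencv.hpp>

// Rough OpenCV equivalent of the MATLAB undistortImage call, assuming the
// calibrated values are available as plain numbers (placeholders below).
cv::Mat undistort_thermal(const cv::Mat& src,
                          double fx, double fy, double cx, double cy,
                          double k1, double k2, double p1, double p2, double k3)
{
    cv::Mat K = (cv::Mat_<double>(3, 3) << fx, 0, cx,
                                           0, fy, cy,
                                           0,  0,  1);
    cv::Mat dist = (cv::Mat_<double>(1, 5) << k1, k2, p1, p2, k3);
    cv::Mat dst;
    cv::undistort(src, dst, K, dist);
    return dst;
}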

I think that is also a factor in favour of the first approach (apart from what you said about automatic adjustment): stretching the 512 x 640 image by up-sampling to 720 x 1280 introduces too much noise into the calibration, which is why in the images you can see that the aligned thermal looks like it shrank too much. I could probably adjust the intrinsics manually (maybe the focal length) to control these dimensions, but it seems that either way the bottleneck is the calibration (or the way I use the parameters, though I've described the process here and it looks fine to me). So the most profitable approach is the one that makes the fewest changes to the images (no stretching or other pre-processing), i.e. the first approach. That's how it seems to me, at least.

Following this observation, maybe a good thing to try is the one I mentioned: lowering the resolution of the RealSense to one more similar to the thermal camera's.

By the way, the same-resolution constraint mainly comes from using MATLAB for the calibration. Originally the objects in the thermal images are just a little bigger than the same objects in the RealSense images, which is why I suspect the first approach gives me better results: the calibration is easier, so the parameters are better, and there is no shrinking effect, just an inaccurate alignment that I fix by hand (and it does look good for the big objects in the images).

An example:
ezgif-5-1503c0c3a1

You can notice in the left corner that the small device is not exactly aligned, but the big objects are aligned; that's already progress.

The original of that thermal image looks like this:

image

So you can see that it shrank the image, and then I adjusted the ppx and ppy a little to fit, as you see above. That's my best shot so far.

@MartyG-RealSense (Collaborator) commented Apr 2, 2022

If alignment is performed in your code using rs2::align then the SDK's Align Processing Block mechanism makes automatic adjustments for differences in resolution and FOV size.

If you do not use rs2::align and create your own alignment method then the Align Processing Block is not used and you will have to create your own mechanism for compensating for differences, like your various methods described above.

So using rs2::align and having the Align Processing Block automatically handle adjustments between streams is much easier than implementing adjustments yourself. But it is certainly not compulsory to use rs2::align. What matters in the end is what works best for your project and produces the best results, and if your own align method achieves that then it is totally fine to use it.

@MartyG-RealSense (Collaborator) commented

Hi @ilyak93 Do you require further assistance with this case, please? Thanks!
