Major performance problem with align_depth option #1929
Trying to isolate the issue: do you see the same performance problem when running rs-align?
Do you have a link for how to run rs-align? I installed the camera & the wrapper from source. I'm using librealsense 2.45.0 and realsense-ros 2.3.0. I configured librealsense with …
Once you have built with -DBUILD_EXAMPLES=true, rs-align should be found here: /build/examples/align/rs-align
I checked out … P.S. Sorry it took me so long to reply -- I forgot to enable notifications. I'll be able to respond more quickly going forward.
Hey @doronhi -- what would you suggest as a next debugging step?
The wrapper uses librealsense's align filter to align the depth image. From then onwards, nothing differs from the way other frames are handled. Assuming you have just this one copy of librealsense2, which rs-align also uses with much better performance, I can't think of a source for the difference. Did you specify higher resolutions than the defaults rs-align uses?

Another thought I had: rs-align uses a rs2::pipeline object to synchronize the depth and color images, while realsense2_camera uses the class "rs2::asynchronous_syncer" directly. An issue with the syncer or the timestamps could cause frames to arrive separately most of the time, so aligning would not happen most of the time -- hence the reduced FPS with no extra CPU usage. You can test this hypothesis by changing ROS_DEBUG to ROS_INFO here and here and watching the log. Do you receive some "Single video frame arrived" messages, and if so, at what rate? Or just "Frameset arrived." messages, as expected?
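The syncer hypothesis above can be illustrated with a toy timestamp matcher (plain Python; this is not librealsense's actual syncer, just a sketch of the idea): depth and color frames pair up only when their timestamps fall within a tolerance, and frames that miss their partner pass through alone, so alignment never runs on them.

```python
from collections import deque

def match_frames(depth_ts, color_ts, tol_ms=5.0):
    """Pair depth/color timestamps within tol_ms; unmatched ones pass alone.

    Toy stand-in for a frame syncer: shows how timestamp drift turns
    framesets into single frames, which would skip the align filter.
    """
    depth, color = deque(depth_ts), deque(color_ts)
    framesets, singles = [], []
    while depth and color:
        d, c = depth[0], color[0]
        if abs(d - c) <= tol_ms:
            framesets.append((depth.popleft(), color.popleft()))
        elif d < c:
            singles.append(('depth', depth.popleft()))
        else:
            singles.append(('color', color.popleft()))
    singles += [('depth', t) for t in depth] + [('color', t) for t in color]
    return framesets, singles

# Well-synchronized streams: every frame pairs up, alignment runs at full rate.
fs, s = match_frames([0, 33, 66], [1, 34, 67])
print(len(fs), len(s))  # 3 0

# Drifted color timestamps: nothing pairs, only "single frame" arrivals.
fs, s = match_frames([0, 33, 66], [16, 49, 82])
print(len(fs), len(s))  # 0 6
```

If the log shows mostly single-frame arrivals, the problem is synchronization rather than the alignment computation itself.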
I will investigate this and get back to you with the result as soon as possible. Also, if it's helpful, I can share my Raspberry Pi image with all of the software built & installed, so that you can see the exact launches and behavior I'm seeing.
Here is the launch file I'm using:
It's using the default resolution from … I made the logging changes; I only see …
I did some digging into the launch files (since the issue obviously lies somewhere in the ROS bridge code), and I determined this: the color & aligned_depth topics are publishing 1280x720 images, while the depth topic is publishing 848x480 images. I'm not sure if this is useful information, but I want to provide as much detail as I can :)
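That resolution detail matters for cost: the aligned depth image is produced at the color stream's resolution, so the per-pixel alignment work scales with the color frame, not the depth frame. A quick sanity check on the numbers reported above:

```python
# aligned_depth_to_color is produced at the color resolution, so the align
# filter writes roughly 2.3x as many output pixels as the raw depth image has.
depth_px = 848 * 480    # pixels in the depth stream
color_px = 1280 * 720   # pixels in the color stream (and the aligned output)
print(round(color_px / depth_px, 2))  # 2.26
```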
Having seen your launch file, could you try something simpler?
Ok, when I run …
While doing this, I did some very unscientific and inaccurate profiling with … If this hypothesis makes sense, perhaps I could: write some code to do the alignment on another thread; configure the camera to output lower-resolution, correctly-aligned frames without software processing; write some code to leverage the Pi's GPU; or check whether the ARM compiler options are failing to generate SIMD instructions that would speed up the current implementation. The processor is a Cortex-A72 @ 1.5GHz. Do you think any of these theories are worth following up? If you point me to the functions to look into, I'd be happy to take a shot at implementing some of my suggestions.
@dgrnbrg Could you please try a lower resolution, and turn off the IMU data if you don't need it?
@RealSenseSupport Which of the resolution-related parameters should I use, and what resolution would you suggest? I do need the IMU data, as the application is a mobile robotic platform.
Regarding operating the filters (align, colorizer, etc.) on a different thread: it sounds like a good idea. It should be configurable, though, as some users may not wish to allow the node to use all the available CPUs. Regarding the compiler options and GPU usage, most of the hard work is done in the librealsense2 library, not in the realsense2_camera node, which is essentially a wrapper.
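The "filters on a different thread" idea discussed here is a classic producer/consumer offload. A minimal Python sketch of the pattern (the real node is C++, and `align_fn` here is a placeholder for the actual align filter): the capture loop enqueues framesets and keeps streaming, while a worker thread does the expensive processing.

```python
import queue
import threading

def start_align_worker(align_fn, maxsize=4):
    """Run an 'alignment' function on its own thread.

    Generic sketch of offloading a filter: the capture loop enqueues
    framesets without blocking on alignment; a worker drains the input
    queue, applies align_fn (a stand-in for the real filter), and
    publishes results on an output queue. None is a shutdown sentinel.
    """
    in_q, out_q = queue.Queue(maxsize), queue.Queue()

    def worker():
        while True:
            frameset = in_q.get()
            if frameset is None:   # sentinel: shut down the worker
                break
            out_q.put(align_fn(frameset))

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return in_q, out_q, t

# Usage with a dummy "align" that just tags the frame.
in_q, out_q, t = start_align_worker(lambda f: ('aligned', f))
for frame in range(3):
    in_q.put(frame)    # the capture loop never waits for alignment
in_q.put(None)
t.join()
print([out_q.get() for _ in range(3)])  # [('aligned', 0), ('aligned', 1), ('aligned', 2)]
```

The bounded input queue is the configurability knob mentioned above: a small `maxsize` caps how much CPU and memory the background filtering can consume before the capture loop has to drop or block.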
I made significant progress in trying to get GPU usage running, and I would appreciate your help in figuring out the next debugging step. Here's what I've done so far: first, I rebuilt librealsense2 with …
I validated that … Next, I convinced … At this point, I was seeing an unexpected crash at startup in the nodelet manager, so I added …
Hi @dgrnbrg, you mentioned that rs-gl now runs. I think the next step should be modifying the rs-align example to use rs::gl::align. That way you could both compare performance (which is important for deciding whether to pursue this direction) and have a more contained environment for debugging rs::gl::align. I always find that debugging librealsense2 by running the realsense2_camera node complicates things.
I understand I’m in uncharted waters :). I’m trying to thoroughly document my work so that it’s possible to retrace my steps. It’s been years since I’ve written C++, and I’m still not familiar with librealsense’s design. I’m not sure if it’s realistic, but do you have any ideas, based on that backtrace, about what could be causing that null pointer? I’d love to leverage your RealSense expertise and bring my own knowledge of hacking things together (in this case, the GLSL approach). Ultimately, all I really care about is getting 10 FPS from the camera on the RPi4, so that I can use it on this robot platform to do mapping by streaming the feed over wifi to a faster computer nearby. If you think the threading approach might be easier to implement than debugging the GLSL approach, I’m open to whichever gets that goal accomplished. And then I’m happy to share all my work and setup so that other customers can have this working out of the box.
In this situation, it seems like some “stream” is null. Are there docs on the programming model’s design? What’s a stream, and what could make it null without a failure during program initialization?
Hey @doronhi -- do you think I should stay on this issue, or open a new one on librealsense to understand why the OpenGL align filter segfaults?
@dgrnbrg Sorry for the late reply. I would suggest opening a new ticket on the librealsense project for the OpenGL align filter segfaults. Thanks!
@dgrnbrg Any other questions about this ticket? Looking forward to your update. Thanks!
@dgrnbrg Any other questions for this ticket? Please note that it will be closed if we don't hear from you for another 7 days. Thanks!
Hello, I am having a similar issue with my D435 and Jetson NX (640x480 resolution; the problem reproduces when calling "roslaunch realsense_camera rs_camera.launch"). It does not reproduce on my Linux PC. From what I have tried here, the only conclusion I came to was to write a script based on "rs-align" to align the images outside of the camera's main stream... I don't know if it will work for my application. It is kind of sad, because "align_depth:=true" is such a simple solution. Let me know if you were able to find any other solutions!
I bought a LattePanda Alpha and was able to confirm that it can handily process at 15+ FPS with plenty of resources to spare. @paulacvalle, if you know ARM assembly, you could try porting the x86 SIMD align-depth code to ARM.
@dgrnbrg Thank you for your suggestion! I was able to fix my problem by making sure I installed everything from source with CUDA enabled... which I thought I had done before, but I was wrong. Anyhow, it's nice to know that the LattePanda Alpha can also handle this processing!
@dgrnbrg Glad to know the issue is resolved. Can we close this accordingly? Thanks!
@dgrnbrg Any other questions about this issue? Looking forward to your reply. Thanks!
@dgrnbrg Issue resolved. Closing the ticket accordingly. Thanks!
Hello, I am using a realsense d435i on a raspberry pi 4; the intended use case is streaming back to a nearby computer to run mapping and object detection.
I am seeing very, very poor performance with the align_depth option. Initially, I thought I had network issues, but I've reproduced the poor performance solely on the raspberry pi 4 (so I've eliminated the network as a cause).

When I look at rostopic hz /camera/depth/image_rect_raw on the rpi4, I see ~15 images per second. When I look at rostopic hz /camera/color/image_raw on the rpi4, I also see ~15 images per second. However, when I look at rostopic hz /camera/aligned_depth_to_color/image_raw on the rpi4, I only see ~1.2 images per second. Clearly, the aligned topic has much lower performance. Looking at the system metrics, I have 7.3 GB of memory free and I'm using about 250% of the 400% total CPU available.

What can I do to get aligned images so that I can use RTAB-Map, but without the major performance loss? Is the aligning something that happens on the host CPU and not on the realsense hardware? If so, are there build flags I should use to ensure I get reasonable performance [1]?
Thank you for your help.
[1] This seems strange, because I do not see the CPUs even close to full utilization.
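For readers wondering what align_depth computes: per depth pixel, the filter deprojects to a 3D point using the depth intrinsics, transforms into the color camera's frame via the extrinsics, and reprojects with the color intrinsics. A simplified numpy sketch of that math (pinhole model, no distortion or occlusion handling; the intrinsics below are made up for illustration, not real camera values):

```python
import numpy as np

def align_depth_to_color(depth, K_d, K_c, R, t, color_shape):
    """Per-pixel depth->color alignment (pinhole model, no distortion).

    Deproject with depth intrinsics K_d, transform by extrinsics (R, t),
    reproject with color intrinsics K_c. This per-pixel work is what the
    align filter burns host CPU on; the real filter also handles lens
    distortion and occlusion, which this sketch omits.
    """
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    pix = np.stack([us.ravel() * z, vs.ravel() * z, z])  # homogeneous pixels * depth
    pts = np.linalg.inv(K_d) @ pix                       # 3D points in depth frame
    pts = R @ pts + t[:, None]                           # transform into color frame
    proj = K_c @ pts                                     # reproject into color image
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    out = np.zeros(color_shape, dtype=depth.dtype)
    ok = (proj[2] > 0) & (u >= 0) & (u < color_shape[1]) & (v >= 0) & (v < color_shape[0])
    out[v[ok], u[ok]] = z[ok]
    return out

# Sanity check: identical intrinsics and identity extrinsics make alignment a no-op.
K = np.array([[100.0, 0, 32], [0, 100.0, 24], [0, 0, 1]])
depth = np.full((48, 64), 2.0)
aligned = align_depth_to_color(depth, K, K, np.eye(3), np.zeros(3), (48, 64))
print(np.allclose(aligned, depth))  # True
```

Since the output is filled at the color resolution, doing this at 1280x720 on an in-order ARM core with no SIMD is plausibly the bottleneck described in this thread.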