
How to downsample pointcloud and its performance (decimation filter) #1964

Closed
Combinacijus opened this issue Jul 1, 2021 · 5 comments

@Combinacijus

Combinacijus commented Jul 1, 2021

Running roslaunch realsense2_camera rs_camera.launch takes 25% CPU (of a single core) on a Jetson AGX Xavier.
But with the pointcloud enabled, roslaunch realsense2_camera rs_camera.launch filters:=pointcloud, CPU usage goes up to 112%.
I'm not sure if this CPU usage is OK or too high, but I assume that down-sampling the pointcloud data before publishing would save CPU resources.
Is it possible to downsample the pointcloud with the ROS wrapper?

Combinacijus changed the title from "Is it possible to downsample pointcloudRealsense D435" to "Is it possible to downsample pointcloud Realsense D435" on Jul 1, 2021
@MartyG-RealSense

MartyG-RealSense commented Jul 2, 2021

Hi @Combinacijus. As you are using a Jetson, which has an Nvidia GPU, it may be worth first adding CUDA graphics acceleration support to your RealSense configuration if you do not have CUDA enabled already. Doronhi, the RealSense ROS wrapper developer, comments on the use of CUDA in RealSense ROS in the link below.

#1177 (comment)

My understanding of that discussion is that Doronhi advises that the RealSense ROS Debian packages do not include CUDA support. CUDA support can instead be built into librealsense using -DBUILD_WITH_CUDA=true, as mentioned in the instructions on Intel's Jetson installation page.

https://github.com/IntelRealSense/librealsense/blob/master/doc/installation_jetson.md
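
As a rough sketch only (the path and surrounding flags here are assumptions; -DBUILD_WITH_CUDA=true is the flag from the instructions above), a CUDA-enabled build could look like:

# Minimal sketch of a CUDA-enabled librealsense build (assumed paths/other flags)
cd librealsense && mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_WITH_CUDA=true
make -j$(nproc) && sudo make install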

You could also investigate using a decimation filter to reduce depth scene complexity.

#1924 (comment)
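
For reference, in the ROS wrapper the decimation filter can be enabled alongside the pointcloud filter through the filters argument; a minimal sketch:

# Enable the decimation filter together with the pointcloud filter
roslaunch realsense2_camera rs_camera.launch filters:=pointcloud,decimation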

If you plan to use align_depth with your pointcloud in combination with a decimation filter, Doronhi provides advice about this in the link below.

#1924 (comment)

@mszuyx

mszuyx commented Jul 4, 2021

@Combinacijus Hi, I was originally going to get a Jetson AGX Xavier to host my RealSense cameras (2~4). But since I noticed that the most critical resource for hosting multiple cameras is CPU capacity, and by default the cameras tend to load a single CPU thread rather than distribute the computation across all threads, I decided to go with an i7 UP Xtreme instead (beefy CPU per thread, similar price to the AGX, great compatibility with RealSense products... but sadly no CUDA) (https://up-shop.org/up-xtreme-series.html).

At the time, the -DBUILD_WITH_CUDA=true option hadn't come up yet (or I just wasn't aware of it). I am very interested in staying in touch and knowing how this build option affects the performance / CPU usage of the AGX Xavier.

Btw, for me, setting the decimation filter to a magnitude of 4 cut the CPU usage in half. (I checked it via the system monitor on Ubuntu. It is not a very precise way to measure the exact CPU usage, but it is enough to know that the decimation filter is making a significant difference. If you are not trying to create a highly detailed map using the camera, this is a good way to do it.)
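
In case it helps, a sketch of how the same magnitude could be set at runtime via dynamic_reconfigure (the /camera/decimation node name is an assumption based on the default camera namespace):

# Set the decimation filter magnitude to 4 at runtime
# (assumes the default "camera" namespace and that the decimation filter is enabled)
rosrun dynamic_reconfigure dynparam set /camera/decimation filter_magnitude 4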

Thanks!

@Combinacijus

Combinacijus commented Jul 5, 2021

Thank you so much @MartyG-RealSense! It's really nice to have support from people like you who know the issues inside and out. I appreciate it.

@mszuyx I did some tests. Do you remember what CPU usage you were getting, and do you think my results are reasonable?

So I made this lengthy comment with steps on how to reproduce it and the results at the end. I'm sure it will be handy for me in the future and hopefully for others. Also note that the CPU usage in the original question was measured in a different power mode, so don't compare it with the numbers in this comment.

Setup for librealsense performance testing:

Jetson setup

sudo nvpmodel -m 0  # MAXN mode (max performance)
sudo nvpmodel -d cool  # Set fan mode | quiet or cool 
sudo nvpmodel -q # Check current power mode

Check realsense packages and versions

ls -l /usr/bin | grep rs-  # Check rs-* binary files from apt installation
ls -l /usr/local/bin | grep rs-  # Check rs-* binary files from source installation
dpkg -l | grep realsense  # Check which realsense packages and versions are installed

Uninstall librealsense. #950 (comment)

cd ~/Downloads/librealsense/build
sudo make uninstall && make clean  # Uninstall built from source
sudo apt remove ros-$ROS_DISTRO-librealsense2 # Uninstall from apt
dpkg -l | grep "realsense" | cut -d " " -f 3 | xargs sudo dpkg --purge # Uninstall from apt

Download librealsense (tag v2.48.0, currently the latest version) and general setup

# Do this once
cd ~/Downloads
git clone --depth 1 --branch v2.48.0 https://github.com/IntelRealSense/librealsense.git
cd ~/Downloads/librealsense/
./scripts/setup_udev_rules.sh  

Build setup. Build flags

librealsense

cd ~/Downloads/librealsense/
mkdir -p build && cd build
cmake .. -DBUILD_EXAMPLES=false -DCMAKE_BUILD_TYPE=release -DFORCE_RSUSB_BACKEND=true -DBUILD_WITH_CUDA=false && make -j$(($(nproc)-1)) && sudo make install

I built realsense-ros from source, so my ROS setup is:

# cd to ros workspace
# catkin clean  # Clean build needed only if librealsense version have changed
catkin config --cmake-args -DCATKIN_ENABLE_TESTING=False -DCMAKE_BUILD_TYPE=Release
catkin build
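
Then source the workspace so the freshly built wrapper is used:

# Still in the ros workspace root
source devel/setup.bash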

Test performance

Launching rs_camera with pointcloud filter

roslaunch realsense2_camera rs_camera.launch filters:=pointcloud

Measure CPU usage with rqt_top (my fork, which adds total CPU usage) using the regex real to filter for the RealSense nodes (2 nodes: camera and camera_manager)

rosrun rqt_top rqt_top
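
If rqt_top is not available, a rough alternative sketch is to point top at the RealSense processes directly (matching on the process command line is an assumption):

# Show CPU usage of processes whose command line mentions realsense2_camera
top -p "$(pgrep -d, -f realsense2_camera)"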

Rebuild with different flags... Repeat


Jetson AGX Xavier MAXN power mode performance test results

CPU usage is an eyeballed average. 100% means that 1 CPU core is fully used. The Jetson Xavier in MAXN power mode has 8 cores available, which means 800% of available CPU.

librealsense is built with different methods; realsense-ros is unchanged.

Test command: roslaunch realsense2_camera rs_camera.launch filters:=pointcloud

Results:

| Flags | CPU % |
| --- | --- |
| apt | 75 |
| RSUSB 0 \| CUDA 1 | 75 |
| RSUSB 1 \| CUDA 1 | 82 |
| RSUSB 0 \| CUDA 0 | 110 |
| RSUSB 1 \| CUDA 0 | 113 |
| apt \| apt | 106 |

Installed from apt || CPU: ~75%

sudo apt-get install librealsense2-utils librealsense2-dev

RSUSB 0 | CUDA 1 || CPU: ~75%

cmake .. -DBUILD_EXAMPLES=false -DCMAKE_BUILD_TYPE=release -DFORCE_RSUSB_BACKEND=false -DBUILD_WITH_CUDA=true && make -j$(($(nproc)-1)) && sudo make install

RSUSB 1 | CUDA 1 || CPU: ~82%

cmake .. -DBUILD_EXAMPLES=false -DCMAKE_BUILD_TYPE=release -DFORCE_RSUSB_BACKEND=true -DBUILD_WITH_CUDA=true && make -j$(($(nproc)-1)) && sudo make install

RSUSB 0 | CUDA 0 || CPU: ~110%

cmake .. -DBUILD_EXAMPLES=false -DCMAKE_BUILD_TYPE=release -DFORCE_RSUSB_BACKEND=false -DBUILD_WITH_CUDA=false && make -j$(($(nproc)-1)) && sudo make install

RSUSB 1 | CUDA 0 || CPU: ~113%

cmake .. -DBUILD_EXAMPLES=false -DCMAKE_BUILD_TYPE=release -DFORCE_RSUSB_BACKEND=true -DBUILD_WITH_CUDA=false && make -j$(($(nproc)-1)) && sudo make install

librealsense installed from apt AND realsense-ros installed from apt (unlike the tests above) || CPU: ~106%

sudo apt-get install librealsense2-utils librealsense2-dev
sudo apt-get install ros-$ROS_DISTRO-realsense2-camera

realsense-ros from apt has poor performance, probably because old versions were installed:

dpkg -l | grep realsense  # Check which realsense packages and versions are installed 
ii  ros-melodic-librealsense2                    2.45.0-1bionic.20210507.014007                   arm64        Library for capturing data from the Intel(R) RealSense(TM) SR300, D400 Depth cameras and T2xx Tracking devices.
ii  ros-melodic-realsense2-camera                2.3.0-1bionic.20210507.053718                    arm64        RealSense Camera package allowing access to Intel T265 Tracking module and SR300 and D400 3D cameras
[ INFO] [1625507077.406013408]: RealSense ROS v2.3.0
[ INFO] [1625507077.406126528]: Built with LibRealSense v2.45.0
[ INFO] [1625507077.406207744]: Running with LibRealSense v2.45.0

After deleting realsense-ros and rebuilding it from source, performance is back to normal (~75%)

sudo apt remove ros-$ROS_DISTRO-librealsense2
# Rebuild from source
# Back to normal
[ INFO] [1625507483.785335360]: RealSense ROS v2.3.1
[ INFO] [1625507483.785429408]: Built with LibRealSense v2.48.0
[ INFO] [1625507483.785518592]: Running with LibRealSense v2.48.0

Reduced output of rs_camera.launch (frame sizes, frame rate etc.)

process[camera/realsense2_camera_manager-2]: started with pid [16085]
[ INFO] [1625508732.262271552]: Initializing nodelet with 8 worker threads.
process[camera/realsense2_camera-3]: started with pid [16091]
[ INFO] [1625508732.971868064]: RealSense ROS v2.3.1
[ INFO] [1625508732.971984960]: Built with LibRealSense v2.48.0
[ INFO] [1625508732.972047168]: Running with LibRealSense v2.48.0
[ INFO] [1625508733.054951232]: Device with name Intel RealSense D435 was found.
[ INFO] [1625508733.532452224]: JSON file is not provided
[ INFO] [1625508733.533346720]: Device FW version: 05.12.14.50
[ INFO] [1625508733.533486848]: Enable PointCloud: On
[ INFO] [1625508733.533649280]: Align Depth: Off
[ INFO] [1625508733.533711232]: Sync Mode: On
[ INFO] [1625508733.543449344]: Stereo Module was found.
[ INFO] [1625508733.555027840]: RGB Camera was found.
[ INFO] [1625508733.555170816]: (Confidence, 0) sensor isn't supported by current device! -- Skipping...
[ INFO] [1625508733.555253632]: Add Filter: pointcloud
[ INFO] [1625508733.556600896]: num_filters: 1
[ INFO] [1625508733.863399520]: depth stream is enabled - width: 848, height: 480, fps: 30, Format: Z16
[ INFO] [1625508733.865829632]: color stream is enabled - width: 640, height: 480, fps: 30, Format: RGB8
[ INFO] [1625508733.878255808]: Expected frequency for depth = 30.00000
[ INFO] [1625508734.068834048]: Expected frequency for color = 30.00000
[ INFO] [1625508734.228907168]: insert Depth to Stereo Module
[ INFO] [1625508734.229159168]: insert Color to RGB Camera
[ INFO] [1625508734.294910400]: SELECTED BASE:Depth, 0
[ INFO] [1625508734.370369536]: RealSense Node Is Up!

Decimation filter performance test

Continuing with the best performer from the last test (librealsense from apt, realsense-ros from source), but adding a decimation filter. (For some reason, even without the decimation filter, the new launch file performs a bit better, although I copy-pasted it.)

rs_camera.launch file changes for decimation filter

  ...
  <arg name="filters"                      default="pointcloud,decimation"/>
  <arg name="decimation_filter_magnitude"  default="2"/>
  ...

  <group ns="$(arg camera)">
    <param name="decimation/filter_magnitude" value="$(arg decimation_filter_magnitude)"/>

    <include file="$(find realsense2_camera)/launch/includes/nodelet.launch.xml">
      ...
    </include>
  </group>
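
With the changes above, the magnitude can be overridden per run, for example:

# filters default now includes decimation (see launch file changes above)
roslaunch realsense2_camera rs_camera.launch decimation_filter_magnitude:=3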

Decimation filter results

| Decimation magnitude | CPU % |
| --- | --- |
| OFF | 72 |
| 1 | 75 |
| 2 | 43 |
| 3 | 35 |
| 4 | 33 |
| 5 | 30 |
| 6 | 28 |
| 7 | 28 |
| 8 | 28 |

Conclusion

Best performing option is to:

  • Install librealsense from apt or build with -DFORCE_RSUSB_BACKEND=false -DBUILD_WITH_CUDA=true flags
  • Build realsense-ros from source to match librealsense version

A decimation filter with a magnitude of 2 or 3 significantly improves performance but reduces resolution (roughly by the magnitude in each dimension, e.g. 848x480 depth becomes about 424x240 at magnitude 2), which might be OK depending on the use case.

Other things to try would be reducing the resolution and FPS or disabling the RGB camera.

From the tests it seems that the apt build of librealsense is compiled with CUDA and with the native V4L backend (i.e. RSUSB=false). Also, CUDA improves performance only by ~31% compared to no CUDA, so it is not as drastic as suggested.

For now I'll stick with the apt version of librealsense and realsense-ros built from source. The downside of this is that librealsense might update automatically and I'll need to match the versions manually.
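
One way to reduce the risk of apt silently upgrading librealsense is to hold the packages (a sketch; package names assumed from the apt install step above):

# Prevent automatic upgrades of the librealsense apt packages
sudo apt-mark hold librealsense2 librealsense2-utils librealsense2-dev
# Allow upgrades again later
sudo apt-mark unhold librealsense2 librealsense2-utils librealsense2-dev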

Notes

When running the rs_camera.launch file, the build versions will be printed like:

[ INFO] [1625504425.540778784]: Built with LibRealSense v2.48.0
[ INFO] [1625504425.540846656]: Running with LibRealSense v2.48.0

So make sure they match. I had some trouble cleanly rebuilding/reinstalling librealsense and realsense-ros because the apt and manually built versions got mixed up. It's best to delete both (or should I say all 4: the apt and built-from-source versions of each) and install/build again.

Questions

  1. I'm not sure if I have missed something, but can someone verify that this is reasonable performance (~75% single-core usage) to expect from a Jetson Xavier using a RealSense D435 in ROS with the pointcloud filter and the depth stream at width: 848, height: 480, fps: 30?
  2. The idea is to connect multiple RealSense cameras, so any other ideas for performance improvements would be appreciated, although this might already be enough.
  3. Should I rename this issue to something like "How to downsample pointcloud and Jetson Xavier performance" (or similar, suggestions?) so that the answers better reflect the question?

Thanks again!

@MartyG-RealSense

Thanks so much @Combinacijus for your deeply detailed report that is sure to be of benefit to future readers in the RealSense ROS community!

In regard to your questions:

  1. I do not have much that I would add to my earlier comments about CUDA. I note in your original comment that you experienced especially high CPU usage when generating a pointcloud. Pointclouds, alignment and color conversion are the types of operation that CUDA support in librealsense can accelerate by offloading CPU work to the Nvidia GPU on the Jetson boards. So it is certainly worth having CUDA support enabled in the SDK if you are performing those three types of process.

  2. Jetson boards are well suited to handling multiple cameras on the same board. A mains-powered USB 3 hub is recommended for such applications. A model that Intel tested successfully in their 400 Series multiple-camera white paper was the AmazonBasics powered USB 3 hub. I have one of these myself based on that recommendation and it has been problem-free.

https://dev.intelrealsense.com/docs/multiple-depth-cameras-configuration#section-2-multi-camera-considerations

  3. If you believe that your advice is especially relevant to Jetson then there is certainly no harm in changing the issue title to reflect that.

Combinacijus changed the title from "Is it possible to downsample pointcloud Realsense D435" to "How to downsample pointcloud and its performance (decimation filter)" on Jul 7, 2021
@mszuyx

mszuyx commented Jul 9, 2021

@Combinacijus Yes, I think that CPU usage makes sense. Actually, I am seeing similar numbers on the UP Xtreme with a D455 (~106% with 1280x720 RGB and an 848x480 pointcloud).

I am also working on a multi-camera setup with the D455. You are welcome to DM me at [email protected] to exchange experiences.

For my application, I am using multiple cameras to gain a bigger FOV for robot navigation. Since I don't care about the texture of the obstacles in this context, I also managed to save a bit more CPU by streaming a textureless pointcloud. #1924
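
If it is useful, a sketch of how the textureless pointcloud can be requested (the pointcloud_texture_stream parameter name is from memory; please verify against #1924):

# Pointcloud without RGB texture (assumed parameter name, see #1924)
roslaunch realsense2_camera rs_camera.launch filters:=pointcloud pointcloud_texture_stream:=RS2_STREAM_ANY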

BTW, I appreciate your decimation magnitude vs CPU table. It tells us that the most cost-effective setting for the decimation filter is 2~4.
