Poor performance with multiple subscribers to point cloud topic from Gazebo #215

Closed
Michael-Equi opened this issue Aug 4, 2020 · 4 comments

Comments

Michael-Equi commented Aug 4, 2020

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • Binaries
  • Version or commit hash:
    • 0.7.3
  • DDS implementation:
    • Cyclone DDS

Steps to reproduce issue

Run a simulated depth sensor in Gazebo and create two subscribers to the same point cloud topic, either through the CLI or RViz (the QoS settings do not matter).
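
For concreteness, here is a minimal rclpy sketch of the two-subscriber setup (two `ros2 topic echo` processes on the topic work just as well). The topic and node names are placeholders, not anything from my setup; substitute whatever your Gazebo depth plugin actually publishes:

```python
# Minimal two-subscriber repro sketch. The topic name below is a
# placeholder; use the point cloud topic your Gazebo plugin publishes.
import rclpy
from rclpy.executors import SingleThreadedExecutor
from rclpy.node import Node
from sensor_msgs.msg import PointCloud2


class CloudSub(Node):
    def __init__(self, name: str, topic: str):
        super().__init__(name)
        # QoS history depth of 10; the slowdown shows up regardless of QoS.
        self.create_subscription(PointCloud2, topic, self.on_cloud, 10)

    def on_cloud(self, msg: PointCloud2) -> None:
        self.get_logger().info(
            f'{self.get_name()}: {msg.width * msg.height} points')


def main() -> None:
    rclpy.init()
    topic = '/camera/points'  # placeholder topic name
    # One subscriber is fine; the second one triggers the slowdown.
    subs = [CloudSub('cloud_sub_1', topic), CloudSub('cloud_sub_2', topic)]
    executor = SingleThreadedExecutor()
    for sub in subs:
        executor.add_node(sub)
    executor.spin()


if __name__ == '__main__':
    main()
```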

Expected behavior

The real-time factor of the simulation should remain around the 0.98–0.99 mark where it normally sits on my computer.

Actual behavior

The Gazebo real-time factor drops to 0.04, rendering the simulation unusable.

eboasson (Collaborator) commented Aug 4, 2020

This sounds suspiciously like #207, so I hope you don’t mind my asking before diving into it: is this with the released version of Cyclone or is it with current master? The fix that did the trick for #207 is in 0.7.0rc1, but I haven’t tagged it as 0.7.0 yet (that’ll be later this week). After it has been officially released it can start making its way into the Foxy release as well.

(In case you wonder why it has been sitting for so long without being released, the reason is twofold. Firstly, that fix came after the merge of DDS Security, which changed a lot of internal details that I don’t think should be released without proper care. Secondly, the process for releasing Eclipse projects just got a lot easier, and I forgot to take that into account in the planning ...)

Michael-Equi (Author) commented Aug 4, 2020

Interesting. I switched to the release version after testing on a fresh machine, thinking the patch that solved #207 was already in. Maybe I inadvertently changed something between the tests that slowed things down, but it never got as slow as this, which makes me think it might be a different issue. Also, in #207 I didn't have point clouds running at all and things were still slowing down; now everything is perfectly fine until there are two subscribers on the simulation's point cloud topic.

eboasson (Collaborator) commented Aug 4, 2020

This particular bug has at times caused delays of up to 100 s on 10 MB point clouds. eclipse-cyclonedds/cyclonedds#555 has a bit of info; if you look at the table you'll see a 45 s one! That’s certainly enough to cause such a crazy real-time factor.

What happens is that the DDSI protocol requires that writers ping their readers with information on the sequence numbers that are available for retransmit until all readers have acknowledged the data (a perfectly ordinary thing to do). It also says that a reader is only allowed to request a retransmit in response to such a “heartbeat” message (not quite as standard).

At some point, the stack that Cyclone is derived from started dropping the rate at which it sends these “heartbeats” if the data are still not acknowledged after some time. If you have really large messages and (for whatever reason) have lost a lot of packets and need a lot of round-trips to get everything to the reader, then dropping the rate becomes a real issue. And the larger the samples, the worse it gets.
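
To make the effect concrete, here's a back-of-envelope model. All the numbers in it are illustrative assumptions, not Cyclone's actual fragment size, loss pattern, or timers (the real measurements are in eclipse-cyclonedds/cyclonedds#555). Since a reader may only request retransmits in response to a heartbeat, each recovery round costs at least one heartbeat interval, so a backed-off interval multiplies straight into the stall time:

```python
# Illustrative model only: fragment size, loss rate, and heartbeat
# intervals below are assumptions, not Cyclone DDS's real parameters.

SAMPLE_BYTES = 10 * 1024 * 1024   # one 10 MB point cloud sample
FRAGMENT_BYTES = 60 * 1024        # assumed wire-level fragment size
LOSS_RATE = 0.5                   # assumed heavy per-pass packet loss

def stall_time(heartbeat_interval_s: float) -> float:
    """Time until every fragment has arrived, under the DDSI rule that
    a reader may only request retransmits after hearing a heartbeat:
    each recovery round therefore costs one heartbeat interval."""
    outstanding = SAMPLE_BYTES / FRAGMENT_BYTES
    rounds = 0
    while outstanding >= 1.0:
        outstanding *= LOSS_RATE  # the lost fraction needs another round
        rounds += 1
    return rounds * heartbeat_interval_s

print(f"fast heartbeats (0.1s):     ~{stall_time(0.1):.1f}s per sample")
print(f"backed-off heartbeats (5s): ~{stall_time(5.0):.1f}s per sample")
# -> roughly 0.8s vs 40s: the same loss pattern, but the reduced
#    heartbeat rate turns recovery into a stall of tens of seconds.
```

Under these made-up numbers the backed-off case stalls for about 40 s per sample, the same ballpark as the 45 s entry in the table mentioned above.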

There are some other things that are included in the set of fixes I did, but those have much smaller impact.

Michael-Equi (Author) commented

OK, it sounds like that is the issue then (I didn't realize the patch hadn't been released when I filed this). I'll go ahead and close this, and if the problem persists after the release I'll reopen it. Thank you for explaining the patch; it makes it clearer why this issue exists.
