-
Notifications
You must be signed in to change notification settings - Fork 464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash: Removing MarkerArrays not threadsafe? #1469
Comments
I recorded a bag file but cannot reproduce in stand-alone rviz so far: |
I have another similar crash in RobotStateDisplay this time. I guess this is due to our code having an AsyncSpinner running ... @rhaschke do you have an idea for a general solution or should I go through all displays and use signal-slots to fix threading and move all rendering related stuff to the gui thread? Would be some PRs ... |
rviz doesn't use a spinner, but calls
|
We embed a Now understanding the concept of message queues and NodeHandles better I think I found the bug in RobotStateDisplay at least: |
I think, rviz assumes to a large extent that ROS callbacks are executed synchronously with the main GUI thread. There are only a very few exceptions to this rule, namely displays derived from PointCloudCommon. This indeed, provides a separate callback queue and a spinner processing this queue. I think, you should call ros::spinOnce() from your main GUI thread frequently... Regarding RobotStateDisplay: As the new NodeHandle doesn't have its own callback queue, it is just using the global one as well, which is processed from the GUI thread. I think, changing to |
Line 270 in 6bf5975
at least according to the comment here it should have the desired effect of being run synchronous and indeed I do get different crashes now ... |
Could you please verify that the two handles use different (or the same) queues? |
Never trust a comment/doc-string ... rviz/src/rviz/visualization_manager.cpp Line 289 in 6bf5975
So this should probably be changed to actually use a separate queue that really is processed synchronously? I'll open an rviz PR I also experimented with locking the RobotState in RobotStateDisplay, this would even allow us to always handle RobotStateMessages asynchronously (by using threaded_nh_ and a std::mutex/std::lock_guard) which fixed my crashes so far. Will open a MoveIt PR as well ... |
I think, the global callback queue is assumed to be processed by the main gui thread, while the threaded_nh_ uses a separate queue that needs to be processed by a separate spinner. |
It could be triggered in Update/the rendering event (by a QTimer) and would then always be tied to the gui thread. I'll try that later and open a PR |
I opened a PR #1475 |
I displayed a large (~3000) set of axes and when removing them from the display again (via the corresponding marker message) rviz crashed. There seems to be a race condition or similar problem between Thread 18 removing a material and Thread 1 displaying it
Please see the gist for a summarized backtrace
https://gist.github.com/simonschmeisser/12b1f4d6e0bcbd61516ec8646628c585
Your environment
The text was updated successfully, but these errors were encountered: