-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SmacPlannerHybrid creates kinematically infeasible path #4271
Comments
Questions:
I've not seen this ever in my testing and deployment of Smac so I don't have an immediate thesis of what's going on beyond asking probing follow up questions hoping one will yield some insightful result.
What is this? Are you perhaps using a custom inflation layer? Not that I think this is the issue, but what if you use the params from the Global Costmap from nav2_bringup + Smac configuration guide params? Perhaps there's some subtle problem that we see with your setup that could help us narrow into it |
One question, assuming this is a real bug, is there some kind of test that would detect this? |
Is the inflation radius you set smaller than the robot's actual minimum radius? I see in the color scheme now that there is no inflation outside of the critically inscribed space. I'm not even sure in the greyscale visualization why that showed a gradient - it looks to be all 253's. Try increasing your inflation radius / decreasing your cost factor to create a potential field. If a 2+m radius (idk exact, just "enough" to test) does the job, I have a thesis on why but I would have expected that you'd see some logging at the warning/error level at least once about a problem from Smac. Irregardless, you should have a smoother potential field, it makes Smac behavior better and actually runs faster. I don't understand your image though, considering the footprint is in the middle completely disconnected from the path + why the path is orbiting so many times.
Agreed, or at least to the extent that you can test it to check if its an issue there or not to know if its something in the stack itself or something going on on your managed fork. This should be a Planning Server specific issue, so you should be able to use the simple commander API + a launch/params file to bundle up a reproduction example for me to bring up only the planner server and execute the request the triggers this for working the problem.
Almost certainly, but I think we need some better understanding of what's actually happening first. |
So, I spend a lot of time on trying to make it work on Rolling but I'm still not there yet. So I tried on Humble and I think I could reproduce, unfortunately without the planned_footprints as these are not on Humble (and I couldn't backport easily). We can see that a lethal part of the global costmap is in the footprint. The screen recording is like 8GB so here is a screenshot instead. I made the global footprint a bit bigger than the local one. These are the changes from the default nav2 params: |
Can you try what I asked? You still have the inflation way too small - there is no potential field established - and I want to know if that is causing the issue |
Oops I forgot that part. I'lll try it tomorrow, but assuming it works, we can't just call it a day right? I would think the planner should not generate kinematically infeasible path under any circumstances (throw an error instead). Or am I wrong? |
If that's the case, I know the ~3 places I need to look and think about and I can come back with a better answer about "why" and "what" and what a solution would look like, if one is needed. If nothing else, some kind of warning or error or rejection of optimizations should occur in that situation but maybe I missed something. However we don't even know if that'll help, so one step at a time 😄 But no, that's not the end of it, it just gives me somewhere to start! Right now, if that isn't the cause, I don't have many other answers :laugh: |
sorry for the confusion, this was actually a different configuration Anyway I think I have the results of what you asked for. robot_radius is 0.3 and inflation_radius is 0.55 but it can still happen. With the default values I couldn't reproduce easily but I don't think it's impossible. P.S: The localization of the TB3 demo is very jumpy, it's hard to tell for sure if the plan is unfeasible or if it just "jumped" into the lethal obstacle. I'm trying to get more convincing examples
To answer this, the circles are not a path orbiting, the red circular line is the global footprint I set and the blue circles are the debug /planned_footprints. (the red rectangle is the local footprint, not important here I guess) In this use case the robot radius is much bigger than the inflation layer, which you disadvised, but I was just testing my assumption that "SmacPlannerHybrid should never plan a kinematically unfeasible path". |
What's the black? Something seems a little odd. Note that when you use robot radius, it uses center point-checks, not the SE2 footprint. Please use a footprint so that we can have something analog to what you're seeing :-) They're checking different things if circular or non-circular (i.e. center point costs vs full footprint line segment iteration)
Note that we clear the robot's current cell, since it has to be clear if the robot is cough there. So what you showed isn't odd assuming the very next point in the path isn't in collision. That's to prevent deadlocks so that if you're in collision or just up close to something you can never get out of it. What I'd want to see is collisions mid-path! |
Not sure 😕 but it's gone with my next configuration
Alright let's take this example I cooked up It's a bit harder to cause a collision mid-path so I created this example which is a (potential) collision end-of-path - I think it points the the same thing right?. I positioned the robot in front of an obstacle and sent it a goal forward towards the obstacle. Result, the path is created but the controller, in this case TEB complains that the path is not feasible |
hmm I'm assuming you're talking about the cylindrical pillar is part of the static_layer whereas the rectangular one on the left is not (only part of the obstacle layer). You can see that only the obstacle layer is cleared by the footprint. Maybe it would make sense to also have it for the static layer? Should I create an issue for it? |
Lets simplify this, show this happening with your previous environment setup you originally posted with the new configuration - don't try to make it work on the TB3 example / some other situation. Lets focus on the situation you presented. You had showed it happening for many intermediate poses that was unquestionably problematic, show me that again with the new configs so we know if it helped or not. I don't want us to get bogged down in some first/last point details right now - we should be able to see this clear as day as you showed it before with setting A and gone with setting B if we found the problem. I think some of what you're using as a proxy (the robot's current localized location) is not a good proxy for planning, so lets remove that estimated proxy and look at the exact things as much as possible I just grabbed the Smac Hybrid params from the docs and ran it against the same 0.3m robot radius + 0.8 inflation radius on the default Nav2 configs. All seems good to me: (not in the left photo that it looks like its colliding; its not, the marker array line thinkness is just set wide. Zooming in its fine.) When I set the inflation radius to be less than the actual robot radius, it clearly occurs Please show me that you see that too 😉 |
So thinking about this logically: if you have a 20m wide robot (for example) but set the inflation radius to 0.1m, then yeah, that seems like a problem because the costmap doesn't even know to look for a collision because its all just So, option:
But at its root, your configuration had set the inflation radius less than your robot's actual radius - so that's a fix for the problem, but not a fix to make sure its not possible. Point2 makes it no longer possible so that's my recommendation and should be only a few line change to check the 2 radii and take the maximum of the two. I can show that when they're equal, things work again fine (though gets very close to collision, as you'd expect without a potential field incentivizing against it; but still valid) |
Ah I see, I thought collision checking works differently: I thought for every potential path, it checks if any of the planned_footprints is in collision (like polygon to polygon intersection), and rejects the path if so. Do I understand correctly that the way it actually works is by checking only the cost at the projected poses and reasons based on that? (Anywhere I could read more about it apart from code?) If that's the case. I like your fix suggestions. We will come back to this first issue but I think, even with the inflation radius set correctly, SMAC can still plan for infeasible paths, when the starting pose of the robot is already close to an obstacle (which is what the original issue is about). Here are some examples: Here is the global planner/costmap config for these screenshots:
|
First, these are really pretty pictures! The footprint projections are really cool. But anyway - What is the green? Curious if you know this wasn't an issue before some update or this is always been an issue. If it was a regression, fast forwarding changes until it happens could be a good way to see where it was introduced if new This is quite hard for me to debug from afar. What I'm wondering is if there's an issue with collision checking in the analytic expansions. These paths are so trivial that its likely that these are totally formed up of only the analytic expansions without any search. Can you do this again with some more complex request so that search is strictly required at some point and its not trivial to a point that it could be just analytic expansions? That would help answer the following: Do you consistently see this at the start or end of the paths or both? If it only happens at the end (i.e. when analytic expansions are used), that narrows things down. If it happens at the start (uniquely or combined) that also tells us something. I cannot replicate this on my side so far, so I feel like this is still a configuration derived issue, but I don't discount a real bug either. However there are a bunch of companies with large non-circular robots using this and it hasn't been reported by them unless its a newly introduced bug.
This is still suspicious to me, can we not do that for testing purposes or can you show that it happens with a normal global map without this? Again, collecting data. If we find that's the case, we can dig into why
I'm reading this that your robot is 1.01 from the pivot point but your inflation radius is only 1.0. What if you make the inflation radius 2.0 (just sufficiently larger than 1.0)?
This isn't a param for Hybrid-A*, this is making me suspicious that there is something really odd going on with your params. What if you use the params from https://navigation.ros.org/configuration/packages/smac/configuring-smac-hybrid.html#example exactly? Note: When I say "start" and "end" I don't want to even look at the very first or very last points, just generally collisions are being recorded near the start/end. Lets not miss the forest for the trees. |
Green is TEB's path/control sequence
I can't pinpoint a moment where that "started" happening, I think it was always there
Since the bug mostly occurs when being already close to an obstacle (being in circumscribed area), I would say that it occurs mostly at the start or end of a path i.e. first/last 5 poses. It does not happen when going from free point A to free point B while avoiding an obstacle midway.
That's odd. Do you have all the information needed to try and reproduce it? Important factors:
Will do. I'll test again with as much "standard" config as possible.
will do
that was an oversight, the default config was just copied as is and changed what is needed. Removed Since you can't reproduce easily, do you have ideas what debug logs I could add inside the code that could give some insight? |
Let's start with more pictures. First some successes: the failures: Default config used:
as a workaround, is there a config that specifies how far away the |
Hi, Got it. Huh. Honestly not sure what to say about that off hand. I still can't seem to make that happen. I believe you that there is a problem if that's what you're seeing, but I'm at a loss if I can't reproduce it. Can you make a repository with the costmap, planner configs, launch file, world, and a script to call the server to go from some example A->B that you're using here so that I can try 1:1 what you're doing on my side and use that as a test case to look into? If I can poke around, that would be ideal. Otherwise, I'd ask you to look into the collision checker object and log information that you think might help point to a particular issue. I know, for instance, the last point could potentially be slightly in collision due to the exact goal pose added at the end of the path to have precise solution even though the plan is made in a binned space. But any intermediary point should never be possible. A few of the failures (images 1,2,5) I look at and see that it appears to be the start/end point which is something we should deal with as a separate matter. Image 4 also has that but I see the 2nd to last pose could (?) be in collision. Its hard to tell given the thickness of the line. Image 3 looks less ambiguous but still not overly clear unlike some of your original images |
Yeah I get that it can be hard to debug if you can't reproduce. Instead of creating a repo from scratch, how about I use what I can from TB3 demo, change to a rectangular footprint + config used above + script and send you my branch? |
See the note above I edited, you responded faster than I thought 😉 I would like something standalone with the planner server / demo script so we can look at this in isolation so we're not messing with the full simulator with localization updates and whatnot that can make the waters a little more murky. Usually what you describe would be fine for me (even preferable so I don't have to change my usual workflows), but this looks to be subtle now unlike before, which was a clear and obvious problem, so I don't want to leave anything to interpretation |
I've been delaying creating the reproducible example because it seems like a lot of work. Instead I tried to debug it more myself and I think I have a good lead: by commenting out this check and consequently making the collision checker run all the time, I could not reproduce the issue anymore. So I'm guessing footprint_cost_ or possible_collision_cost_ is wrong. I'm finding it hard "proving" that one or the other is wrong but that's where I'm at right now, if you have any leads, shoot |
Not shocked by that I suppose, but unclear. For the second image, there are clearly points in free space, so those I don't have an immediate answer. Some costmap resource not being locked correctly in your fork (or main)? I forget if you're showing this based on your managed fork or Nav2 mainline
What if this is set to 1.0? Curious. |
Not off hand, but the rviz color scheme seems clear that it is not actually 0. Hence suspecting threading / locking issues getting corrupted data perhaps |
Looks like it's the case, I plugged in the coordinates in a custom service that get the cost and it doesn't return 0. I'm really going down the rabbit hole with this bug... |
It didn't take long, @SteveMacenski try it with: |
Correct. Because this is no longer finding the exact analytic solution, its finding another solution within your specified tolerance parameter. Its not just removing the last point (which was the second option I set in my write up above). |
@SteveMacenski any estimate when you'll be able to take a look at this? |
The Jazzy release stuff has put this as a secondary problem for me until thats all squared away for a release. I hoped to get to this over the weekend but that didn't pan out. I expect that this is a much simpler problem though if you wanted to take a look. You shouldn't have to go too deep since we convert the start to a grid-space early on and then collision check it. I expect we have some clearing under start that we say is "OK" that could be made into an option |
@tonynajjar this is easily explained.
We clear the start since the robot is there, it is thus "clear" because we are physically there already. It must be clear (though perhaps in contact with something) if the robot is in this location. When I add in while removing
It triggers that there is a collision and refuses to plan. This is the behavior we have in Humble & Iron but not main. I think this is a sensible behavior to have, don't you? Else you could end up in a forever loop of not being able to move when in collision with something in the static map that moved and thus is no longer actually there (or if in a tight space cannot get out). I suppose we could make this an option about if you want to clear it or check it, that would be straight forward. I however don't see much reason you'd want to be so strict about it unless you were generating starting point paths where the robot physically was not correctly located for testing (or I suppose precomputation). Let me know what you think. Its an easy resolution either way (closing this as finally completed; and/or adding in the option) |
That's my bad, I didn't mean for the starting pose to be in collision. The problem I'm trying to reproduce is when the collision is at some pose at the beginning (but not the first pose). I was able to reproduce and colored the first pose in red so you can be sure that it's not the first pose that's in collision: I updated the branch |
Any chance this could be looked at this week again? (sorry to pressure but deployment is soon and that's still an issue 🙈). I would look at it myself if I didn't already for > 5 hours, I just don't know enough about how this planner works yet. |
@tonynajjar I'm going to look into it this afternoon, but I think even before starting to program I know what it is. I've been having this on the back of my mind and I guess my brain was crunching on it subconsciously after you posted the updated image. I think what's happening is that we set the starting cell to clear. So, if you also happen to rotate in place within the same cell, that would also be free at any orientation, not just the starting pose. And that's what it looks like is happening with your example. Its easy for me (once this compiles and I come back from lunch) to test if that's the case with the same experiment I did above removing Then, I'll just need to change how we handle the starting point as "special" rather than editing the costmap for how we process it. Its a little unclear to me how it is that we have a virtual rotate-in-place for hybrid-A* at the start, since that's not a behavioral option. But, that is a behavioral option for State Lattice so that's a bug nonetheless. I gauge this to be done by EOD. Lunch first. |
I think this squares up this ticket :-) Thanks for filing this @tonynajjar and these are good details to have worked out. The start pose issue that we just handled is actually unique to Obviously happy to reopen if there were other topics, but I reviewed the thread and I think we hit all the others |
Cool! Will try it out tomorrow. I think there is one more thing to do in this issue - warn or block if the inflation layer is not configured big enough. We discussed it in the beginning |
I thought that was something you were going to open a PR for 🙃 |
I wanted to gather all the TODOs for this issue and tackle them at once but that's not how it panned out 😅 I'll reopen just in case my short-term memory fails me until I create another ticket (or PR) for it tmrw |
Add a ERROR log instead of a ticket and that should suffice! At least that way someone looking through logs to debug would see that they made a mistake even if we don't actively prevent them from making that error. |
Hi @SteveMacenski @tonynajjar @doisyg I am using ROS2 Iron, the clearStart fix is not needed as you have mentioned. However I am still facing collision issues. I have already increased the radius_inflation parameter either on local and global costmaps to be greater than the robot's footprint and set the resolution of costmaps from small values (0.01) to huge values ( 10.0). Can you please assist me to fix this issue? My nav2 parameters are in this link: truck_nav2_params.yaml |
Any chance your could provide reproducible examples like I did using the planner_playground? |
@tonynajjar I was busy while you were fixing the issue, I am reworking the planner tasks now...I've missed this part, I will check how to use this tool you have mentioned and then I can submit the results here. One question for now, In your case, you don't need to avoid an obstacle in the middle of your trajectory like me? |
I do, but that was never a problem if the inflation layer is configured correctly, only collisions at the start and the end |
Oh, that’s concerning 😬. Could I take a look at your configuration file and get the dimensions of your robot (footprint or radius)? For reference, my truck is 10 meters in length and 4 meters in width. I’ve set the inflation layer values to be greater than these dimensions, even using a value of 50.0. However, I’ve noticed that the costmap's obstacle layers seem to have a numerical value threshold. Visually, the inflation seems to stop increasing after setting a value around 5.0. |
You can get my planner config from my fork of the planner_playground I shared a few comments up |
Hi, guys! The same problem is discussed here and here regarding MPPI algorithm. Here is how the default behavior looks like: Here is the result with this modification: |
That simply removes the optimization and you pay for that in compute time. The issue is that your inflation layer is not properly setup as discussed at length in the ticket above. If using What happens if you set your inflation radius to be larger than your circumscribed radius? The tuning guide, ROSCon talks, and example configurations all highly suggest that you use a smoother potential field for performance and path quality improvements. While actual collisions shouldn't happen, this is why most people don't see this issue since having a tiny inflation with pure 0-cost space is a bad configuration for those algorithms. We find the cost to be possibly in collision using the circumscribed cost (https://github.com/iradionov/navigation2/blob/4b112116c5ae728185e7c4a7874c6a5879abf7ce/nav2_smac_planner/src/smac_planner_hybrid.cpp#L226), which in your case should render to |
Thank you for your prompt response. The last question: why do we get the cost of the shifted point? |
It has to do with the fact that casting it to an
That is a tuning thing with the |
Bug report
Required Info:
Steps to reproduce issue
I tried it in my custom simulation but I reckon it should also happen on tb3 simulation. You could try to set the global footprint bigger than the local one and try navigating close to a lethal obstacle on the global costmap.
This is the planner/costmap config:
Expected behavior
Create kinematically feasible path
Actual behavior
Resulting path makes planned footprints collide with global costmap
red thin rectangle = global and local footprint
red thicker rectangle = TEB debug visualization of the infeasible projected footprint
red path = SmacPlannerHybrid global plan
I also tried with a circular footprint, same result
Additional information
The text was updated successfully, but these errors were encountered: