-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reach-based pruning for bidirectional astar #3257
Conversation
|
||
graph_tile_ptr t2 = nullptr; | ||
baldr::GraphId opp_edge_id; | ||
const auto get_opp_edge_data = [&t2, &opp_edge_id, &graphreader, &meta, &tile]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we call this logic in several places, so moved out this logic into a separate lambda function
// encountered on the opposing search we should do the same now: skip or traverse. | ||
if ((opp_edge_set != EdgeSet::kSkipped && | ||
hierarchy_limits[meta.edge_id.level() + 1].StopExpanding(pred.distance())) || | ||
opp_edge_set == EdgeSet::kPermanent || opp_edge_set == EdgeSet::kTemporary) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
choose to skip or use the shortcut based on the opposing search results
const auto opp_status = edgestatus_reverse_.Get(fwd_pred.opp_edgeid()); | ||
if (opp_status.set() == EdgeSet::kPermanent || | ||
(opp_status.set() == EdgeSet::kTemporary && | ||
edgelabels_reverse_[opp_status.index()].predecessor() == kInvalidLabel)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can pass through the destination edge (in case we have several snapping edges) and prune all branches earlier than pull out this edge from the queue. In order to prevent loosing such paths , we can add this connection even if the opposing edge marked as "temporary"
src/thor/bidirectional_astar.cc
Outdated
for (size_t level = TileHierarchy::levels().size() - 1; level > 0; --level) { | ||
if (hierarchy_limits_reverse_[level].StopExpanding(rev_pred.distance()) && | ||
!hierarchy_limits_forward_[level].StopExpanding(fwd_pred.distance())) { | ||
force_forward = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force forward pass in case the reverse search exhausted hierarchy limits for this level
src/thor/bidirectional_astar.cc
Outdated
break; | ||
} else if (hierarchy_limits_forward_[level].StopExpanding(fwd_pred.distance()) && | ||
!hierarchy_limits_reverse_[level].StopExpanding(rev_pred.distance())) { | ||
force_reverse = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force reverse pass in case the forward search exhausted hierarchy limits for this level
// Estimate lower bound cost for the shortest path that goes through the current edge. | ||
float route_lower_bound = | ||
fwd_pred_pred.cost().cost + fwd_pred.transition_cost().cost + rev_pred.sortcost() - | ||
astarheuristic_reverse_.Get(pred_tile->get_node_ll(fwd_pred_pred.endnode())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Estimate lower bound cost for the route that goes through the current edge. The technique from the article was adopted to the edges (in the article used vertices). Let me describe the formula for vertices. Let's suppose the current vertex we pulled out of the forward queue is v
(v'
- a vertex with min sortcost in the reverse queue), so the are following equations are true:
cost_f(v) + sortcost_r(v') - h_r(v) <= cost_f(v) + sortcost_r(v) - h_r(v) =
= cost_f(v) + cost_r(v) + h_r(v) - h_r(v) =
= cost_f(v) + cost_r(v) = cost(shortest path through v)
where
cost, sortcost, h - costing function, sortcost, heuristic function;
*_f, *_r - indicates forward and reverse directions respectively.
these equations say that cost_f(v) + sortcost_r(v') - h_r(v)
is a lower bound cost for the route through v
. Using this logic we can replace vertices by edges and get approximately the same formula for our case.
if (desired_paths_count_ == 1) { | ||
cost_threshold_ = sortcost + kThresholdDelta; | ||
cost_threshold_ = c + kThresholdDelta; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct cost threshold. it guarantees that the optimal route will not be missed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dnesbitt61 looks like we can skip kThresholdDelta
in case of ignore_hierarchy_limits=true
(no hierarchy prunings, no shortcuts). But it should be tested on some big bicycle/pedestrian dataset that I don't have right now. It can be done in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, on my local planet I didn't notice any difference on pedestrian_roundabout_routes.txt
, bicycle_routes.txt
, and pedestrian_routes.txt
from test_requests
folder
51a326e
to
00eb289
Compare
00eb289
to
711c629
Compare
i have to ask it 😄 whats the performance change, is it similar to the comment you posted before when you first started this approach? also thank you so much for the extensive description it goes a long way to helping make the changes reviewable. i will take an in depth look at the end of the day |
ComparingShortest pathPerformancepassed requests with
branch
branch-with-no-pruning
I measured timings for the master branch, current branch and current branch with disabled pruning to showcase the influence of pruning technique on performance. As we can see, the average response time increased only by 8% for the branch and by 46% for the branch with disabled pruning. Routes qualityNumber of routes to analyze: 13797 This branch affected 687 (out of 13797) routes, it's about 5%. Final cost of the optimal path was improved in 581 cases and degraded in 106 cases (in terms of ETA, the numbers are the following: 532 <-> 156). So, we can say that in 85% of cases changes were positive. It's important to notice, that I got only 10 routes (out of 13797) that were different for the branch with pruning and without pruning. So, in 99.9% of cases we can be sure that the pruning technique didn't cut a branch with more optimal route. The reasonable question may be here: why we actually got a difference ? - after analyzing differences I found out that the main reason is this one #2979 . We can't completely "synchronize" shortcuts usage for bidirectional search, there may be a cases when the shortcut was expanded in forward direction but we came into the middle of the shortcut using reverse direction. To solve this we probably should recover shortcuts while traversing trees, but it's too expensive now. AlternativesPerformancepassed requests with
branch
We see that the timings in case of alternatives routes search became even a little bit faster. Alternatives count
In the table above we can see the difference in the number of alternatives that were found for test routes. Significant improvement in case of the branch. |
Thank you! I posted results for the "shortest path" case, also will add results for alternatives. |
I tried some sort of reach based pruning a couple of years ago and saw some performance gains for bicycle routes but there were too many changes to routes for me to feel comfortable. I tried this branch for my long, test bicycle route and performance decreased by 21%. I would advocate creating a simpler, bidirectional A* algorithm for use where no hierarchies exist (that is how I create my data for bicycling) or are needed (e.g. bicycle and pedestrian routes). |
// Keep the best ones at the front all others to the back | ||
best_connections_.emplace_back(CandidateConnection{fwd_edge_id, rev_pred.edgeid(), c}); | ||
|
||
if (c < best_connections_.front().cost) | ||
std::swap(best_connections_.front(), best_connections_.back()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any significance to moving this down here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes i see it, in the first if second part of the or clause, you want ot check the current best connection against this one
src/thor/bidirectional_astar.cc
Outdated
} else { | ||
cost_threshold_ = sortcost + std::max(kAlternativeCostExtend * sortcost, kThresholdDelta); | ||
cost_threshold_ = (1.f + kAlternativeCostExtend) * c + kThresholdDelta; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just set the constant to 1.1 and explain its a multiplier in the comment for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also i notice the semantic difference of always adding the thresholdDelta rather than using it only if its greater than the sortcost. maybe this is worth a description? i guess its because now we dont double the sortcost we use the actual cost of both sides of the route added together, scale that by 1.1 to do the extension and then we add some fluff (delta) just in case the route is super short and the scaling doesnt do much. anyway a comment to that effect would make this less magic. i know we have some above but i feel like they dont capture the full picture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, for short routes scale 1.1
is not enough to find alternatives. Ok, will add a comment about that
@@ -617,11 +643,31 @@ BidirectionalAStar::GetBestPath(valhalla::Location& origin, | |||
return {}; | |||
} | |||
|
|||
// Exhaust hierarchy limits simultaneously in both directions. As soon as forward/reverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dnesbitt61 i am wondering about your comment on performance for routes that dont do hierarchy culling. where do you think that the slow down is coming from? the way this code reads to me is that we are trying to balance the searches, but wouldnt this never happen for bike and ped routes? since stop expanding would never return true? i'm just wondering where the heck the performance drop is coming, maybe its all just the updated heuristic using the actual cost. did you notice if it was more iterations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on my local machine I noticed that iterations increased by ~6%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also take into account the fact if we use hierarchies or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is more iterations on an already long search: about 6% increase on forward direction and 10% on reverse. Bicycle and pedestrian routes need a way to decrease the search space/# of iterations and I was hoping this might since the title has the word pruning!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I already mentioned this PR actually fixes a "bug". If it wasn't, we would discuss iterations decrease now
(opp_status.set() == EdgeSet::kTemporary && | ||
edgelabels_reverse_[opp_status.index()].predecessor() == kInvalidLabel)) { | ||
if (SetForwardConnection(graphreader, fwd_pred) && | ||
opp_status.set() == EdgeSet::kPermanent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am wondering what this added check does its not clear to me why we check for permanent here. is this the part you are refering to when you say
wasn't still pulled out of the queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this just says that if the opposing edge marked as "temporary" we should keep searching. We can skip this branch only when both forward and reverse searches marked the edge as "permanent"
return false; | ||
|
||
const auto& opp_edgestatus = FORWARD ? edgestatus_reverse_ : edgestatus_forward_; | ||
const auto opp_edge_set = opp_edgestatus.Get(opp_edge_id).set(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess this hash lookup isnt 100% free 😄 maybe for modes of transport that dont use shortcuts we can disable this to help bike and ped performance
what do you think @dnesbitt61 and @genadz ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense
honestly that wasnt too bad to review, i had a couple small questions but it made sense to me mostly. the performance numbers for the auto routes look fine to me. the basic gist is that the 99% percentile of an already skewed request set towards very long routes, shows a pretty big performance drop. honestly i dont care too much about p99 but i guess my quesiton is, is @dnesbitt61 's route a p99 route or a p50 route? is bike routing particularly impacted or is it just as bad as auto and its that we are focused on the 99th percentile? i feel like we shouldnt focus on the 99th percentile so much tbh. i mean i understand we want to be performant on very long routes, but if those constitute 1% of the traffic that would come to a running service it doesnt make sense to focus so much effort on making them efficient at the cost of getting worse routes. another track to take with that is we could relax some of the optimality when we know the route is going to be very long. one could argue that on a very long route optimality of somethign that is already a combination of imperfect measurements isnt that important when we are talking about costs that differ by 1% or something over thousands of miles. i'm not sure if implementing this wouldnt make the code spaghetti but it is a possibility. the problem here is that we cant really just measure performance in the typical way to see what we need to optimize. here the algorithm itself causes more iterations, we're probably already optimized on the cpu cycles that an iteration takes so the only thing we can do is intelligently remove iterations by culling search space. |
to avoid useless mainpulation with hiearchies in case of bicycle and pedestrian costs
b7943c7
to
a89f1c9
Compare
That's why I did some stuff about shortcuts "synchronization" and balancing hierarchies during trees expansion. It eliminated false positive prunings.
As I mentioned above, on my local machine I noticed increase about 6% in number of iterations. But, the same time this test
that you proposed some time ago showed greater performance drop. With last changes (using I think, taking into account all described here #3257 (comment) is acceptable drop. (we shouldn't forget that this PR fixes actually a bug) |
interesting thoughts. Probably we can do something similar to what alternatives routes search does: after we found a route - limit number of additional iterations. should it be implemented in this PR ? |
Thank you! I will consider this a compliment))) |
I saw you did some of the optimizations where certain costings dont do hierarchy culling, did you mean that neither of them had any effect? thats too bad. from my perspective this is ready to be shipped. @dnesbitt61 do you think we should further investigate ped and bike performance before merging? i think it would be ok to do a follow up where we look into reducing the number of iterations for ped and bike for longer routes specifically where accuracy isnt as important |
not really, here #3257 (comment) I mentioned that it's still slower than master by ~10% (on my local machine and planet) using David's test. (before using
@dnesbitt61 could you check if performance changed after introducing |
@dnesbitt61 |
maybe it should be "prefer_roads": 0.0 - meaning the bicycle route should try to stay on cycleways and bicycle lanes (basically a safer route if possible). |
@dnesbitt61 yep that bit is clear i was just pointing it out because i assume it puts a lot of penalties which would i think make a worst case for expansion (big costing numbers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good to me but i'd like @dnesbitt61 to weigh in on whether we need to do performance optimization for bike and ped now or we can do it in a separate pr
I am fine doing performance optimization in another PR. I may use a simplified bidirectional A* for cycling in the meantime, selecting it if the data set does not have hierarchies and shortcuts (as noted in the mjolnir config). |
Issue
closes #2928 (incorrect stop criterion in the bidirectional astar algorithm).
There are following changes were made.
Tasklist
Requirements / Relations
Link any requirements here. Other pull requests this PR is based on?