
Slow loading time reported by Vroom when under heavy load #1180

Open
SimonBradley1993 opened this issue Nov 5, 2024 · 4 comments

@SimonBradley1993
Contributor

We have a setup using vroom-express deployed on large AWS EC2 machines, with Vroom handling around 500+ req/s. Under this load we're seeing frequent loading times of over 1s. Usually this loading time is in the low tens of milliseconds, up to around 100ms. OSRM doesn't report any particular slowdowns in response times. CPU usage remains low, around 20%, on all machines.

We've tried switching Vroom to libOSRM, and this exacerbated the problem, with loading times going into the multiple-second range.

Is there anything that can be improved with the loading time to resolve this issue when Vroom is under load?

Alternatively, is there any chance this value is being reported incorrectly by Vroom? Our service that interacts directly with the vroom-express instances doesn't show a slowdown in response times, but another service further up the chain seems to.

@jcoupey
Collaborator

jcoupey commented Nov 6, 2024

The reported loading time includes everything prior to actually running the solving approach:

  • parsing the JSON payload;
  • building up internal data structures;
  • computing the matrices (external calls to the routing engine);
  • precomputing various things used down the line.

In general, most of the time is spent computing the matrices, but of course if the system is under heavy load, all of the above could be slowed down. This is also very dependent on the matrix sizes and the routing setup (same machine or not, etc.).

The first thing that comes to mind would be to check whether OSRM spends more time per request under load, but you seem to have ruled that out. Next in line would be to investigate network latency, especially if OSRM is on a remote machine and/or behind a proxy that may be limiting throughput.
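One way to investigate the latency lead above is to time a number of round trips to the routing engine and compare the average against the worst case. The sketch below is a generic timing helper; the `send` callable is a stand-in for a real request (e.g. an HTTP call to OSRM's `/table` endpoint), here replaced by a short sleep so the snippet is self-contained:

```python
import time

def time_request(send, n=20):
    """Time `send()` over n calls; return (avg_ms, max_ms).

    `send` stands in for a round trip to the routing engine,
    e.g. an HTTP request to OSRM's /table endpoint.
    """
    samples = []
    for _ in range(n):
        t0 = time.monotonic()
        send()
        samples.append((time.monotonic() - t0) * 1000.0)
    return sum(samples) / len(samples), max(samples)

# Dummy workload standing in for a real OSRM call.
avg_ms, max_ms = time_request(lambda: time.sleep(0.001))
print(f"avg={avg_ms:.1f}ms max={max_ms:.1f}ms")
```

A large gap between the average and the maximum under load would point at queueing somewhere between VROOM and the routing engine (proxy, connection pool, network), rather than at the solving itself.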

@jcoupey
Collaborator

jcoupey commented Nov 6, 2024

> Alternatively, is there any chance this value is being reported incorrectly by Vroom?

The reported value is just a subtraction between two points in time, so nothing fancy or error-prone here. Again, this includes a variety of different tasks that may be impacted by many different factors.

@SimonBradley1993
Contributor Author

SimonBradley1993 commented Nov 6, 2024

Hi Julien,

Thanks for picking this up.

On the value being reported, I thought as much: that it is just a simple subtraction of two timestamps.

So, our setup has Vroom and OSRM on different machines, and I'd considered network latency and JSON parsing after reading another GitHub issue discussing slowdowns with OSRM.

This led us to deploying Vroom with libOSRM, meaning OSRM and Vroom were then on the same machine, which would rule out those factors, and we saw a significant decrease in performance for the loading step reported by Vroom.

It seems that even though the machine's CPU usage is quite low (around 20%), due to the volume of requests coming in, Vroom is struggling to perform the loading step as efficiently as it does under much lighter load.

Currently our solution is going to be to talk to OSRM directly and pass the matrices to Vroom, cutting out the loading work it has to do. Do you know if Vroom will still have to perform any computations on the matrices passed into the request?

@jcoupey
Collaborator

jcoupey commented Nov 6, 2024

> and we saw a significant decrease in performance for the loading step reported by Vroom.

Using libOSRM also requires some boilerplate to create objects in C++. I'd say whether the gain is worth the trouble depends a lot on how long your typical OSRM requests actually take. I have no data to back this thought, but if you have a lot of very small requests, this may be a lead.

> It seems that even though the machine's CPU usage is quite low, around 20%

Not sure how you measure this, but if the system load is reported as some kind of average over a certain period, then it is still possible to see low usage on average, with huge peaks where you experience slowdowns due to concurrent requests.
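The point about averaging can be illustrated with a toy example: a window-averaged utilisation metric can sit near 20% even when short bursts fully saturate the CPU. The per-second percentages below are made up:

```python
# Made-up per-second CPU utilisation over a 10-second window:
# mostly idle, with two 100% bursts where requests pile up.
samples = [5, 5, 100, 5, 5, 5, 100, 5, 5, 5]

avg = sum(samples) / len(samples)
peak = max(samples)
print(f"average={avg:.0f}% peak={peak}%")
```

Here the averaged figure looks modest while the bursts are exactly when concurrent requests would see long loading times, so a per-second (or finer) view of CPU usage during a slow request is worth capturing.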

> Currently our solution is going to be talking to OSRM directly and passing the matrices to Vroom to cut out the loading step that it has to do.

I don't really see how this would solve the problem if it is related to network or OSRM load. You'd most probably just be reproducing the same problem outside VROOM, in your additional layer.
