
TF Serving for mask rcnn too slow #1991

Closed
vscv opened this issue Apr 6, 2022 · 3 comments


vscv commented Apr 6, 2022

I also have the same low-performance issue. I guess it mainly comes from two parts:

  1. It takes time to convert the image into a JSON payload and POST it.
  2. TF Serving itself is delayed (POSTs had already been made several times in advance as a warm-up).

In my POST test, the remote side (MBP + WiFi) takes 16~20 seconds to print res.json, while the local side takes 5~7 seconds. I also observed GPU usage, and it only ran (~70%) for less than a second during the entire POST.

# 1024x1024x3 image to JSON and POST
import sys, json
import numpy as np
import PIL.Image, requests

image_np = np.array(PIL.Image.open(sys.argv[1]))
payload = {"inputs": [image_np.tolist()]}
res = requests.post("http://2444.333.222.111:8501/v1/models/maskrcnn:predict", data=json.dumps(payload))
print(res.json())
godot73 commented Apr 15, 2022

I'm not sure this slowness really arises from TF Serving itself. From the description of the test results, it sounds like the real bottleneck is the preparation and transport of the payload data, not the model's computation. Perhaps it implies that the data is large relative to the model's complexity.

I would consider some pre-processing to reduce the payload size and/or using a faster language (e.g. C/C++) for payload prep.
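To make that concrete, here is a rough sketch (using a synthetic stand-in image, since I don't have your original input) that measures how expensive the JSON encoding alone is for a 1024x1024x3 uint8 image:

import json, time
import numpy as np

# Synthetic stand-in for the 1024x1024x3 input image.
image_np = np.zeros((1024, 1024, 3), dtype=np.uint8)

t0 = time.time()
body = json.dumps({"inputs": [image_np.tolist()]})
print(f"serialize: {time.time() - t0:.2f}s, payload: {len(body) / 1e6:.1f} MB")

A payload like this comes out on the order of 10 MB, which by itself can account for several seconds over WiFi.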

vscv commented Apr 18, 2022

Thanks, @godot73

After a comparison test using the same model and input image, an API built with Flask can achieve a response time of under one second when transmitting the 1024x1024 image remotely via POST with "file=open_img_file". So maybe the time difference really does come from data=json.dumps(payload). But TF Serving's REST API only accepts JSON delivery, which seems to be a dead end.
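For context, the comparison setup looked roughly like this (a minimal sketch; the endpoint name and model call are placeholders, not the exact code I ran):

# Minimal sketch of the Flask comparison: the image travels as a raw
# file upload instead of a JSON-encoded pixel array.
import io
import numpy as np
import PIL.Image
from flask import Flask, request, jsonify

app = Flask(__name__)
# model = ...  # assume the same Mask R-CNN model is loaded in-process

@app.route("/predict", methods=["POST"])
def predict():
    image_np = np.array(PIL.Image.open(io.BytesIO(request.files["file"].read())))
    # outputs = model.predict(image_np[np.newaxis, ...])  # placeholder model call
    return jsonify({"shape": list(image_np.shape)})

# Client side: a 1024x1024 JPEG is only a few hundred KB on the wire:
#   requests.post("http://<host>:5000/predict", files={"file": open(path, "rb")})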

gaikwadrahul8 commented

Hi, @vscv

Apologies for the delay. TF Serving supports both REST and gRPC. If you're looking for low latency and better throughput, you'll have to configure batching with the parameters below (you can tune their values; see the official documentation, the TensorFlow Serving Batching Guide). Also, gRPC is more network-efficient, with smaller payloads, and can provide much faster inference than REST; you can refer to these articles on TF Serving with gRPC [1], [2]. A minimal gRPC client sketch follows the batching example below.

Example batching parameters file (passed to the server via --batching_parameters_file, together with --enable_batching=true on the command line):

max_batch_size { value: 128 }
batch_timeout_micros { value: 0 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 8 }
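For the gRPC path, here is a minimal client sketch (assuming the tensorflow-serving-api package is installed, the server exposes gRPC on port 8500, and the signature input tensor is named "inputs" -- verify both with saved_model_cli):

# Minimal gRPC Predict client sketch; host, port, and input name are assumptions.
import sys
import grpc
import numpy as np
import PIL.Image
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

image_np = np.array(PIL.Image.open(sys.argv[1]))

channel = grpc.insecure_channel("2444.333.222.111:8500")  # 8500 = default gRPC port
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

req = predict_pb2.PredictRequest()
req.model_spec.name = "maskrcnn"
# dtype/shape must match the SavedModel signature.
req.inputs["inputs"].CopyFrom(tf.make_tensor_proto(image_np[np.newaxis, ...]))
res = stub.Predict(req, 30.0)  # 30-second deadline
print(res.outputs.keys())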

Could you please try the above workaround and confirm whether it resolves this issue for you? Please feel free to close the issue if it is resolved.

If the issue still persists, please let us know. To expedite the troubleshooting process, please provide a code snippet that reproduces the issue reported here.

Thank you!
