
TF Serving for mask rcnn too slow #1991

Closed
vscv opened this issue Apr 6, 2022 · 3 comments


vscv commented Apr 6, 2022

I also have the same low-performance issue. I guess it mainly comes from two parts:

  1. It takes time to convert the image into a JSON payload and POST it.
  2. TF Serving itself is delayed (POSTs had already been made several times in advance as a warm-up).

In my POST test, the remote side (MBP + WiFi) takes 16~20 seconds to print res.json, while the local side takes 5~7 seconds. I also observed GPU usage, and it only ran (~70%) for less than a second during the entire POST.

# 1024x1024x3 image to JSON and POST
import sys, json
import numpy as np
import PIL.Image, requests

image_np = np.array(PIL.Image.open(sys.argv[1]))
payload = {"inputs": [image_np.tolist()]}
res = requests.post("http://2444.333.222.111:8501/v1/models/maskrcnn:predict", data=json.dumps(payload))
print(res.json())
godot73 commented Apr 15, 2022

I'm not sure this slowness really arises from TF Serving itself. From the description of the test results, it sounds like the real bottleneck is the preparation and transport of the payload data, not the model's computation. Perhaps it implies that the data is large relative to the model's complexity.

I would consider some pre-processing to reduce the payload size and/or using a faster language (e.g. C/C++) for payload prep.
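To make that concrete, here is a rough sketch (using a synthetic stand-in image, since I don't have your original input) that measures how expensive the JSON encoding alone is for a 1024x1024x3 uint8 image:

import json, time
import numpy as np

# Synthetic stand-in for the 1024x1024x3 input image.
image_np = np.zeros((1024, 1024, 3), dtype=np.uint8)

t0 = time.time()
body = json.dumps({"inputs": [image_np.tolist()]})
print(f"serialize: {time.time() - t0:.2f}s, payload: {len(body) / 1e6:.1f} MB")

A payload like this comes out on the order of 10 MB, which by itself can account for several seconds over WiFi.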

vscv commented Apr 18, 2022

Thanks, @godot73

After a comparison test using the same model and input image, an API built with Flask can achieve a response time of under one second when transmitting the 1024x1024 image remotely via POST with "file=open_img_file". So maybe the time difference really does come from data=json.dumps(payload). But TF Serving's REST API only accepts JSON delivery, which seems to be a dead end.
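For context, the comparison setup looked roughly like this (a minimal sketch; the endpoint name and model call are placeholders, not the exact code I ran):

# Minimal sketch of the Flask comparison: the image travels as a raw
# file upload instead of a JSON-encoded pixel array.
import io
import numpy as np
import PIL.Image
from flask import Flask, request, jsonify

app = Flask(__name__)
# model = ...  # assume the same Mask R-CNN model is loaded in-process

@app.route("/predict", methods=["POST"])
def predict():
    image_np = np.array(PIL.Image.open(io.BytesIO(request.files["file"].read())))
    # outputs = model.predict(image_np[np.newaxis, ...])  # placeholder model call
    return jsonify({"shape": list(image_np.shape)})

# Client side: a 1024x1024 JPEG is only a few hundred KB on the wire:
#   requests.post("http://<host>:5000/predict", files={"file": open(path, "rb")})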

gaikwadrahul8 commented

Hi, @vscv

Apologies for the delay. TF Serving supports both REST and gRPC. If you're looking for low latency and better throughput, you'll have to configure batching with the parameters below (you can tune their values; see the official documentation, the TensorFlow Serving Batching Guide). Also, gRPC is more network-efficient, with smaller payloads, and can provide much faster inference than REST; you can refer to these articles on TF Serving with gRPC [1], [2]. A minimal gRPC client sketch follows the batching example below.

Example batching parameters file (passed to the server via --batching_parameters_file, together with --enable_batching=true on the command line):

max_batch_size { value: 128 }
batch_timeout_micros { value: 0 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 8 }
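For the gRPC path, here is a minimal client sketch (assuming the tensorflow-serving-api package is installed, the server exposes gRPC on port 8500, and the signature input tensor is named "inputs" -- verify both with saved_model_cli):

# Minimal gRPC Predict client sketch; host, port, and input name are assumptions.
import sys
import grpc
import numpy as np
import PIL.Image
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

image_np = np.array(PIL.Image.open(sys.argv[1]))

channel = grpc.insecure_channel("2444.333.222.111:8500")  # 8500 = default gRPC port
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

req = predict_pb2.PredictRequest()
req.model_spec.name = "maskrcnn"
# dtype/shape must match the SavedModel signature.
req.inputs["inputs"].CopyFrom(tf.make_tensor_proto(image_np[np.newaxis, ...]))
res = stub.Predict(req, 30.0)  # 30-second deadline
print(res.outputs.keys())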

Could you please try the above workaround and confirm whether it resolves this issue for you? Please feel free to close the issue if it is resolved.

If the issue still persists, please let us know. To expedite the troubleshooting process, please provide a code snippet that reproduces the issue reported here.

Thank you!
