TF Serving for mask rcnn too slow #1991
Comments
I'm not sure this slowness really comes from TF Serving itself. From the description of the test results, it sounds like the real bottleneck is preparation and transport of the payload data, not the model's computation. That suggests the data is large relative to the model's complexity. I would consider some pre-processing to reduce the payload size and/or using a faster language (e.g. C/C++) for payload preparation.
Thanks, @godot73. In a comparison test with the same model and input image, an API built with Flask achieved a response time of under one second when POSTing a 1024x1024 image remotely as "file=open_img_file". So the time difference probably really comes from data=json.dumps(payload). But TF Serving's REST API only accepts JSON, which seems to be a dead end.
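One way around the JSON-serialization cost, assuming the model's serving signature accepts encoded image bytes rather than a decoded float tensor, is TF Serving's binary-value convention: the REST API treats a `{"b64": ...}` object as a base64-encoded bytes value, so the compressed JPEG/PNG can be shipped instead of millions of JSON floats. A minimal sketch (the image bytes below are a placeholder, not a real JPEG):

```python
import base64
import json

# Placeholder standing in for real JPEG bytes of a 1024x1024 image;
# in practice this would be open("img.jpg", "rb").read() (~200 KB).
image_bytes = b"\xff\xd8" + bytes(200_000)

# Naive approach: decode to pixels and ship nested float lists.
# A full 1024x1024x3 float image serializes to tens of MB of JSON text;
# illustrated here with a tiny 4x4 patch to keep the example cheap.
pixel_payload = json.dumps({"instances": [[[0.5] * 3] * 4] * 4})

# TF Serving's REST API accepts binary tensor values as {"b64": ...},
# provided the serving signature takes encoded bytes as input.
b64_payload = json.dumps({
    "instances": [{"b64": base64.b64encode(image_bytes).decode("ascii")}]
})

# Base64 adds only ~33% over the raw compressed bytes.
print(f"b64 payload size: {len(b64_payload)} bytes")
```

Whether this helps depends on the exported SavedModel: the graph must do its own decoding (e.g. via a preprocessing layer), which is not how the stock Mask R-CNN signature is exported.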
Hi @vscv, apologies for the delay. TF Serving supports both REST and gRPC. If you're looking for lower latency and better throughput, you'll want to configure batching with the parameters below; you can experiment with their values (see the official TensorFlow Serving Batching Guide). gRPC is also more network-efficient, with smaller payloads, and can provide much faster inference than REST; see these articles on TF Serving with gRPC: [1], [2]. Example batching parameters file:
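A typical batching parameters file, in the text-proto format passed to tensorflow_model_server via the `--batching_parameters_file` flag (the values below are illustrative starting points to tune, not recommendations):

```
max_batch_size { value: 32 }
batch_timeout_micros { value: 5000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 8 }
```

Batching must also be switched on when starting the server, e.g. `--enable_batching=true --batching_parameters_file=batching.config`.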
Could you please try the above workaround and confirm whether it resolves the issue for you? Feel free to close the issue if it does. If the issue still persists, let us know, and to expedite troubleshooting, please provide a code snippet that reproduces the problem reported here. Thank you!
I also have the same low-performance issue. I guess it mainly comes from two parts:
In my POST tests, the remote side (MBP + Wi-Fi) takes 16 to 20 seconds to print res.json, while the local side takes 5 to 7 seconds. I also observed GPU usage: the GPU only ran (at ~70%) for less than a second during the entire POST.
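The GPU observation above is consistent with serialization, not inference, dominating the request. A rough way to check is to time json.dumps on a payload shaped like the REST request; the sketch below uses a hypothetical 256x256x3 image (a real 1024x1024 input is 16x larger, so the numbers scale accordingly):

```python
import json
import time

# Hypothetical payload shaped like a REST predict request:
# a 256x256x3 image as nested float lists.
image = [[[0.5, 0.5, 0.5] for _ in range(256)] for _ in range(256)]
payload = {"instances": [image]}

start = time.perf_counter()
body = json.dumps(payload)
serialize_s = time.perf_counter() - start

# The serialized body is what crosses the network; its size, not the
# model's compute, can dominate end-to-end latency over Wi-Fi.
print(f"serialized {len(body) / 1e6:.1f} MB in {serialize_s:.3f} s")
```

Comparing this serialization time (plus the transfer time implied by the body size and link speed) against the sub-second GPU burst makes the bottleneck visible without touching the server.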