
Support accepting gzipped requests #1091

Closed
orf opened this issue Sep 14, 2018 · 26 comments

orf (Contributor) commented Sep 14, 2018

Feature Request

Describe the problem the feature is intended to solve

Currently tensorflow-serving does not handle gzipped request bodies on the REST endpoint, which can slow down POSTing large volumes of data. If a gzipped body is sent, even with the correct headers, the request fails with a JSON parse error. Networks are fast, sure, but why send ~10 MB JSON bodies when you can send ~100 KB ones?

Describe the solution

If Content-Encoding: gzip is sent as a header then the body should be decompressed before parsing.

Describe alternatives you've considered

A reverse proxy that decompresses request bodies before passing them to tf-serving, but this is not an ideal solution and adds overhead.

Additional context

When encoding an image for inference, the resulting JSON can be very large, upwards of 10 megabytes. Gzipped, this is often reduced to ~400 KB.
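To illustrate the savings on the client side, here is a minimal sketch (Python standard library only) that gzips a JSON predict payload and attaches the Content-Encoding: gzip header. The model name, port, and payload shape are placeholders, not taken from the thread.

```python
import gzip
import json
import urllib.request

# Hypothetical payload: a large "instances" list, as sent to a
# TF Serving REST :predict endpoint.
payload = json.dumps({"instances": [[0.5] * 784] * 100}).encode("utf-8")
body = gzip.compress(payload)

print(f"raw: {len(payload)} bytes, gzipped: {len(body)} bytes")

# Build the request; the URL and model name below are illustrative.
req = urllib.request.Request(
    "http://localhost:8501/v1/models/my_model:predict",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",  # tells the server the body is compressed
    },
    method="POST",
)
# resp = urllib.request.urlopen(req)  # requires a running model server
```

For a highly repetitive JSON body like this one, the gzipped size is a small fraction of the raw size, which is the effect the issue is asking to exploit.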

ymodak (Contributor) commented Sep 26, 2018

@orf Thank you for the fix and verification. This is an interesting addition.

orf (Contributor, Author) commented Sep 26, 2018

Just to give an idea of the numbers: we have seen our payload size reduced by up to 15x (with a mean of 10x) and significantly reduced transfer times.

Using gRPC would be optimal here (as it supports compression?), but JSON is often better supported and easier to adopt piecemeal.

@ymodak ymodak self-assigned this Sep 27, 2018
gautamvasudevan (Collaborator) commented Oct 2, 2018

Per conversations - @wenbozhu is working on this.

gautamvasudevan (Collaborator) commented:

Fixed by b94f6c8

orf (Contributor, Author) commented Oct 17, 2018

Thank you very much @wenbozhu!

wenbozhu (Contributor) commented:

@orf Reading your FR again, I am not sure why there was a JSON parsing error when gzip was not supported.

Also, is gzip useful for the "download" (response) case?

orf (Contributor, Author) commented Oct 23, 2018 via email

wenbozhu (Contributor) commented:
"
Describe the problem the feature is intended to solve

Currently tensorflow-serving does not handle GZipped request bodies in the REST endpoint, which can slow down POSTing large volumes of data to tensorflow-serving. It fails with a JSON parse error if this is the case, even if the correct headers are sent. ....
"

orf (Contributor, Author) commented Oct 24, 2018

Oh right! Sorry. If you send a gzipped request body and the server does not decode it (because it does not support compressed requests) but blindly passes it to the JSON decoder, parsing fails, since to the parser the body is just opaque compressed bytes. That's where the error was coming from, I guess.

This should not happen now that your PR has been merged though.
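The failure mode described above can be sketched in a few lines (Python, illustrative): gzipped bytes handed straight to a JSON parser are rejected as garbage, while decompressing first succeeds.

```python
import gzip
import json

raw = json.dumps({"instances": [[1.0, 2.0]]}).encode("utf-8")
compressed = gzip.compress(raw)

# Passing the compressed bytes straight to the JSON parser fails:
try:
    json.loads(compressed)
    parsed_without_decompress = True
except (ValueError, UnicodeDecodeError):
    parsed_without_decompress = False

# Decompressing first, as a server honoring Content-Encoding: gzip would:
doc = json.loads(gzip.decompress(compressed))
print(parsed_without_decompress, doc["instances"])
```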

wenbozhu (Contributor) commented:
OK. Then is it important to support gzipped response bodies?

orf (Contributor, Author) commented Oct 24, 2018

I'd say yes: it doesn't seem like it would take much more effort now that this is done, and it's an easy win. If you are sending many requests in a batch, the responses can get quite large.

shlomiken commented Apr 7, 2019

Hi,
can someone give an example of how to send such a gzipped request to :predict from curl? It does not appear in the documentation, and I get 400 Bad Request when sending the gzipped JSON as binary data. For example:
curl -v --data-binary @body.gz -H'Content-Encoding: gzip' -X POST http://localhost:8501/v1/models/pos-model:predict

On the server I get these errors:
: Got zlib error: -3
[evhttp_request.cc : 199] RAW: Failed to uncompress the gzipped body
[evhttp_request.cc : 236] RAW: Got zlib error: -4
[evhttp_request.cc : 199] RAW: Failed to uncompress the gzipped body
[evhttp_request.cc : 236] RAW: Got zlib error: -4

ttang235 commented:

(quoting shlomiken's comment above)

Maybe it's because there are too many instances in body.gz?
I also have this problem. It looks like if I send fewer than 100 instances in one request it works, but if I send 500 instances it fails. The even more confusing thing is that the boundary isn't stable: sometimes the max number of instances tf serving can handle is 200, sometimes 188, etc. (If you're curious, I used binary search to find the boundary.)

Could anyone explain this?
I'm using tf serving 1.12.0.

ttang235 commented:

(quoting my previous comment)

Is the maximum size of the unzipped body 10 MB? (Based on kMaxUncompressedBytes in this code: b94f6c8.)
In my case, the total size of 200 instances is less than 300 KB, so it shouldn't be a problem, but it still sometimes fails. Why? Could you please help answer this? @wenbozhu

PS: I'm using tf serving 1.12.0 image in docker hub, in case it's related to this issue.

Thanks!

wenbozhu (Contributor) commented:

You mentioned both zlib error -3 and -4; I suppose the first one is a typo?

-4 means the compressed data is corrupted or the server failed to allocate memory. Is it possible that body.gz is wrong, or that curl double-compresses when you specify Content-Encoding: gzip manually?

shlomiken commented Apr 12, 2019

(quoting the previous comment)

This is not a typo; I got both -3 and -4 (maybe on different calls; I have now verified and mostly get -3).
The file was a JSON file (26 MB) that predicted successfully when uncompressed, which I zipped using zip json.gz file.json

wenbozhu (Contributor) commented:

You should use gzip, i.e. gzip file.json

gzip adds headers that include the size of the uncompressed data. This is important because we want to make one memory allocation for all the uncompressed data.
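One caveat to the comment above: the gzip format (RFC 1952) actually records the uncompressed size in its trailer rather than its header. The last four bytes of a gzip stream are ISIZE, the input length modulo 2^32. A quick sketch reading it with the Python standard library:

```python
import gzip
import struct

data = b"x" * 50_000
blob = gzip.compress(data)

# ISIZE: last 4 bytes of a gzip stream, little-endian,
# uncompressed length mod 2**32 (RFC 1952).
(isize,) = struct.unpack("<I", blob[-4:])
print(isize)  # 50000
```

This is also likely why a file made with zip fails: zip produces a ZIP archive, a different container format that a gzip inflater rejects as corrupt data (Z_DATA_ERROR, i.e. -3), whereas gzip file.json emits the RFC 1952 stream the server expects.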

shlomiken commented Apr 14, 2019

Hi @wenbozhu, thanks for helping out.
I zipped as you suggested with gzip file.json and ran this curl command:
curl -v -s --trace-ascii http_trace.log --data-binary @file.json.gz -H "Content-Type: application/json" -H "Content-Encoding: gzip" -X POST http://localhost:8080/v1/models/pos-model:predict

and get this as the HTTP trace on the curl side:

== Info: We are completely uploaded and fine
<= Recv header, 26 bytes (0x1a)
0000: HTTP/1.1 400 Bad Request
<= Recv header, 32 bytes (0x20)
0000: Content-Type: application/json
<= Recv header, 37 bytes (0x25)
0000: Date: Sun, 14 Apr 2019 13:27:42 GMT
<= Recv header, 20 bytes (0x14)
0000: Content-Length: 54
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 54 bytes (0x36)
0000: { "error": "JSON Parse error: The document is empty" }

On the model server I see this error:
evhttp_request.cc : 236] RAW: Got zlib error: -4 tenserve_app | [evhttp_request.cc : 199] RAW: Failed to uncompress the gzipped body

wenbozhu (Contributor) commented:

Thanks for the log; will look into this.

ZhouyihaiDing commented:

I am trying to debug this but I am not able to reproduce it.

I was using the TF model example, increased the number of instances, and used the curl command mentioned by shlomiken. However, I got the correct response.

shlomiken commented:

(quoting the previous comment)

Maybe it's a big file; mine is about 8.5 MB after gzip.

ZhouyihaiDing commented Apr 19, 2019

Thanks! Still looking at it.
FYI, the 10 MB limit on uncompressed size means the data size after decompression.
Your data is already 8.5 MB after gzip, so it's very likely the original size is greater than 10 MB, which would lead to a -4 error.

shlomiken commented:

(quoting the previous comment)

I don't understand: I sent 26 MB (uncompressed) without any problem; the 8.5 MB is the compressed size.
So is there a 10 MB limit or not? How did I manage to send 26 MB?

wenbozhu (Contributor) commented:

The limit is on uncompressed data, and it is enforced only when the request body is gzipped.

This does cause inconsistent API semantics, and we will fix it.

===

The uncompression is broken when the body is too large, and we need to fix this: either buffer the entire body or enable streamed uncompression.

Re-opening the bug to add the fix and a test.

@wenbozhu wenbozhu reopened this Apr 26, 2019
wenbozhu (Contributor) commented Apr 26, 2019

Correction: the intent of the current behavior is to use RequestHandlerOptions::set_auto_uncompress_max_size() to limit the uncompressed size (as a security limit).

If we expect bodies larger than 10 MB, then we need to override the default in the model server, e.g. to 100 MB.

@netfs

===

The fix in httpserver is still needed.

wenbozhu (Contributor) commented May 3, 2019

This is fixed upstream. The default maximum number of uncompressed bytes is now 100 MB.

@wenbozhu wenbozhu closed this as completed May 3, 2019
@misterpeddy misterpeddy added the type:performance Performance Issue label Nov 18, 2019