Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for unmarshaling a HTTPBody as raw data. #3094

Closed
veith opened this issue Dec 24, 2022 · 8 comments
Closed

Add support for unmarshaling a HTTPBody as raw data. #3094

veith opened this issue Dec 24, 2022 · 8 comments

Comments

@veith
Copy link
Contributor

veith commented Dec 24, 2022

It would be very nice if there is a type which can receive raw binary data as body, and all according metadata via headers (Content-Type video/mp4).

Browsers of today can send real binary data (const binaryData = new Uint8Array(buffer);).

Curl scripts also (curl --data-binary "@filepath").

This would reduce the transferred data size by 33%.

I refer to HttpBody, because this type is only usable as a "root" response type in a transcoded context, why not bidirectional?

Originally posted by @veith in #1781 (comment)

@veith veith changed the title Seems reasonable to add support for unmarshaling a HTTPBody as raw data. Add support for unmarshaling a HTTPBody as raw data. Dec 24, 2022
@johanbrandhorst
Copy link
Collaborator

Thanks for creating a separate issue for this. As described in my original reply, this should be a matter of implementing Unmarshal on our custom HTTPBody marshaler, and adding some tests. This is technically a breaking change for anyone using HTTPBody requests today, so we may want some way to turn off this new behavior. I'm assuming anyone using HTTPBody requests today actually wants the body to be provided verbatim.

@veith
Copy link
Contributor Author

veith commented Dec 26, 2022

I myself already had the fear that this would cause breaking behavior and would prefer if it is disabled by default. I have already specified some few APIs myself that would fall over because of this.

Question: Is there a type set from the gateway project itself? A new message type would avoid this problem. The next problem is the "extensions" field, which is also "difficult" to transport.

A message type RequestHttpBody or RawHttpBody for this feature could look like this:

message RequestHttpBody {
  // The HTTP Content-Type header value specifying the content type of the body.
  string content_type = 1;

  // The HTTP request/response body as raw binary.
  bytes data = 2;
}

I implemented an example that generates pb.gw.go code like the following, because no Unmarshal is needed i tried to extend the generator, another reason is, that any sent "Content-Type" should work, because sending application/octet-stream is not enough. How do i know, what kind of data was received:

func request_Files_UpdateFile_0(ctx context.Context, marshaler runtime.Marshaler, client FilesClient, req *http.Request, pathParams map[string]string) (proto.Message, runtime.ServerMetadata, error) {
	var protoReq UpdateFileRequest
	var metadata runtime.ServerMetadata

	buf := new(bytes.Buffer)
	_, berr := buf.ReadFrom(req.Body)
	if berr != nil {
		return nil, metadata, status.Errorf(codes.InvalidArgument, "%v", berr)
	}
	protoReq.Body = &httpbody.HttpBody{
		ContentType: req.Header.Get("Content-Type"),
		Data:        buf.Bytes(),
		Extensions:  nil,
	}

	var (
		val string
		ok  bool
		err error
		_   = err
	)
...

In the next days I want to try out how this could work with streams.

@veith
Copy link
Contributor Author

veith commented Dec 26, 2022

Which is the way that you would prefer to go?

  • writing a custom message type without the extensions field
  • using HttpBody and define headers for sending the extensions
  • using HttpBody and ignore the extensions at the moment
  • generate the pb.gw.go or implementing a pseudo unmarshaler

When using HttpBody, disable by default is not an option i think.

There is a lot to discuss before settling on a solution.

@johanbrandhorst
Copy link
Collaborator

I don't think we should require a new message type. We can add an option to the default marshaler to enable this behavior to avoid causing breaking changes. Forgive me, but I don't understand why we can't simply implement Unmarshal on the marshaler?

@veith
Copy link
Contributor Author

veith commented Dec 28, 2022

I don't think we should require a new message type. We can add an option to the default marshaler to enable this behavior to avoid causing breaking changes. Forgive me, but I don't understand why we can't simply implement Unmarshal on the marshaler?

As I understood, the runtime.WithMarshalerOption is controlled by content type. HttpBody can have any content type like application/json , that was the reason for me, to patch the protoc-gen-grpc-gateway. Please correct me if i am wrong on this.

To avoid breaking behavior there are following options:

  • Accept an option on protoc-gen-grpc-gateway for using the new behaviour
  • Generate a *.pb.gw.go which can handle both variants and accepts options for the runtime

There is another option which is not breaking: Implement client streaming for HttpBody only when used with path parameters. Because at the moment it is not possible to have client streaming with params.

internal/descriptor/services.go 135

		if md.GetClientStreaming() && len(tmpl.Fields) > 0 {
			return nil, errors.New("cannot use path parameter in client streaming")
		}

This could make sense, because the max message size will not be a problem because the gateway can control the chunk size. The parameters can be transported via ctx as Metadata, if needed. But this variant needs good documentation, to avoid confusion.

@johanbrandhorst
Copy link
Collaborator

Using runtime.WithMarshalerOption("*", marshaler) uses that marshaler for all content types. You can simply use

marshaler := &runtime.HTTPBodyMarshaler{EnableUnmarshal: true}
...
runtime.WithMarshalerOption("*", marshaler)
...

Does that make sense? I also am not sure how this applies to client streaming - using HTTPBody doesn't mean that the gateway will support client streaming requests, as it'll need to buffer the whole request body in memory before passing it to the marshaler still.

@veith
Copy link
Contributor Author

veith commented Jan 8, 2023

Does that make sense? I also am not sure how this applies to client streaming - using HTTPBody doesn't mean that the gateway will support client streaming requests, as it'll need to buffer the whole request body in memory before passing it to the marshaler still.

No, buffering is not needed with HttpBody client stream, because there is no marshaling needed. The implementation of the service on the grpc server side, can decide what they want to do with chunks.

Because this feature request could be breaking, and there is more clarification needed, i suggest to put this feature request on a list for version 3.

The problem I wanted to solve, can also be solved by adding a custom route.

@veith veith closed this as completed Jan 8, 2023
@emcfarlane
Copy link
Contributor

@veith maybe #500 (comment) is a solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants