Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: Promote a field in the returned JSON message to a top-level returned value #707

Closed
ivucica opened this issue Jul 24, 2018 · 20 comments
Closed

Comments

@ivucica
Copy link
Collaborator

ivucica commented Jul 24, 2018

Often pre-existing APIs return a list as a top-level JSON entity. Sometimes, the returned value may be a string or a number.

Example of an API that returns an array is Mastodon's Getting who an account is following call. I've run into other APIs with the same issue.

While I believe the default behavior should be as-is, It would be useful to allow promoting a field of a message to be the actual value of a message.

There are multiple ways to do it; we could add an option to a particular RPC, and we could add a grpc-gateway-specific option to a message. I'm leaning towards attaching it to a message as that allows doing this multiple times.

Proposing the following:

service Accounts {
  rpc GetAccountFollowing (GetAccountFollowingRequest) returns (AccountArray) {
    option (google.api.http) = {
      get: "/accounts/{id}"
    };
  }
  message AccountArray {
    repeated Account account = 1 [(grpc.gateway.options).unwrap_field = true]; // or promote_field?
  }

  message Account {
    string id = 1;
    string username = 2;
    string display_name = 3;
  }

This should replace the following default behavior:

{
  "account": [
    {
      "id": "1",
      "username": "test",
      "display_name": "Esther T."
    },
    {
      "id": "2",
      "username": "test2",
      "display_name": "Esther T."
    }
  ]
}

with the following:

[
  {
    "id": "1",
    "username": "test",
    "display_name": "Esther T."
  },
  {
    "id": "2",
    "username": "test2",
    "display_name": "Esther T."
  }
]

Implementing the filter in the gRPC-Gateway serving binary is reasonably trivial. I've taken net/http/httptest, recorded the output, unmarshalled the JSON into a map[string]interface{}, taken the account key, encoded it into JSON and then output it as the response. It's tolerable.

But it is a hack, and it doesn't affect the protoc-gen-swagger-generated definition.

Since APIs that return arrays as a top level entity, it would be excellent if we could agree on an option-based design that would be respected by both protoc-gen-grpc-gateway and by protoc-gen-swagger.

Is this already possible? Did I miss something? Did I miss an open issue tracking this feature request? Thoughts on where the option would belong -- on the rpc or on the message?

@achew22
Copy link
Collaborator

achew22 commented Jul 24, 2018

In my experience, I've deeply regreted not returning an object at the top level. It made migrating that endpoint a horribly breaking change and required months and months of work to get things back in shape. That said, if you're trying to reverse engineer and take over an existing API then that is what you must do and that is a very valid use case.

Looking through the docs on (HttpRule)[https://github.com/googleapis/googleapis/blob/master/google/api/http.proto], you almost want a "reply body" parameter attached. I haven't seen anything like that, but it might be a thing upstream is receptive to. If you could get it in the HttpRule proto then I would be 1000% in favor of supporting it.

How can I help with this?

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 24, 2018

In my experience, I've deeply regreted not returning an object at the top level. It made migrating that endpoint a horribly breaking change and required months and months of work to get things back in shape.

No objection here. In general, I like the API Design Guide myself. But yeah, it's about adopting an existing API.

you almost want a "reply body" parameter attached. I haven't seen anything like that, but it might be a thing upstream is receptive to. If you could get it in the HttpRule proto then I would be 1000% in favor of supporting it.

I believe that an attempt to add reply_body upstream will almost certainly result in a rejection unless I would include an upstream implementation :-) but for that, I honestly don't have the time nor motivation :-)

I guess I can ask upstream internally.

@johanbrandhorst
Copy link
Collaborator

Personally this seems like breaking a bit too far away from the underlying protobuf definition. It throws backwards compatibility to the wind, and I can't even begin to think of all the corner cases this is going to trigger. Given that this is a field property, have we considered how it might interact with query strings, partial path fields etc? It seems like a can of worms to me. What if you specify it twice? I understand the problem you're trying to solve but I'd be more inclined for us to have a special message type for this kind of thing. Maybe message grpc-gateway.Wrapper{ repeated google.protobuf.Struct element = 1; }? This we could add special cases for in the query parser and any JSPB implementations without having to extend the field descriptor. It'd be a bit like the WKT wrapper types.

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 25, 2018

Personally this seems like breaking a bit too far away from the underlying protobuf definition.

How? This can be a 'considered harmful' compatibility hack to be able to implement certain pre-existing APIs.

It throws backwards compatibility to the wind

How? If you don't specify the new option (or, if upstream agrees to it, whatever-we-end-up-calling reply_body) then you get the current behavior. No changes.

Given that this is a field property, have we considered how it might interact with query strings, partial path fields etc?

But, how do query strings and partial path fields apply if this only changes the response? No interaction with the request messages.

What if you specify it twice?

I considered it; I think it can depend on what's doable in implementation stage. I'm currently personally interested in behavior of a top-level field on a top-level response message getting promoted as a response itself. I guess if you specify it on a repeated message-type field which contains a repeated message-type field where the option is also specified, we could try moving that one step up, too, allowing for an array of arrays?

I'd be more inclined for us to have a special message type for this kind of thing. Maybe message grpc-gateway.Wrapper{ repeated google.protobuf.Struct element = 1; }

This at the very least invades the realm of the gRPC API design; you have to change how your gRPC API looks, when in fact you only want to change the REST API. It means constructing the response becomes very unwieldy.

An option (or a field in google.api.HttpRule) still seems reasonable, non-intrusive and clear inside the .proto file?

@johanbrandhorst
Copy link
Collaborator

How? This can be a 'considered harmful' compatibility hack to be able to implement certain pre-existing APIs.

Fair enough

How? If you don't specify the new option (or, if upstream agrees to it, whatever-we-end-up-calling reply_body) then you get the current behavior. No changes.

I don't mean backwards compatibility with the current gateway, just that if you have a service as you suggest and add a new field to your response message, that's impossible? It's breaking backwards compatibility guarantees given by protobuf. Adding new fields should always be possible, and for that to be true, there has to be a top level messages.

But, how do query strings and partial path fields apply if this only changes the response? No interaction with the request messages.

How do we tell the user this only applies to responses? Adding a field option that only applies in a subset of circumstances will need to obviously not apply in cases where it's not supposed to be used, or we might mislead users.

I considered it; I think it can depend on what's doable in implementation stage. I'm currently personally interested in behavior of a top-level field on a top-level response message getting promoted as a response itself. I guess if you specify it on a repeated message-type field which contains a repeated message-type field where the option is also specified, we could try moving that one step up, too, allowing for an array of arrays?

This seems like a nightmare to implement, are we going to allow this nesting indefinitely? Are we going to just document the limits? What if you put it on a map? A oneof? An enum? We could document that it only works in very specific well-defined circumstances, but it feels a bit like a hack unless we can make it universally applicable.

This at the very least invades the realm of the gRPC API design; you have to change how your gRPC API looks, when in fact you only want to change the REST API. It means constructing the response becomes very unwieldy.

This is a fair point - but constructing messages for specific JSON structures is already what the official well known types encourage us to do. The whole point of the google.protobuf.Struct type (as far as I can tell) is to give predictable JSON rendering. Same for the ScalarWrapper types.

An option (or a field in google.api.HttpRule) still seems reasonable, non-intrusive and clear inside the .proto file?

I just want to make sure we consider all potential use cases of such an option and document the limitations.

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 25, 2018

Thanks for the lively discussion -- but maybe we're straying into the "you are wrong for needing this feature" territory? :)

I don't mean backwards compatibility with the current gateway, just that if you have a service as you suggest and add a new field to your response message, that's impossible? It's breaking backwards compatibility guarantees given by protobuf. Adding new fields should always be possible, and for that to be true, there has to be a top level messages.

But this is not protobuf, this is JSON and it is a risk the developer can take. Again, this is a hack for implementing other, preexisting APIs -- something I would strongly discourage for new APIs.

How do we tell the user this only applies to responses?

In documentation and docstring for the option, I'd assume? How do we describe to the user how our view of the RFC6570 {templates=works/*}?

This seems like a nightmare to implement, are we going to allow this nesting indefinitely?

Recursion is a thing, no? But how deeply this is doable depends on the implementation -- which I didn't look at for now, as I wanted to gauge useful feedback on whether to spend time implementing this.

Are we going to just document the limits?

Yes. I foresee either limit of 1 or no limit.

What if you put it on a map? A oneof? An enum?

I'm not sure what you're asking here. If you put it on a map (I don't know, does grpc-gateway support maps?) you clearly promote the map to a top level object. If you put it on a oneof -- well, how do we encode it in JSON now? Same for enum. Same for string. Same for int. JSON doesn't ban you from any of those being top level values, to my knowledge?

We could document that it only works in very specific well-defined circumstances,

Yes, if that ends up being the implementation.

but it feels a bit like a hack unless we can make it universally applicable.

It is a hack that I would prefer not doing, but existing APIs are returning this today. I am interested in implementing a portion of Mastodon APIs in gRPC + gRPC-Gateway, but I am today blocked on one of the core APIs -- the one that lists statuses posted by a user account -- returning a list.

Any workaround I would do would be a way, way worse hack and much less useful than just upstreaming support relevant annotations for this into gRPC-Gateway.

This is a fair point - but constructing messages for specific JSON structures is already what the official well known types encourage us to do. The whole point of the google.protobuf.Struct type (as far as I can tell) is to give predictable JSON rendering. Same for the ScalarWrapper types.

What gRPC-Gateway does is supposed to be a "predictable rendering" of JSON, already, no? google.protobuf.Struct and google.protobuf.ListValue appear useful if you are representing I don't need arbitrary JSON encoding (which is what the quoted example you gave provides) and I believe it's much hackier to write a translator from a nice, well defined protobuf into a generic google.protobuf.Struct just so I could put it into google.protobuf.ListValue.

Using .Struct/.ListValue would mean that any gRPC (as opposed to REST) users of the same backend would have to translate the same .ListValue and .Struct back into an array of native proto structs. And they'd have to do that in every language. Or they'd have to use the REST API. Or the backend would have to implement an additional, gRPC-only method for, e.g., enumerating statuses, which would not be exposed as part of the same REST API as the remaining methods.

I just want to make sure we consider all potential use cases of such an option and document the limitations.

Sure. What other use cases do you propose users might have?

My purpose for this bug was to hear about what other elegant ways there might be to expose this functionality to developers. Is option on the field the best way? Maybe not. I did mention option on the rpc in the original post. I also liked Andrew's suggestion to integrate reply_body (or equivalent) into google.api.Http, as long as upstream is open to it. None of these changes much for implementation in grpc-gateway.

Any of these helps me annotate the strange behavior once (in the proto), while keeping generated implementation and swagger definition in sync.

I am not sure "developer won't know how to hold the phone right" is useful to decide on the implementation (they can design a broken API in a hundred different ways already) nor is "we must prevent the user from shooting themselves in foot in every way" or "we cannot possibly implement this without covering every single edge case".

Can you provide specific use cases, and specific confusing .proto files that you believe need clarification before implementing and merging something that would support this?

@johanbrandhorst
Copy link
Collaborator

johanbrandhorst commented Jul 25, 2018

I apologize if my replies come off as anything other than constructive criticism.

Isn't this actually possible to solve with .ListValue? Yes the method will be stupid to use for gRPC clients but you could expose two different methods, one wrapping the gRPC method and performing the conversion to and from the .ListValue and .Struct types. Is that too bad? There's always a cost to adding new features so it's not like it's a no brainer just because it can be done.

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 27, 2018

One of the goals is to generate OpenAPI representation from the same source. Unless I am missing something, a .ListValue will not be represented as a list of particular proto messages.

@johanbrandhorst
Copy link
Collaborator

This is true, so is that the real question then? This feature would allow the generated OpenAPI type signature to show repeated types returned, as opposed to having a generic list type. In the light of that, should we not consider adding a swagger specific annotation instead? It would mean the types in the protobuf definition wouldn't match the signature of the function in the swagger API, but at the same time it would avoid us having to create an official message descriptor extension.

What do you think?

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 30, 2018

at the same time it would avoid us having to create an official message descriptor extension.

If we introduce a new gRPC-Gateway-specific option on a message or on rpc, what makes that more or less official than a gRPC-Gateway-but-Swagger-specific field? Adding a Swagger-specific annotation does mean we are updating our "openapiv2" annotation protos.

It would mean the types in the protobuf definition wouldn't match the signature of the function in the swagger API

It would, unfortunately, also mean having to update protoc-gen-grpc-gateway to understand swagger-specific annotations. Otherwise, we are just moving the problem from "we have a hack to decode JSON, strip the outer dictionary, re-encode JSON" to "we have a hack to decode JSON, strip the outer dictionary, re-encode JSON, and then we also annotate the proto so we get the correct swagger file".

Why not have a short-and-simple, nicely-named annotation that says "we need to strip the outer dictionary", and which is correctly understood by both protoc-gen-swagger and protoc-gen-grpc-gateway? The annotation obviously won't work with Google's ESF, but that can be documented.

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 30, 2018

To paste a shorter variant of what I put into #712:

I suggest a careful examination of what's currently present in google.api.HttpRule documentation.

@johanbrandhorst
Copy link
Collaborator

@ivucica I can see you're being intentionally ambiguous, which is fine, but I'm interpreting this as you implying that this kind of thing exists-but-doesnt-exist-publically. That suggestion makes me a bit happier to accept this as it gives me some confidence it has been considered by a larger audience and corner cases dealt with, if any.

So we're back to a gateway specified field option, allowed only for one field per message, repeated or not, and only at the top level of any response hierarchy?

On a side note, that documentation is hilariously obviously not supposed to be public 😂.

Use this only for Scotty Requests.

Oh boy.

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 30, 2018

So we're back to a gateway specified field option, allowed only for one field per message, repeated or not, and only at the top level of any response hierarchy?

If we want to do as upstream did, then it's attached to an rpc :) but it may require some more parsing akin to what's done for body.

I don't mind attaching to message either.

On a side note, that documentation is hilariously obviously not supposed to be public 😂

Psst! Relevant organs have been notified

@johanbrandhorst
Copy link
Collaborator

Attaching to an rpc sounds even better then. Will you try and look at a PR for this?

@ivucica
Copy link
Collaborator Author

ivucica commented Jul 31, 2018

Interesting design requirement was noted in #712 and a good reason for the new field in google.api.HttpRule has been raised (namely additional_bindings). I'm not sure how to solve this elegantly; if we were solving this for an RPC, it'd be fine, but if we try to solve it per .HttpRule it's somewhat harder.

@doroginin n.b. I expect I will file the request for the option ID tomorrow.

@ivucica
Copy link
Collaborator Author

ivucica commented Aug 2, 2018

@doroginin I have filed a request for a new option ID.

@ivucica
Copy link
Collaborator Author

ivucica commented Aug 2, 2018

No need for the option ID or for further hackiness -- #712 will be able to go through once upstream changes land and then we can look at closing this FR.

...this is less work for me than I expected ;)

@johanbrandhorst
Copy link
Collaborator

This sounds like the best outcome for everyone, nice!

@achew22
Copy link
Collaborator

achew22 commented Aug 2, 2018

@ivucica and @johanbrandhorst, you two are the absolute best! Thanks for caring so much about this project 😃

@ivucica
Copy link
Collaborator Author

ivucica commented Aug 17, 2018

I believe this is fixed in #712

@ivucica ivucica closed this as completed Aug 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants