protoc-gen-swagger: should well known types be nullable #669

Closed
zheng1 opened this issue Jun 12, 2018 · 28 comments · Fixed by #2215

Comments

@zheng1

zheng1 commented Jun 12, 2018

  • OpenAPI 3.0 supports a nullable [link] field to declare that a schema's value can be null.
  • go-swagger supports the x-nullable vendor extension [link].

Should protoc-gen-swagger convert well-known types to nullable types?

@tmc
Collaborator

tmc commented Jun 19, 2018

@ivucica do you want to chime in here?

@tmc tmc added openapi and removed openapi labels Jun 19, 2018
@ivucica
Collaborator

ivucica commented Jun 26, 2018

@zheng1, I must admit I am confused at what's being proposed. Which well-known types would you make nullable and under what specific criteria?

Closest could be protobuf's built-in optional, but that's not quite the same semantically.

Finally, I am not aware of PRs that make protoc-gen-swagger generate OpenAPI v3. I believe it currently generates OpenAPI v2, so until that happens, the point is moot. I would not mind the proposal being better documented for when we get a v3 generator, of course.

@marcusljx
Contributor

Actually this is a problem we're having over at my workplace with google.protobuf.Timestamp, which, when using protoc-gen-swagger, is converted to the following OpenAPI v2 definition:

definitions:
......
      some_timestamp:
        type: string
        format: date-time

This assumes that the JSON string only fulfils one of these cases:

  • non-existent key (key-value pair not even appearing in the JSON object)
  • empty string ("")
  • filled string ("some_value")

The cases above ignore the case where a JSON timestamp is null, like so:

{
    "some_key" : "some_value",
    "some_bool" : true,
    "some_timestamp" : null
}

As mentioned by @zheng1, go-swagger supports this case through the x-nullable vendor extension (go-swagger/go-swagger#1491).

It would be great if protoc-gen-swagger could have an option flag or something similar for generating the x-nullable configuration for google.protobuf.Timestamp.
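
To illustrate why the distinction matters on the client side, here is a minimal Go sketch using only the standard library. The struct names are hypothetical, and time.Time stands in for whatever date-time type a generator would emit; the assumption is that a schema without x-nullable maps to a value field and a schema with x-nullable: true maps to a pointer field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// plainEvent is a hypothetical model for a schema without x-nullable:
// the timestamp is a plain value.
type plainEvent struct {
	SomeTimestamp time.Time `json:"some_timestamp"`
}

// nullableEvent is a hypothetical model for a schema with x-nullable: true:
// the timestamp is a pointer.
type nullableEvent struct {
	SomeTimestamp *time.Time `json:"some_timestamp"`
}

// timestampIsZero decodes doc into plainEvent and reports whether the
// timestamp ended up as the zero time.
func timestampIsZero(doc string) (bool, error) {
	var e plainEvent
	if err := json.Unmarshal([]byte(doc), &e); err != nil {
		return false, err
	}
	return e.SomeTimestamp.IsZero(), nil
}

// timestampIsNil decodes doc into nullableEvent and reports whether the
// timestamp pointer is nil (key missing or explicitly null).
func timestampIsNil(doc string) (bool, error) {
	var e nullableEvent
	if err := json.Unmarshal([]byte(doc), &e); err != nil {
		return false, err
	}
	return e.SomeTimestamp == nil, nil
}

func main() {
	docs := []string{
		`{"some_timestamp": null}`,
		`{"some_timestamp": "2018-06-12T00:00:00Z"}`,
	}
	for _, doc := range docs {
		isZero, errZero := timestampIsZero(doc)
		isNil, errNil := timestampIsNil(doc)
		fmt.Println(doc, "zero value:", isZero, errZero, "nil pointer:", isNil, errNil)
	}
}
```

With the value field, null is silently decoded to the zero time, so the caller cannot tell it apart from a real zero timestamp. With the pointer field, null (or a missing key) leaves the pointer nil.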

@johanbrandhorst
Collaborator

I think actually this x-nullable extension should probably be enabled on all non-scalar types. I assume message types are already nullable, so maybe the safest thing is to have a whitelist in the generator that inserts this property. Is there anything I can do to help you get a PR with this in @marcusljx?

@ivucica
Collaborator

ivucica commented Sep 13, 2018

I assume message types are already nullable

Unless they are proto2 required (which is "considered harmful", but it is possible).

@marcusljx
Contributor

I think actually this x-nullable extension should probably be enabled on all non-scalar types. I assume message types are already nullable, so maybe the safest thing is to have a whitelist in the generator that inserts this property. Is there anything I can do to help you get a PR with this in @marcusljx?

I don't mind trying my hand at this @johanbrandhorst, but I'm afraid I'm not sure where to begin.

@johanbrandhorst
Collaborator

I think @ivucica should be able to give some pointers.

@ivucica
Collaborator

ivucica commented Sep 13, 2018

Not sure when I'll be getting around to studying the problem space. That is, here are some necessary things I don't know off the top of my head:

  • what the x-nullable extension actually is, and how to make use of it
  • where it needs to be written in the output file
  • how to adapt template.go to do this
  • how to specifically check if a field -- whatever its type -- is considered required

Each of these items is relatively short, I'd just need to spend the time to figure them out. Maybe just enumerating them helps you?

However, note: OP says x-nullable is an OpenAPI v3 feature. As I stated in my June 26 comment, I think we are only generating OpenAPI v2 at this time.

If you want to support x-nullable, presumably you should first contribute a new protoc-gen-openapiv3 (by copying protoc-gen-swagger and adapting it to spit out OpenAPI v3), including creating all the necessary tests, and then proceed by adding x-nullable support to it. Unless x-nullable can be used in OpenAPI v2 specs, in which case, by all means, adapt template.go to spit it out for any optional/non-required fields. (Even though all proto3 fields are optional and thus nullable, proto2 is not like that.)

@marcusljx
Contributor

x-nullable is not restricted to OpenAPI v3, actually:
https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#patterned-objects

Thanks for the contribution steps!

@johanbrandhorst
Collaborator

If it's not oai3 specific then I think all of the above still apply, but the biggest problem I see is #4 on Ivan's list. It should by no means be impossible but it's no longer as simple as I first thought. Please feel free to give it a go though.

@fgblomqvist

This is still very valid. I don't really follow why this would be difficult to implement/change?
The way it'd be used would be something like this:

	".google.protobuf.Timestamp": schemaCore{
		Type:     "string",
		Format:   "date-time",
		Nullable: true,
	},

Which would spit out this:

definitions:
......
      some_timestamp:
        type: string
        format: date-time
        x-nullable: true

Are there other parts that need to be changed beyond the template/types?
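
As a rough illustration of the idea, the following Go sketch shows how a boolean field with an x-nullable JSON tag would serialize. The schemaCore type here is a local stand-in written for this example, not the generator's real type, and renderSchema is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaCore is a local stand-in for illustration only; the real type in
// the generator has more fields.
type schemaCore struct {
	Type     string `json:"type,omitempty"`
	Format   string `json:"format,omitempty"`
	Nullable bool   `json:"x-nullable,omitempty"`
}

// renderSchema (hypothetical helper) serializes the schema to compact JSON.
func renderSchema(s schemaCore) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := renderSchema(schemaCore{Type: "string", Format: "date-time", Nullable: true})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // {"type":"string","format":"date-time","x-nullable":true}
}
```

Because of omitempty, the x-nullable key disappears entirely when Nullable is false, so existing output would stay unchanged for schemas that do not opt in.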

@johanbrandhorst
Collaborator

Thanks for your interest in the project, I'd be happy to review such a PR!

@ivucica
Collaborator

ivucica commented Sep 20, 2020

@fgblomqvist Your example's syntax looks like Go code, and not like a .proto option definition. How would you define that a field is nullable? Just using proto2's required vs optional seems obvious, but all fields in proto3 are optional.

It seems like in OpenAPI 2, x-nullable is a vendor extension: https://stackoverflow.com/a/48114322/39974 -- would you be interested in designing, defining a .proto options schema for, and making generic x-ANYTHING-HERE support work?

As far as OpenAPI 3.0 and 3.1 go, I am not sure we support those in the first place (did someone contribute it?), but if we do, they're different too.

Looking forward to seeing a proposal!

@fgblomqvist

@ivucica Yes that's from the template.go file that was mentioned further up. I believe this issue solely focuses on the well-known types, aka. the wrappers that are provided by protobuf (the *.google.protobuf.*). AFAIK they're always nullable (but I'm a bit of a protobuf novice). Since we know that, we can safely output x-nullable for all fields that have one of those types.

I think the x-anything-here could be useful, sure, but it's a broader feature requiring different/more work.
And yes, afaik you don't support 3.0 or 3.1 so no need to worry about those yet (but those are easy enough to adapt to).

@ivucica
Collaborator

ivucica commented Sep 20, 2020

TL;DR PR #1665 added the external file support to the v2 branch, which should allow you to tag WKTs. Can you check if this works? Stream-of-thought ramblings below.


I believe this issue solely focuses on the well-known types, aka. the wrappers that are provided by protobuf (the *.google.protobuf.*)

It does seem the OP focused on WKT, yes -- I missed that.

AFAIK they're always nullable (but I'm a bit of a protobuf novice)

I don't think individual messages (i.e. types, structs, records, ...) can ever be defined as nullable in protobuf.

Message-type fields can be required or optional in proto2, just like primitive-type fields. I'm not sure if they will still get generated as pointers in Go and other languages, but that has no bearing on whether the (de)serializer will require them to be set.

In proto3, message-type fields will always be optional, which doesn't mean that you want to indicate in the API that you accept null.

Messages themselves have no 'nullability' property (the concept of whether the message itself is required or not). Fields that are typed as WKTs may or may not be required.

However... upon closer inspection, it seems that "making individual schemas (i.e. pb messages, structs, data types) nullable" is exactly what OpenAPI 3 (and presumably OpenAPI 2) is doing.

That is: in OpenAPI, the whole schema (pb message) has to be nullable or not. As a side note, this is confusing to me: why would you define a whole schema as nullable or not? Would it not be prudent to have a single schema defined in one place, separating the concern of whether or not it's nullable in various contexts? OpenAPI seems to have chosen to define two schemas, one nullable and one not. Unless I'm misreading something.


Having said this, yes, I assume we could somehow allow tagging messages as nullable, via an external config, and there could be an additional input field to modify the nullability property. This would allow WKTs to be tagged as nullable.

In my view, the best way to approach this would be to reference #1665 which added internal/descriptor/openapiconfig/openapiconfig.proto to the v2 branch. This means you can attach additional values defined in protoc-gen-openapiv2/options.

I propose adding a map<string,CustomExtension> field, let's say custom_extensions, to the grpc.gateway.protoc_gen_openapiv2.options.Schema message to allow for extensions, where CustomExtension is a oneof of bool, string and other types.

The key of this map would be mapped in the output as x-$KEY, with a value that could be bool, string or other types. A more flexible option is to use custom_extension_strings, a map<string,string> where the key still gets mapped to x-$KEY, but the value is raw JSON (strings such as null, {"a": "b"}, [5, 6, "ten"], "sonnet" or 19). Refer to #1636.
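
A minimal Go sketch of the map<string,string> variant described above, assuming the generator receives the map as a plain map[string]string. The function name renderExtensions is hypothetical and not part of the project; the only behavior taken from the proposal is the x-$KEY prefix and the raw-JSON values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// renderExtensions (hypothetical) maps every key to "x-"+key and keeps the
// value as raw JSON, rejecting values that are not valid JSON.
func renderExtensions(raw map[string]string) (map[string]json.RawMessage, error) {
	out := make(map[string]json.RawMessage, len(raw))
	for key, value := range raw {
		if !json.Valid([]byte(value)) {
			return nil, fmt.Errorf("extension %q: value is not valid JSON: %s", key, value)
		}
		out["x-"+key] = json.RawMessage(value)
	}
	return out, nil
}

func main() {
	exts, err := renderExtensions(map[string]string{
		"nullable": "true",
		"example":  `{"a": "b"}`,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	b, err := json.Marshal(exts)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(b))
}
```

Validating with json.Valid up front means a typo in the config (for example an unquoted string) fails at generation time instead of producing a broken swagger.json.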

You can already define your override to x-nullable: true by updating the relevant .yaml file (the one that satisfies openapiconfig.proto). Pass --openapi_configuration=yourfile.swagger.yaml.

Yep, while investigating this, I realized that it seems like this is already supported.

Please refer to protoc-gen-openapiv2/options/openapiv2.proto#L192 -- note the extensions field.

Note how there are many extensions defined in examples/internal/proto/examplepb/unannotated_echo_service.swagger.yaml and which get output into https://github.com/grpc-ecosystem/grpc-gateway/blob/ae3f3cc0db241c509ebc7b358039cb9f6af18fca/examples/internal/proto/examplepb/unannotated_echo_service.swagger.json.

Can you please check if this allows you to set x-nullable to true on a WKT?

@johanbrandhorst
Collaborator

@ivucica and I discussed this a bit offline and I think it comes down to the question of whether x-nullable is set on a parameter object (analogous to protobuf message fields) or on the schema object (analogous to protobuf messages). If it is the former, we should be able to determine that statically (e.g. is the field required?). There's still somewhat of a philosophical discussion about whether an "optional" field is the equivalent of a "nullable" field, but I lean towards that being the case.

Maybe the next step would be to test running one of the swagger generators with a manually enhanced generated swagger.json with some x-nullable sprinkled into the parameter objects where desired. How does that sound @fgblomqvist? I don't think this will be as easy as just always setting x-nullable, since it is not true when the message is used in a required field.

@ivucica
Collaborator

ivucica commented Sep 21, 2020

Alright, so I did misunderstand, indeed.

Re-reading the stackoverflow answer for reference of syntax, it looks like this is meant to be set on properties. However, we are inlining only primitive properties; any message type is going to be $ref'd and the global definition from definitions will be used. Therefore for messages, we need a nullable and a non-nullable schema.

Presumably, we can blindly generate examplepbUnannotatedSimpleMessage and optionalExamplepbUnannotatedSimpleMessage and just $ref the right one in the right place. If we want to be extra clean, we can eliminate the unused ones. Hence we'd get the variant of WKTs output as needed.

If we do choose to generate both of the schemas for each of the messages and enums, i.e. anything output to the definitions section, I suppose maybe:

  • defaulting to nullable version for anything optional (incl. all of proto3),
  • supporting google.api.field_behavior and
  • documenting field_behavior is how you flip the switch

would be fine?

And for the primitive types, we set x-nullable inline (i.e. next to a string, boolean, etc). No thoughts on oneofs and anything else. Am I on the right track?
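
The two-schema idea can be sketched in Go roughly as follows. This is only an illustration under assumptions: optionalName, refFor and definitionsFor are hypothetical helpers, the "optional" prefix follows the naming used in the example above, and the schema bodies are reduced to a bare object type.

```go
package main

import (
	"fmt"
	"strings"
)

// optionalName (hypothetical) derives the name of the nullable variant,
// e.g. fooBar -> optionalFooBar.
func optionalName(name string) string {
	if name == "" {
		return "optional"
	}
	return "optional" + strings.ToUpper(name[:1]) + name[1:]
}

// refFor (hypothetical) picks which definition a field should $ref:
// the plain one for required fields, the nullable one otherwise.
func refFor(name string, required bool) string {
	if required {
		return "#/definitions/" + name
	}
	return "#/definitions/" + optionalName(name)
}

// definitionsFor (hypothetical) emits both variants of a message schema;
// only the optional variant carries x-nullable.
func definitionsFor(name string) map[string]map[string]interface{} {
	return map[string]map[string]interface{}{
		name:               {"type": "object"},
		optionalName(name): {"type": "object", "x-nullable": true},
	}
}

func main() {
	name := "examplepbUnannotatedSimpleMessage"
	fmt.Println(refFor(name, true))
	fmt.Println(refFor(name, false))
	fmt.Println(len(definitionsFor(name)), "definitions generated")
}
```

Eliminating unused variants would then be a matter of collecting every $ref handed out by refFor and dropping definitions that were never referenced.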

The immediate workaround specifically for WKTs, for everyone that needs only nullable WKTs, is to try setting x-nullable via the override file per @johanbrandhorst's #669 (comment) and my #669 (comment)

@fgblomqvist

Thank you both for taking the time to think about this issue!

It does indeed seem like using that override file could potentially work, though, I have yet to investigate/try it myself.

And yes, you're right in that it's perhaps not as easy as it sounds to solve this, due to the "required"-stuff in proto2. I don't think the default should be nullable for all optional things either, since that's not really (afaik) what the proto spec says. In my mind, being optional is simply the same as omitting the field (in terms of JSON).

In proto3, WKTs should be outputted as nullable at all times. That's simple enough. However, I don't know what the rules would look like in proto2 since I haven't worked much with that.

@ivucica
Collaborator

ivucica commented Sep 21, 2020

I don't think the default should be nullable for all optional things either,

You are right, in the sense that value of null != absence of value != default value. However, some of it may be equal depending on whether you're talking about proto2 or proto3.

Broadly speaking:

  • proto2 treats unset fields as having default 'zero values'

    • On message fields, null is the closest approximation of an 'unset field'.
    • Go's classic proto API will generate pointer types on optional fields, where null means unset.
  • proto3 deprecates required fields and default values, and treats unset values as having zero values

    • Go's classic proto API will generate pointer struct types on messages, but nothing else, meaning you can technically represent a message as unset...
    • but that's not very useful, as the correct interpretation is really 'no field value has been set', i.e. they all have default zero-values

But that's the use of null in Go. What about JSON?

Only the language guide for proto3 specifies the JSON mapping. It does not say null should be emitted except in the case of the WKT NullValue, and it says null in incoming traffic can always be interpreted as the field's default value.

since that's not really (afaik) what the proto spec says

I assume you meant language guides for proto2 and proto3, because the proto specs for proto2 and proto3 are really just defining the language grammar for .proto files, using mostly BNF.

In proto3, WKTs should be outputted as nullable at all times.

Why just WKTs? Every message is equally nullable/non-nullable in proto3, and WKTs are not special. I'd say if someone is going to do this, it should be done right for all messages and fields and not just WKTs. It should be done for both proto2 and proto3:

  • emit both optional and required variant of the schema (bonus points for skipping the ones that turn out unused),
  • then for each field, either reference the optional or the required schema, or for primitive types set the correct nullability inline

You are right that some WKTs are a bit special: fieldmasks, timestamps, durations, lists and such are defined to be output as primitive types. But they then become regular primitive types. So, the same principle of 'announcing nullability' should be used for all fields. [You may note that I strangely kept wondering why WKTs are brought up as special. That's because I didn't remember that timestamps and such have special encodings.]

So either we use the optional concept (which is hidden in proto3, but still present because everything is optional) to announce nullability for all primitive types, or we default to not emitting vendor specific stuff, and only use the field_behavior for this.

  • To reduce noise in swagger.json, I propose we don't emit anything for any primitive field, and let API developers specify google.api.field_behavior per field.
  • Handling of non-primitive types -- message types, enums etc -- can be a separate effort, still tracked in this issue.

How does this sound?

@fgblomqvist

I think that sounds good.
While I myself don't have the time to contribute to this at the moment unfortunately, at least the goals have been clarified so it should be fairly clear for someone else (now or in the future) to better understand what needs to be done.

Thanks again for taking the time to write such thorough responses.

@ewhauser
Contributor

I've added support for google.api.field_behavior (specifically FieldBehavior.REQUIRED and FieldBehavior.OUTPUT_ONLY) in #1806. Having FieldBehavior.OPTIONAL set nullable behavior should be fairly trivial after this change is merged.

@irridia
Contributor

irridia commented Mar 6, 2021

Sorry, long post.

I implemented the field behavior approach on your changes, @ewhauser (thx!). Diff attached below for reference.

But... First, the definition of OPTIONAL in the field behavior spec is pretty clear that it's Optional in the proto v2 sense. And it has the attendant restrictions, including that the first (or only) field in a protobuf Message cannot be OPTIONAL.

While this "works", I'd have to change the order of one of my existing messages, and someone else down the road will have to remember that OPTIONAL will be silently ignored on the first field. Combine this with the fact that Nullable isn't really exactly the same as Optional, and I'm stuck with stinkface.

The motivation for me is that google.protobuf.Timestamp over swagger will always be populated with the delightful 0001-01-01T00:00:00Z string, which completely breaks the update_mask logic. Nullable causes go-swagger (e.g.) to use *strfmt.DateTime rather than strfmt.DateTime, which allows detection of a true "unspecified" value.
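
The zero-value problem can be reproduced with the standard library alone. In this sketch time.Time stands in for strfmt.DateTime (an assumption made to keep the example dependency-free), and the model types are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// valueModel is a hypothetical client model generated without x-nullable.
type valueModel struct {
	UpdatedAt time.Time `json:"updated_at"`
}

// pointerModel is a hypothetical client model generated with x-nullable.
type pointerModel struct {
	UpdatedAt *time.Time `json:"updated_at,omitempty"`
}

// encode serializes v to compact JSON, returning the error text on failure.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// An unset value field still serializes as a concrete timestamp.
	fmt.Println(encode(valueModel{})) // {"updated_at":"0001-01-01T00:00:00Z"}
	// An unset pointer field is simply left out.
	fmt.Println(encode(pointerModel{})) // {}
}
```

With the value field, the server always receives a timestamp and cannot tell "not specified" from "set to year 1", which is what interferes with update_mask handling.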

@ivucica I'm not convinced that, given the orthogonal semantics and restrictions of Proto's Optional, that Optional == Nullable is the best approach. Also, the extensions approach doesn't seem feasible. First, I don't see extensions as a valid field at the field level, and even if that worked, I would have to manually specify every google.protobuf.Timestamp field in all my protos in that yaml file. Which. Yeah. Unless I'm missing something?

Personally, I feel like choosing string/date-time may not have been the right approach for google.protobuf.Timestamp ({secs: int, nsecs: int}). I feel like it could have been a $ref to a true object rather than a RFC3339Nano string with an obtuse zero value. Thus, generators would likely be doing the "right" thing.

I would think a string with a format would still qualify as a scalar, so nullable non-scalars doesn't really scratch the itch.

I'm also suspicious of the concept of making WKTs nullable by default, since this feels like it conflicts with the protobuf mechanism and would of course break backwards-compatibility. It's also a harsh optional switch to throw, which would change generation for all WKTs.

Abandoning google.protobuf.Timestamp and using a "plain" string with a presumed RFC3339 format is safe, and I would probably already do that if I didn't want to retain the serialization benefits of g.p.T for GRPC clients.

The only somewhat narrow option I can think of is to have protoc-gen-openapiv2 pass x-nullable for string types with a format specified via a flag: these are the only cases I've seen where the 0-value problem causes intractable issues. And later the 3.0 and 3.1 analogues. Thoughts?

Until then, there's always sed -Ee 's/^([ \t]*)("format":[ \t]*"date-time")/\1"x-nullable": true,\n\1\2/'...
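
For anyone who prefers not to rely on line-oriented text matching, the same post-processing can be sketched in Go by walking the decoded JSON. This is a workaround sketch under assumptions, not something the generator provides: markDateTimeNullable and patchSwagger are hypothetical names, and any object whose "format" is the string "date-time" is treated as a timestamp schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// markDateTimeNullable walks a decoded JSON value and sets "x-nullable": true
// on every object whose "format" is the string "date-time".
func markDateTimeNullable(node interface{}) {
	switch n := node.(type) {
	case map[string]interface{}:
		if format, ok := n["format"].(string); ok && format == "date-time" {
			n["x-nullable"] = true
		}
		for _, child := range n {
			markDateTimeNullable(child)
		}
	case []interface{}:
		for _, child := range n {
			markDateTimeNullable(child)
		}
	}
}

// patchSwagger decodes a swagger.json document, marks date-time schemas as
// nullable, and re-encodes it with two-space indentation.
func patchSwagger(in []byte) ([]byte, error) {
	var doc interface{}
	if err := json.Unmarshal(in, &doc); err != nil {
		return nil, err
	}
	markDateTimeNullable(doc)
	return json.MarshalIndent(doc, "", "  ")
}

func main() {
	in := []byte(`{"definitions":{"Thing":{"properties":{"created":{"type":"string","format":"date-time"}}}}}`)
	out, err := patchSwagger(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

One side effect to be aware of: re-encoding through map[string]interface{} sorts object keys alphabetically, so the patched file will not preserve the generator's original key order.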

--- a/protoc-gen-openapiv2/internal/genopenapi/template.go
+++ b/protoc-gen-openapiv2/internal/genopenapi/template.go
@@ -388,6 +388,7 @@ func renderMessagesAsDefinition(messages messageMap, d openapiDefinitionsObject,
                        schema.MaxProperties = protoSchema.MaxProperties
                        schema.MinProperties = protoSchema.MinProperties
                        schema.Required = protoSchema.Required
+                       schema.Nullable = protoSchema.Nullable
                        if protoSchema.schemaCore.Type != "" || protoSchema.schemaCore.Ref != "" {
                                schema.schemaCore = protoSchema.schemaCore
                        }
@@ -2182,6 +2183,7 @@ func updateSwaggerObjectFromFieldBehavior(s *openapiSchemaObject, j []annotation
                        s.ReadOnly = true
                case annotations.FieldBehavior_FIELD_BEHAVIOR_UNSPECIFIED:
                case annotations.FieldBehavior_OPTIONAL:
+                       s.Nullable = true
                case annotations.FieldBehavior_INPUT_ONLY:
                        // OpenAPI v3 supports a writeOnly property, but this is not supported in Open API v2
                case annotations.FieldBehavior_IMMUTABLE:
--- a/protoc-gen-openapiv2/internal/genopenapi/types.go
+++ b/protoc-gen-openapiv2/internal/genopenapi/types.go
@@ -155,6 +155,7 @@ type schemaCore struct {
        Type     string          `json:"type,omitempty"`
        Format   string          `json:"format,omitempty"`
        Ref      string          `json:"$ref,omitempty"`
+       Nullable bool            `json:"x-nullable,omitempty"`
        Example  json.RawMessage `json:"example,omitempty"`
 
        Items *openapiItemsObject `json:"items,omitempty"`

@ewhauser
Contributor

ewhauser commented Mar 8, 2021

@irridia
Contributor

irridia commented Mar 8, 2021

Yeah, though to be fair they're annotating all of the fields as if they were proto v2, so I'm not sure it's intended to specifically address nullability of g.p.Timestamp?

@powerman

Just to make sure I didn't miss any solution in this thread: at the moment there is no way to distinguish between map<string, google.protobuf.Int32Value> and map<string, int32>, and thus there is no way to send map like {"key1":null,"key2":42} using grpc-gateway? No *.openapi.yml option or sed trick helps?

@irridia
Contributor

irridia commented Mar 11, 2021

@powerman Alas, not that I can see off-hand, and I'm assuming there's no google.protobuf markup for this situation. Even for a bare WKT Int32, you'd need the above patch. Applying OPTIONAL to a Map would/should only apply to the map itself, not its values/additional_properties.

And I wasn't able to figure out a workable yaml solution.

Conceivably, one could mark proto maps with some kind of // My value is nullable. comment, then create a tool that walks through generated swagger, matches fields to the original proto, and when the comment is found adds x-nullable to the matching object additional_properties (i.e., value):

          "additionalProperties": {
            "type": "integer",
    >>>     "x-nullable": true,    <<<
            "format": "int32"
          },

I'm just thinking out loud there—I'm not sure if x-nullable is schemaful in additional_properties.

FWIW, after having tried to use maps many times, I now try to avoid them in proto specs, opting for repeated Messages over the wire instead—even though it implies an additional conversion. The GRPC-GW concept is very powerful IMO, but there are rough edges that exist between OpenAPI and GRPC that require occasional compromise, IME.

@powerman

I'm just thinking out loud there—I'm not sure if x-nullable is schemaful in additional_properties.

Nope, this change in swagger.json doesn't affect grpc-gateway and it doesn't allow null values.

@irridia
Contributor

irridia commented Mar 11, 2021

Not what you want, I'm sure, but this is the only thing I can think of:

message Map {
    string key = 1;
    google.protobuf.Int32Value value = 2; // OAPI_NULLABLE
}

message Thing {
    repeated Map my_fake_map = 1;
}

Then use the script. I'm pretty sure this solution is worse than the problem, though.
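
The additional conversion this workaround implies could look roughly like the Go sketch below. The entry type is a hand-written stand-in for the Map message above (not generated code), with a nil pointer representing an unset wrapper value; toMap and encodeEntries are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry is a stand-in for the Map message: a key plus a wrapper value,
// where a nil pointer represents an unset (null) value.
type entry struct {
	Key   string
	Value *int32
}

// toMap converts the repeated entries back into a map with nullable values.
// Later entries overwrite earlier ones that share a key.
func toMap(entries []entry) map[string]*int32 {
	out := make(map[string]*int32, len(entries))
	for _, e := range entries {
		out[e.Key] = e.Value
	}
	return out
}

// encodeEntries renders the converted map as compact JSON.
func encodeEntries(entries []entry) (string, error) {
	b, err := json.Marshal(toMap(entries))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	v := int32(42)
	out, err := encodeEntries([]entry{{Key: "key1"}, {Key: "key2", Value: &v}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // {"key1":null,"key2":42}
}
```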
