Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify data model types and formats #4045

Merged
merged 9 commits into from
Aug 27, 2024
Merged

Conversation

handrews
Copy link
Member

@handrews handrews commented Aug 20, 2024

This is what I think is actually correct. It aligns the table with how JSON Schema behaves. While I dislike re-explaining other specs in the OAS, it seems that a certain amount of explanation is required (because this is truly confusing).

I started with 3.1.1 because it uses unmodified JSON Schema. I think we should do the same with 3.0.4 as I don't think 3.0 actually adds any normative requirements that would change this, but we can sort that out if and when this PR is accepted.

Paging @baywet @mkistler @ralfhandl @darrelmiller

@handrews handrews added the clarification requests to clarify, but not change, part of the spec label Aug 20, 2024
@handrews handrews added this to the v3.1.1 milestone Aug 20, 2024
@handrews handrews requested a review from a team August 20, 2024 21:10
@handrews handrews force-pushed the format branch 2 times, most recently from a309c46 to f8bfbfc Compare August 20, 2024 22:59
versions/3.1.1.md Outdated Show resolved Hide resolved
Copy link
Contributor

@ralfhandl ralfhandl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, minor nits.

How about mentioning type: "integer" in the Schema Object section, similar to #4042 (which I could then close)?

versions/3.1.1.md Outdated Show resolved Hide resolved
versions/3.1.1.md Outdated Show resolved Hide resolved
ralfhandl added a commit to ralfhandl/OpenAPI-Specification that referenced this pull request Aug 21, 2024
Copy link
Contributor

@baywet baywet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I does make things a bit clearer. Was the plan to retro fit 3.0.4 as well?

Copy link
Contributor

@ralfhandl ralfhandl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid referencing multiple JSON RFCs

@handrews
Copy link
Member Author

@ralfhandl

How about mentioning type: "integer" in the Schema Object section, similar to #4042 (which I could then close)?

We should not be mentioning this in the Schema Object section at all. It is just standard JSON Schema. As @darrelmiller noted elsewhere, the point of using JSON Schema draft 2020-12 as-is in OAS 3.1 was to stop having to say things about JSON Schema in our own spec.

I would honestly prefer not to even say this much about the type system, but it's clear that more explanation was needed to properly contextualize the formats table.

But I draw tha line at duplicating JSON Schema info in the Schema Object section.

Copy link
Contributor

@mikekistler mikekistler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a few edits that I hope are improvements. I'm not wed to any of these but requested changes just to ensure we get to discuss these before a merge.

Models are defined using the [Schema Object](#schema-object), which is a superset of JSON Schema Specification Draft 2020-12.

Data types in the OAS are based on the six types supported by the [JSON Schema Specification Draft 2020-12 data model](https://tools.ietf.org/html/draft-bhutton-json-schema-00#section-4.2.1): array, boolean, null, number, object, or string. JSON Schema keywords and `format` values operate on these six types, with certain keywords and formats only applying to a specific type. For example, the `pattern` keyword and the `date-time` format only apply to strings, and treat any instance of the other five types as _automatically valid._ This means JSON Schema keywords and formats do **NOT** implicitly require the expected type. Use the `type` keyword to explicitly constrain the type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an excellent addition! A common problem I see is schemas that contain "properties" and assume that this implies "type: object" -- and I too made this error early on, until a wiser colleague educated me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad this makes sense - I am generally opposed to repeating information from JSON Schema, except where it has clearly been problematic and there isn't an easy single spot to reference in the JSON Schema text. That's why I'm OK with this but do not want to do anything but link to the section on what constitutes an integer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

versions/3.1.1.md Outdated Show resolved Hide resolved
Models are defined using the [Schema Object](#schema-object), which is a superset of JSON Schema Specification Draft 2020-12.

Data types in the OAS are based on the six types supported by the [JSON Schema Specification Draft 2020-12 data model](https://tools.ietf.org/html/draft-bhutton-json-schema-00#section-4.2.1): array, boolean, null, number, object, or string. JSON Schema keywords and `format` values operate on these six types, with certain keywords and formats only applying to a specific type. For example, the `pattern` keyword and the `date-time` format only apply to strings, and treat any instance of the other five types as _automatically valid._ This means JSON Schema keywords and formats do **NOT** implicitly require the expected type. Use the `type` keyword to explicitly constrain the type.

Note that the `type` keyword allows `"integer"` as a value for convenience, but keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-6.3), meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.
Copy link
Contributor

@mikekistler mikekistler Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note that the `type` keyword allows `"integer"` as a value for convenience, but keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-6.3), meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.
Note that the `type` keyword allows `"integer"` as a value for convenience, but JSON Schema keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema Validation defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-validation-00#section-6.1.1) as "any number with a zero fractional part", meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Section 6.3 "Mathematical Integers" is correct here. There is no 6.1.1 in that draft of JSON Schema.

I am against "any number with a zero fractional part" as a description as it is ambiguous as to whether it refers to the value or the textual representation.

Again, this is a JSON Schema thing. JSON Schema's wording is what matters, and OpenAPI CANNOT change the requirement. The only thing we can or should do here is reference the appropriate section of JSON Schema.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite right that there is no 6.1.1 in that draft. Let me see if I can find what I was looking at. In any event, I don't believe section 6.3 "defines integers -- this is the problem I was trying to fix, so let's find a good solution to that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@ralfhandl ralfhandl Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON Schema Validation defines "integer" in section 6.1.1

or "integer" which matches any number with a zero fractional part

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated my suggested change. @handrews please review.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikekistler the point of this link is to explain why 1.0 is an integer. The core spec §6.3 does that. That is literally the entire purpose of that section.

This text here already explains that the type keyword allows "integer". If you also want to link to the type keyword's section, we can do that on the word type (although I'd rather not- we don't do that for any other mention of any other JSON Schema keyword AFAICT so it would be a confusing inconsistency).

If you want to look up the details of type, it's obvious that you should look up its section in the validation spec. On the other hand, the mathematical nature of integers in JSON Schema is a recurring point of confusion, and it's less obvious where to resolve it. Hence linking to the section that explains it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@handrews When I follow the link to section 6.3 I see this text:

Some programming languages and parsers use different internal representations for floating point numbers than they do for integers.
For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

I don't think there is anything here that "defines integers", and I don't know why I would conclude from this text that "1.0" is an integer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikekistler OK let me think on this more. §6.1.1 of the validation spec still isn't what should go here.

versions/3.1.1.md Outdated Show resolved Hide resolved
@ralfhandl ralfhandl self-requested a review August 22, 2024 12:13
We had an anchor present for data type formats, but without
a section headering.  At this point, there is enough
information to have a section heading.

Also took @mikekistler's suggestion on flipping the first two
table columns, and tried "JSON Data Type" as a compromise
between "Data Model Type" and "Instance Type".
@handrews
Copy link
Member Author

I have added a new commit based on the feedback, doing the following:

  • Change from "Data Model Type" to "JSON Data Type" (which I could have sworn was being used elsewhere but now I can't find it)
  • Flip the columns as @mikekistler suggested
  • Replace the anchor on the paragraph before the table with a section heading that matches the former manually-added anchor (we now have enough text before that point to justify a subsection)
  • Remove the self-reference links in the column headers as @ralfhandl suggested (the format link is still a self-reference after the subsection is added, and the link to data types heading has been moved to the comment under the table about the nature of integers and type as it seemed to me to fit better there.

ralfhandl
ralfhandl previously approved these changes Aug 22, 2024
ralfhandl
ralfhandl previously approved these changes Aug 22, 2024
@ralfhandl
Copy link
Contributor

@mikekistler If you are fine with the current state, please approve and merge. Thanks!


#### Data Type Format

As defined by the [JSON Schema Validation specification](https://tools.ietf.org/html/draft-bhutton-json-schema-validation-00#section-7.3), data types can have an optional modifier property: `format`. As described in that specification, `format` is treated as a non-validating annotation by default; the ability to validate `format` varies across implementations.

The OpenAPI Initiative also hosts a [Format Registry](https://spec.openapis.org/registry/format/) for formats defined by OAS users and other specifications. Support for any registered format is strictly OPTIONAL, and support for one registered format does not imply support for any others.

Types that are not accompanied by a `format` property follow the type definition in the JSON Schema. Tools that do not recognize a specific `format` MAY default back to the `type` alone, as if the `format` is not specified.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW ... I think there is a minor conflict with JSON Schema here, since it has the Format Assertion Vocabulary and:

When the Format-Assertion vocabulary is declared with a value of true, implementations MUST provide full validation support for all of the formats defined by this specification.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikekistler by default what you get is the Format Annotation Vocabulary, to which implementations MAY add whatever opt-in validation they want to provide. This is because some major JSON Schema implementations refused to fully validate format, but did a certain amount of best-effort validation.

I really don't want to get into the Format Assertion Vocabulary here. I'm not sure many implementations support it, and you need to know how to change the meta-schema to include it.

Co-authored-by: Mike Kistler <[email protected]>
@handrews
Copy link
Member Author

@mikekistler I've accepted the change for "JSON data types" instead of just "types", and the one importing the text from the registry. I think I've answered all of the other questions/suggestions.

ralfhandl
ralfhandl previously approved these changes Aug 23, 2024
@handrews
Copy link
Member Author

@mikekistler here's another pass linking to where the equivalence of things like 1 vs 1.0 are discussed. Let me know what you think. I am avoiding linking to the type section because:

  1. I want to keep the point focused on the data model (with type: integer only needing explanation in the context of that data model)
  2. Everyone knows where to find the spec for type anyway
  3. I find showing these examples (1 and 1.0) makes it more clear to people than wording like "zero fractional part"
  4. I think the section on equivalence helps flesh out the rationale- if 1 is an integer (no one disputes this), and 1 and 1.0 are considered to be equivalent by JSON Schema, then 1.0 must also be an integer because "equivalence" means it has to be treated the same

I tweaked some wording to hopefully focus things a bit better.


<a name="data-type-format"></a>As defined by the [JSON Schema Validation specification](https://tools.ietf.org/html/draft-bhutton-json-schema-validation-00#section-7.3), data types can have an optional modifier keyword: `format`. As described in that specification, `format` is treated as a non-validating annotation by default; the ability to validate `format` varies across implementations.
Data types in the OAS are based on the six JSON data types supported by the [JSON Schema's data model](https://tools.ietf.org/html/draft-bhutton-json-schema-00#section-4.2.1): array, boolean, null, number, object, or string.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I understand that JSON Schema actually defines "integer" as a type, why shouldn't we include it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"integer" is a value for the type keyword. It's not a JSON data type in the data model.

@handrews
Copy link
Member Author

Update: @mikekistler and I have been having a good discussion about whether this section now over-emphasizes JSON Schema's mechanisms compared to OAS behavior, so there is likely to be some rewriting or perhaps another alternative PR.

@miqui
Copy link
Contributor

miqui commented Aug 25, 2024

Update: @mikekistler and I have been having a good discussion about whether this section now over-emphasizes JSON Schema's mechanisms compared to OAS behavior, so there is likely to be some rewriting or perhaps another alternative PR.

@handrews, I am okay with this PR. However, I'll wait for your resolution on the above.

@handrews handrews marked this pull request as draft August 25, 2024 16:21
Copy link
Contributor

@mikekistler mikekistler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my proposal for describing the OAS types but still clarifying that JSON Schema keywords and format values are applied conditionally base on JSON data type.

versions/3.1.1.md Show resolved Hide resolved
versions/3.1.1.md Outdated Show resolved Hide resolved
versions/3.1.1.md Outdated Show resolved Hide resolved
…mt restrictions


Clarifies that OAS data modeling is based on the types in the `type` validation keyword, but JSON Schema keywords and formats apply (or not) based on the JSON instance data model, which does not include integers.

Co-authored-by: Mike Kistler <[email protected]>
@handrews handrews marked this pull request as ready for review August 26, 2024 18:17
Copy link
Member Author

@handrews handrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, @mikekistler ! This is definitely the best yet, and with this comment I'm giving it my approval for merging (GitHub won't let me submit an approval since I originally submitted the PR).

I'm still pondering whether we ought to say anything about type: number vs type: integer, but that's a subset of the larger "how should programming languages handle type system mismatches" question and need not hold this up, I think.

I've taken the PR back out of draft.

mikekistler
mikekistler previously approved these changes Aug 26, 2024
Copy link
Contributor

@mikekistler mikekistler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 👍

versions/3.1.1.md Outdated Show resolved Hide resolved
Copy link
Contributor

@mikekistler mikekistler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. 👍

I flagged a few links that may?? need to be converted to Respec links. I haven't grokked the Respec link stuff yet, so this might be just misunderstanding on my part.

versions/3.1.1.md Show resolved Hide resolved
versions/3.1.1.md Show resolved Hide resolved
versions/3.1.1.md Show resolved Hide resolved
@ralfhandl ralfhandl merged commit f0ebca0 into OAI:v3.1.1-dev Aug 27, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification requests to clarify, but not change, part of the spec
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants