Vocabularies and "format" #563
Formats usually serve the dual purpose of a programming-language-independent code-generation hint for the recipient and a production/validation instruction for the sender or an intermediate validator, so pulling Allowing vocabularies to define specific It would also remove the single point of reference for looking up existing formats and their meaning, unless that vocabulary mechanism is accompanied by a central "registry" or "repository" for format definitions. The "endless stream of requests for standardized formats" just shows how important interoperability is.
Agreed. I think for a central list, we should look to the IANA registry model. For now, we'll keep a small standard set in one of the major specification drafts. But if vocabularies are identified by URIs, and we define some clear way for a vocabulary to define format values (and similar extensible value sets), then we can probably use vocabulary URIs with fragments to completely identify a format term.
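As a rough illustration of identifying a format term by a vocabulary URI plus a fragment, here is a minimal Python sketch. The vocabulary URI `https://example.com/vocab/formats` and both helper names are hypothetical; nothing here is defined by the spec.

```python
from urllib.parse import urldefrag

# Hypothetical vocabulary URI, used only for illustration.
VOCAB = "https://example.com/vocab/formats"


def format_term_uri(vocab_uri, format_name):
    """Build a full identifier for a format term: vocabulary URI + fragment."""
    return f"{vocab_uri}#{format_name}"


def split_format_term(term_uri):
    """Split an identifier back into (vocabulary URI, format name)."""
    url, fragment = urldefrag(term_uri)
    return (url, fragment)


term = format_term_uri(VOCAB, "date-time")
print(term)
print(split_format_term(term))
```

Under this assumption, two vocabularies could define the same short format name without colliding, because the full identifier includes the vocabulary URI.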
Sounds like a plan: 👍
The link relations IANA registry has a nice model, either:
Either form can be used by anyone.
If valid values of At the very least, the standard should specify that unimplemented formats should cause a validation error. Otherwise, consumers will get false positives, which contradicts the purpose of using schemas.
@reitzig many formats are very expensive to validate, or even impossible to validate in a guaranteed way. The point of this issue is to allow people to say "Please only attempt to process this schema if you understand formats X, Y, and Z." That's not possible now, but will be in the future. That will allow those who are happy with best-effort (some people don't expect validation and just want the format to be shown as documentation anyway) to continue to use things as-is, while those who want strict conformance and fail-fast can guarantee that.
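To make the "only process this schema if you understand formats X, Y, and Z" idea concrete, here is a minimal Python sketch of the two modes. The set of supported formats, the exception class, and the function names are all assumptions made for illustration, not part of any draft or library.

```python
# Formats this hypothetical implementation can actually validate.
SUPPORTED_FORMATS = {"date-time", "email"}


class UnsupportedFormatError(Exception):
    """Raised when a schema requires a format the implementation lacks."""


def check_required_formats(required, supported=SUPPORTED_FORMATS):
    """Fail fast if any required format is not supported."""
    missing = sorted(set(required) - set(supported))
    if missing:
        raise UnsupportedFormatError(
            "cannot process schema; unsupported formats: " + ", ".join(missing)
        )


def can_process(required, strict):
    """Best-effort mode always proceeds; strict mode refuses on missing formats."""
    if not strict:
        return True
    try:
        check_required_formats(required)
    except UnsupportedFormatError:
        return False
    return True


print(can_process(["date-time", "uri"], strict=False))
print(can_process(["date-time", "uri"], strict=True))
```

In best-effort mode the unsupported `uri` format is treated as documentation only, while strict mode refuses to process the schema at all.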
I can see that using Still, inconsistent behaviour concerns me. I don't want tests to fail if runtime is fine, or the other way around, or some clients accept bad replies while others don't. In my mind, it therefore makes more sense to require that That's a very pessimistic perspective, of course. Systems that use only a single validation library are completely fine with an "intermediate" situation.
With PR #671, we now have the concept of a formal vocabulary in the spec. That's a really complex PR, so here is the TL;DR (which is still pretty long, sorry):
There's more to the PR, but these are the key points for what I want to say about What does this have to do with
Just to clarify, implementations should still ignore unrecognized keywords (i.e. keywords not defined by any of the vocabularies), correct?
Correct, this is covered in the PR, at least I think it is. When in doubt, the PR takes precedence. I just know few people will slog through the PR (thanks again for doing that, btw!), so I summarized it a bit here.
Do I understand correctly that for declaring that validation requires support for certain keywords for a certain (set of) schema(s), I have to write a new meta schema? That strikes me as odd. The "have to" part, that is. It makes sense if I have a large number of similar schemas and the infrastructure to make a meta-schema available. However, in small use cases -- say I have a schema describing my log format which I use in automated tests -- it would be much more convenient to extend the vocabulary of a standard meta-schema in the schema. Regarding formats, a simple convention could be to have one (dummy) URL per format, for example
Furthermore, I'm thinking about a vocabulary that triggers a strict mode where we require support for all used keywords (imho a very natural choice):
(Is there something like a strict mode in the spec?)
@reitzig unless you've declared new keywords not already defined by the draft schema, then, no, you don't need to write additional meta-schemas. As @handrews said, if you can come up with a better, complete option that covers all of the same scenarios, please write it up (with examples). Right now, this is where we are. This issue has been open for 8 months now, and I'm sure he's been working on the idea longer than that. Defining each format in a separate vocabulary seems inefficient. In his Requiring Specific Formats section, he states that vocabularies could redefine certain formats to make validation required. I'm sure we can figure out a way to make them validation forbidden to support the case where a supplied format is not wanted.
@reitzig Meta-schemas are how implementations are told what behavior to apply, so yes, if you want to change the behavior, you have to change/write a new meta-schema. This is unlikely to be changed because:
In case it was not clear from my wording above: I'm not satisfied with any current solution to this problem, including the ones I outlined earlier. I think that the correct solution involves having a real vocabulary definition format, which is not going to make the cut for draft-08. I posted this to make sure no one had a better idea before going ahead with punting this to draft-09 (technically it was never in the draft-08 milestone, but I had hoped to pull it in).
The solution I proposed would allow that.
That's too magical of a behavior for something that is really a very narrow problem. Most keywords are either supported or not,
As we've worked with the concept of vocabularies and further tried to figure out what to do with For example, a date-time vocabulary could provide several keywords replacing the We're not going to drop Therefore, I'm closing this as irrelevant due to a change in direction. For the current status of
@handrews is there any library that supports
We have numerous requests for additional formats, or ideas that could be implemented with formats (see #152, #312, json-schema-org/json-schema-vocabularies#45, json-schema-org/json-schema-vocabularies#49, #542).
There have also been a number of discussions around the implementation requirements, particularly
which becomes more burdensome as we add more formats. One idea has been to say that you only need to implement certain subsets (e.g. you could implement the date and time formats but skip the URI/IRI formats, but you shouldn't just implement uri-reference and not uri). But there's no good way to convey support levels. Which brings us to:
which makes interoperability very challenging.
`format` is really only reliably useful as a semantic annotation, not as validation.

We need a better story here, if for no other reason than to figure out how to manage the endless stream of requests for standardized formats. Currently, that's the only way to achieve any level of interoperability, so there is a high motivation to push for inclusion.
If we go with vocabulary support along the general lines of #561, I feel like this should help manage the variations. A vocabulary could include specific `format` values (and likewise for `contentType` and `contentEncoding`, I suppose).

I've split this out from #561 because it's not clear how it would work. Using a meta-schema, as #561 proposes, doesn't work well for `format` values because `enum` is difficult to combine. The typical `allOf` used to compose vocabularies produces the intersection of `enum`s rather than the union. But `anyOf`, which would work, would often produce unexpected behaviors otherwise.

Some other mechanism may be required. Therefore this is filed in the future milestone as a question. If we come up with a broadly supported proposal quickly, it can be added to draft-08, but otherwise it's fine as a follow-on in a later draft.
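The allOf/anyOf point about enum values can be shown with plain sets. This is a minimal Python sketch that models each vocabulary's enum of format names as a list; the split into `vocab_a` and `vocab_b` is a made-up example, and the functions model only the enum behavior, not a full validator.

```python
def allowed_by_all_of(enums):
    """A value passes allOf only if every enum contains it (intersection)."""
    result = set(enums[0])
    for values in enums[1:]:
        result &= set(values)
    return result


def allowed_by_any_of(enums):
    """A value passes anyOf if at least one enum contains it (union)."""
    result = set()
    for values in enums:
        result |= set(values)
    return result


# Hypothetical format lists contributed by two vocabularies.
vocab_a = ["date-time", "email"]
vocab_b = ["uri", "email"]

print(sorted(allowed_by_all_of([vocab_a, vocab_b])))
print(sorted(allowed_by_any_of([vocab_a, vocab_b])))
```

Composing with allOf leaves only the formats both vocabularies list, which is the opposite of what a vocabulary adding new formats would want.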