proposal to add keyword metadata to vocabularies by defining a new vocabulary #1257
Shouldn't new vocabularies be prototyped in https://github.com/json-schema-org/json-schema-vocabularies first?
Probably, but not for this. That is for extension vocabs. This is intended to be integrated into the spec as it proposes modifications to the spec's meta-schemas. From the readme:
Why should it be in the spec rather than as an extension?
Because it's intended to augment the specification's meta-schemas. Note how the spec's meta-schemas now declare this new vocab in the
I'll have to look at this in more detail later, but here's my thoughts after a quick look. Knowing applicators is useful because it gives us a way to know what is a schema, but the rest of the information I can't see any real use for at the moment other than documentation. It occurs to me that this would lock down applicators to only being object-value, array-value, or in-place. We don't have any other examples and we've avoided keywords with more complex structure than that, but do we want to effectively disallow them?
It may not be useful to you, but for me, it's useful knowing what type an annotation is expected to be.
I don't understand. An applicator can look at the current instance (in-place) or the child of an object or array. There's not really anything else given the JSON data model. I can see where you may want to look at a combination of these, but I don't see any new options.
I didn't mean to imply that wasn't useful. I just meant I hadn't had a chance to think about it enough to see how it was useful. Please do share your vision of how you would use those values.
I'm referring to a keyword that has (for lack of a better term) sub-keywords. I don't have time to come up with a good example, so hopefully a bad one will get the point across for now.
This is an applicator that is an object, but only one of its properties is a schema and the other is a number. Like I said, we don't have any keywords like this and we avoid doing things like this, but do we want to forbid this? Maybe we do. I'm not presenting an opinion at this point, just showing the example.
@jdesrosiers I think for that example, But a keyword I've considered having a crack at a vocabulary for, which would fall outside of the ones described here, would be one that applies a schema to an arbitrary location identified by pointer, something like:
{
  "descendentsByPointer": {
    "/foo/bar": {"type": "object"},
    "/baz/name": {"type": "string"}
  }
}
This would be an in-place applicator (for an empty pointer), a child applicator, or an applicator for an arbitrary descendant. I'm not suggesting this vocabulary needs to address my half-baked unimplemented keyword idea, just throwing out thoughts.
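To make the pointer-based idea concrete, here is a minimal sketch of how such a keyword might locate the instance values its subschemas apply to. It only resolves locations and does not validate anything. The function names `resolve_pointer` and `pointer_targets` are hypothetical, and skipping locations that are absent from the instance is an assumption, since the comment above does not say what should happen in that case.

```python
def resolve_pointer(instance, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON value."""
    if pointer == "":
        return instance  # the empty pointer selects the instance itself (in-place)
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    current = instance
    for token in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            if not (token.isascii() and token.isdigit()):
                raise KeyError(token)
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(token)
    return current


def pointer_targets(keyword_value, instance):
    """Pair each subschema with the instance value its pointer selects."""
    found = {}
    for pointer, subschema in keyword_value.items():
        try:
            found[pointer] = (resolve_pointer(instance, pointer), subschema)
        except LookupError:
            continue  # assumption: locations missing from the instance are skipped
    return found


keyword_value = {
    "/foo/bar": {"type": "object"},
    "/baz/name": {"type": "string"},
}
instance = {"foo": {"bar": {}}, "baz": {"name": "example"}}
targets = pointer_targets(keyword_value, instance)
```

A pointer of "" would make the keyword behave as an in-place applicator, a single-token pointer as a child applicator, and anything longer as an applicator for an arbitrary descendant, which is why it does not fit the three kinds in this proposal.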
More broadly (than my reply above), I'm quite interested in this. It would help centralize metadata for each keyword, information that each implementation currently has to implement in code, to some degree. (At least, implementing unevaluated to rely on any inplace applicator, any implementation would have to code some form of inplace-applicator metadata.)
I'm uncertain about this. It seems maybe outside the scope of the metaschema. I would say the metaschema's purpose is describing the structure and validity of schema documents, whereas this has to do with the operations of processing the schema. This would significantly broaden the ideas of what the metaschema is for. Did you experiment with that idea of a separate document and end up finding that it worked better in the metaschema?
I questioned this at first and tried to find a structuring that would put the information on subschemas describing each keyword, which seemed nicer. But you are right, it doesn't fit well there.
When we had previously discussed this metadata, we had explored options for the separate file and came up with nothing that really worked well. This is the first thing I've seen that seems to accomplish this goal. EDIT: Unfortunately, I think a lot of that conversation is buried in history between myself and @handrews as he and I played with several ideas.
@gregsdennis I finally remembered to take a look at this 😅 There are a lot of great ideas in here. A fair amount of this overlaps with the (at least) three keyword description formats I've come up with since we originally talked about having a separate file. (You haven't seen any of them because I haven't felt any of them were compelling enough to be worth proposing). However, I'm skeptical that a vocabulary of JSON Schema keywords is the right way to accomplish this, even if we want to inline such descriptions into meta-schemas. There are a couple of reasons for this:
I'm not sure I follow this. Are you talking about how it's not included in the main The vocabulary vocabulary defines keywords that only serve to provide details about the keywords another vocabulary defines.
Yes, I wrote that before you shared your slides. The core keywords have behavior, but that behavior is toward the schema rather than toward the instance. This proposal could (and probably should) be expanded to encompass your analysis. But for now, let's consider it a subset and evaluate it as such.
None of this makes sense to me. I think you think this proposal is doing more than is intended.
OK, but how does an implementation figure out how to read the new keywords in, say,
This data is still mostly informational. I don't expect an implementation to be able to figure out what to do with an applicator or assertion without some additional coding, just like today. I don't think we'll ever get to the point where an implementation can "just know" what to do with a new keyword that's not a pure annotation. If this isn't what you're talking about, perhaps you can elaborate on what you mean by "take action." What action would you expect from an implementation?
meta/applicator.json
@@ -2,7 +2,8 @@
   "$schema": "https://json-schema.org/draft/next/schema",
   "$id": "https://json-schema.org/draft/next/meta/applicator",
   "$vocabulary": {
-    "https://json-schema.org/draft/next/vocab/applicator": true
+    "https://json-schema.org/draft/next/vocab/applicator": true,
+    "https://json-schema.org/draft/next/vocab/vocabulary": false
This says that schemas using https://json-schema.org/draft/next/meta/applicator as a meta-schema are able to use the vocabulary vocabulary, which I don't think is what you mean. You need this meta-schema's meta-schema (currently declared as https://json-schema.org/draft/next/schema) to include the vocabulary vocabulary in $vocabulary (buffalo buffalo buffalo buffalo buffalo...). In which case it probably would not be the default meta-schema because most schemas won't need to (and shouldn't) use the vocabulary vocabulary.
You need this meta-schema's meta-schema (currently declared as https://json-schema.org/draft/next/schema) to include the vocabulary vocabulary in $vocabulary
But if that's just https://json-schema.org/draft/next/schema, it doesn't solve the problem you're describing.
What I want is for meta-schemas that describe vocabularies to be able to use these new keywords. Does that mean that we need a dedicated vocabulary-describing meta-schema? Then have (e.g.) the applicator meta-schema reference that one in $schema?
So
{
  "$schema": "https://json-schema.org/draft/next/schema",
  "$id": "https://json-schema.org/draft/next/meta/vocabulary",
  "$vocabulary": {
    "https://json-schema.org/draft/next/vocab/core": true,
    "https://json-schema.org/draft/next/vocab/applicator": true,
    "https://json-schema.org/draft/next/vocab/unevaluated": true,
    "https://json-schema.org/draft/next/vocab/validation": true,
    "https://json-schema.org/draft/next/vocab/meta-data": true,
    "https://json-schema.org/draft/next/vocab/format-annotation": true,
    "https://json-schema.org/draft/next/vocab/content": true,
    "https://json-schema.org/draft/next/vocab/vocabulary": true
  },
  "$ref": "https://json-schema.org/draft/next/schema",
  ... // new vocab-vocab properties
}
Then https://json-schema.org/draft/next/meta/applicator and friends all have $schema: https://json-schema.org/draft/next/meta/vocabulary?
I think this also provides a pre-packaged meta-schema for other custom vocab meta-schemas to use.
But if it has $schema: https://json-schema.org/draft/next/schema then it can't itself use the keywords it defines. So does it need to be its own meta-schema? That line of logic leads us to needing to resolve "Keyword for identifying bootstrapping rules" (#217) first.
$vocabulary is (effectively) an annotation. So when you use it in a meta-schema, it annotates the (non-meta-)schema and says "this schema can use this vocabulary".
If this were not the case, then every non-meta-schema would have to declare $vocabulary, which would be a mess.
So yes, if you want other vocab meta-schemas to use the vocabulary vocabulary (VV), then the VV needs to be in their meta-schema's $vocabulary. Since we would not want to put the VV in the default meta-schema, yes that means they would need a different meta-schema.
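The lookup described here can be sketched in a few lines: the vocabularies a document may use come from the $vocabulary of the meta-schema named in its $schema, not from its own $vocabulary. This is only an illustration under stated assumptions. `REGISTRY` is a hypothetical in-memory store with heavily trimmed meta-schemas, and `usable_vocabularies` is a made-up helper name.

```python
CORE = "https://json-schema.org/draft/next/vocab/core"
APPLICATOR = "https://json-schema.org/draft/next/vocab/applicator"
VV = "https://json-schema.org/draft/next/vocab/vocabulary"

DEFAULT_META = "https://json-schema.org/draft/next/schema"
VOCAB_META = "https://json-schema.org/draft/next/meta/vocabulary"

# Hypothetical registry of meta-schemas by $id, trimmed to the parts used here.
REGISTRY = {
    DEFAULT_META: {
        "$id": DEFAULT_META,
        "$vocabulary": {CORE: True, APPLICATOR: True},
    },
    VOCAB_META: {
        "$schema": DEFAULT_META,
        "$id": VOCAB_META,
        "$vocabulary": {CORE: True, APPLICATOR: True, VV: True},
    },
}


def usable_vocabularies(document, registry):
    """Return the vocabulary URIs a document may use, per its meta-schema."""
    meta_schema = registry[document["$schema"]]
    return set(meta_schema.get("$vocabulary", {}))


# The applicator meta-schema as in the diff: its own $vocabulary entry for the
# VV describes schemas that use it, so it does not help the meta-schema itself.
applicator_as_in_diff = {
    "$schema": DEFAULT_META,
    "$vocabulary": {APPLICATOR: True, VV: False},
}
# The alternative discussed above: point $schema at a vocabulary-describing
# meta-schema that lists the VV.
applicator_alternative = {
    "$schema": VOCAB_META,
    "$vocabulary": {APPLICATOR: True},
}

can_use_vv_before = VV in usable_vocabularies(applicator_as_in_diff, REGISTRY)
can_use_vv_after = VV in usable_vocabularies(applicator_alternative, REGISTRY)
```

Under these assumptions only the second arrangement lets the applicator meta-schema use the new keywords, which matches the conclusion that a different meta-schema is needed.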
This data is still mostly informational. I don't expect an implementation to be able to figure out what to do with an applicator or assertion without some additional coding, just like today. I don't think we'll ever get to the point where an implementation can "just know" what to do with a new keyword that's not a pure annotation.
If this isn't what you're talking about, perhaps you can elaborate on what you mean by "take action." What action would you expect from an implementation?
At minimum, we need to be able to use any vocabulary description to automatically determine what keywords are part of that vocabulary, which will let us distinguish known keywords from optional (and not directly-supported) vocabularies from completely unknown keywords. Otherwise, it's just JSON-formatted documentation, and that doesn't seem useful to me.
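A rough sketch of that minimum, assuming a vocabulary description carries the three keywords from this proposal (`applicators`, `assertions`, `annotations`): collect every keyword the description names, then sort a schema's keywords into supported, known but unsupported, and unknown. The `https://example.com/vocab/geo` vocabulary, its keywords, and the helper names are invented for illustration.

```python
def vocabulary_keywords(description):
    """Gather every keyword named by a vocabulary description."""
    keywords = set(description.get("applicators", {}))   # object keyed by keyword
    keywords |= set(description.get("assertions", []))   # array of keyword names
    keywords |= set(description.get("annotations", {}))  # object keyed by keyword
    return keywords


def classify_keywords(schema, descriptions, supported):
    """Label each keyword in `schema` by how much is known about it.

    descriptions: vocabulary URI -> description object
    supported: vocabulary URIs this (hypothetical) implementation has code for
    """
    owners = {}
    for uri, description in descriptions.items():
        for keyword in vocabulary_keywords(description):
            owners.setdefault(keyword, uri)

    labels = {}
    for keyword in schema:
        uri = owners.get(keyword)
        if uri is None:
            labels[keyword] = "unknown"
        elif uri in supported:
            labels[keyword] = "supported"
        else:
            labels[keyword] = "known but unsupported"
    return labels


VALIDATION = "https://json-schema.org/draft/next/vocab/validation"
GEO = "https://example.com/vocab/geo"  # hypothetical optional vocabulary

descriptions = {
    VALIDATION: {"assertions": ["minimum", "required"]},
    GEO: {
        "assertions": ["withinRadius"],
        "annotations": {"units": {"kind": "producer"}},
    },
}
labels = classify_keywords(
    {"minimum": 0, "withinRadius": 5, "frobnicate": True},
    descriptions,
    supported={VALIDATION},
)
```

Here `minimum` is labelled supported, `withinRadius` known but unsupported, and `frobnicate` unknown, which is the three-way distinction asked for above.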
So yes, if you want other vocab meta-schemas to use the vocabulary vocabulary (VV), then the VV needs to be in their meta-schema's $vocabulary. Since we would not want to put the VV in the default meta-schema, yes that means they would need a different meta-schema.
I think this means that you agree with my commented approach.
@gregsdennis yes, I think so (I think I got confused by the next comment after that)
This is an interesting idea, but I think it still needs work. It was good to get it down and have a chat about it.
This will likely be of most interest to @handrews and maybe @jdesrosiers.
Since the introduction of vocabularies, the idea of them being self-descriptive has been tossed around with abstract ideas but not much in the concrete space. This is a proposal that provides some of that meta-data that we've been looking for via a new vocabulary specifically designed for vocabulary meta-schemas.
This proposal adds three new keywords:
applicators
The value of this keyword is an object whose keys are all of the keywords defined in the vocab which have applicator behavior.
The value of each property is the kind of applicator that keyword is: objectChild, arrayChild, or inPlace.
This references #602, in which @handrews classified applicators as being "object-child" (meaning they look at the children of an object instance), "array-child" (they look at the children of an array instance), or "in-place" (they look at the instance itself).
assertions
The value of this keyword is an array which contains the names of all keywords defined in the vocab which provide assertion behavior.
annotations
The value of this keyword is an object whose keys are all of the keywords defined in the vocab which either produce annotations or collect them from subschemas.
The value of each property is an object with two properties:
kind identifies whether the keyword produces annotations, collects them, or both. The value is producer, collector, or an array of these values (much like the array format we use for type). This property is required.
producedAnnotation provides a schema for the annotation value if the keyword is an annotation producer. This property is optional. If the keyword is an annotation producer and this property is missing, then the annotation produced by the keyword is expected to be the value of the keyword itself (e.g. title produces an annotation equal to its value).
I've also updated all of the existing meta-schemas to use this new vocab so that you can see how it works. Interestingly [nods to @jdesrosiers], the core keywords don't have any behavior. They are informative only and are self-descriptive rather than instance-descriptive.
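As a rough illustration of the three keywords described above, here is what a description might look like once parsed, together with two small helpers for reading an `annotations` entry. The entries are illustrative guesses rather than copies of the meta-schemas in this PR, and `annotation_kinds` and `annotation_is_keyword_value` are hypothetical helper names.

```python
# Illustrative description mixing keywords from several vocabularies.
description = {
    "applicators": {
        "properties": "objectChild",
        "prefixItems": "arrayChild",
        "allOf": "inPlace",
    },
    "assertions": ["required", "minimum"],
    "annotations": {
        # No producedAnnotation: the annotation is the keyword's own value.
        "title": {"kind": "producer"},
        # Assumed shape of the annotation produced by `properties`.
        "properties": {
            "kind": ["producer", "collector"],
            "producedAnnotation": {"type": "array", "items": {"type": "string"}},
        },
    },
}


def annotation_kinds(entry):
    """Normalize `kind` to a list, since it may be a string or an array."""
    kind = entry["kind"]  # `kind` is required by the proposal
    return [kind] if isinstance(kind, str) else list(kind)


def annotation_is_keyword_value(entry):
    """True when a producer omits producedAnnotation, so its annotation is
    expected to equal the keyword's value."""
    return "producer" in annotation_kinds(entry) and "producedAnnotation" not in entry


title_entry = description["annotations"]["title"]
properties_entry = description["annotations"]["properties"]
```

Normalizing `kind` up front mirrors how implementations usually treat the string-or-array form of type.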
I plan on writing up an actual vocabulary for this, but I wanted to get what I had in front of people before I put in that effort.
I know that previous discussions on doing something like this had wanted the vocab URI to actually function as a URL to point to some separate machine-readable document, but I rather like the idea of a vocab-describing vocab and having everything just embedded directly into the meta-schemas.
I think that these keywords go a long way toward building a generic validator. For instance, unevaluated* depends on annotation results from in-place applicators. With the definition provided in this proposal, a validator can now know what all of the in-place applicators are just from knowing the vocabularies. The implementation of unevaluated* doesn't need to change to also know to wait for the results from new in-place applicator keywords defined by extra vocabularies. (This is something that my implementation can't currently handle. I'd need to have code for new in-place applicators specifically, which is really awkward.)
That brings me to the other thing that could be included that I haven't added yet: annotation dependencies. For example, additionalProperties depends upon the annotation result from properties and patternProperties.
I had started with an alternate design of adding this information to the subschemas under the properties keyword, but I realized that doesn't really work for the same reason that required was moved out of the subschema into the parent schema going from draft 3 to draft 4.
I think introducing these aggregate keywords is more in line with what we already have, even though it can feel somewhat redundant to list the keywords multiple times.
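The unevaluated* point can be sketched as a lookup over whatever vocabulary descriptions are in play. This is a minimal illustration, not how any particular implementation works. The `https://example.com/vocab/extras` vocabulary and its `exactlyOneOf` keyword are invented, and the applicator entries are a trimmed guess at what the applicator meta-schema might declare.

```python
def in_place_applicators(descriptions):
    """Collect every keyword declared as an inPlace applicator."""
    return {
        keyword
        for description in descriptions.values()
        for keyword, kind in description.get("applicators", {}).items()
        if kind == "inPlace"
    }


descriptions = {
    "https://json-schema.org/draft/next/vocab/applicator": {
        "applicators": {
            "allOf": "inPlace",
            "anyOf": "inPlace",
            "properties": "objectChild",
            "prefixItems": "arrayChild",
        }
    },
    # Hypothetical extension vocabulary adding a new in-place applicator.
    "https://example.com/vocab/extras": {
        "applicators": {"exactlyOneOf": "inPlace"}
    },
}

# An unevaluated* implementation could wait on results from exactly these
# keywords without special-casing the extension vocabulary.
wait_for = sorted(in_place_applicators(descriptions))
```

With both descriptions loaded, `wait_for` contains `allOf`, `anyOf`, and `exactlyOneOf`, so the new keyword is picked up from data rather than from code written for it.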