Enhance jsonschema.exceptions.best_match to prefer deeper errors #698

Closed
palemieux opened this issue Jun 26, 2020 · 10 comments
Labels
Enhancement (Some new desired functionality), Error Reporting (Issues related to clearer or more robust validation error reporting), Needs Simplification (An issue which is in need of simplifying the example or issue being demonstrated for diagnosis.)

Comments

@palemieux

Given the schema below, the following instance fails with a single error:

{'schema': 'C', 'kind': {'schema': 'D', 'foo': 2}}: {'schema': 'C', 'kind': {'schema': 'D', 'foo': 2}} is not valid under any of the given schemas

While it is true that the instance matches neither definition B nor definition C, could the validator indicate that the instance matched until it reached /kind/foo? Otherwise, schema validation will always fail on the top-level anyOf, which is not very helpful.

Perhaps this is already possible and I missed it, or there is a way to design the schema differently?

Sample instance

{
  "schema": "C",
  "kind": {
    "schema": "D" ,
    "foo": 2
  }
}

Schema

{
  "$schema": "http://json-schema.org/schema#",
  "$ref": "#/definitions/A",
  "definitions": {
    "A": {
      "anyOf": [
        {
          "$ref": "#/definitions/B"
        },
        {
          "$ref": "#/definitions/C"
        }
      ]
    },
    "B": {
      "type": "object",
      "properties": {
        "schema": { "const": "B" },
        "name": { "type" : "string" },
        "value": { "type" : "number" }
      },
      "required": [ "schema" ],
      "additionalProperties": false
    },
    "C": {
      "type": "object",
      "properties": {
        "schema": { "const": "C" },
        "name": { "type" : "string" },
        "kind": { "$ref": "#/definitions/D" }
      },
      "required": [ "schema" ],
      "additionalProperties": false
    },
    "D": {
      "type": "object",
      "properties": {
        "schema": { "const": "D" },
        "foo": { "type" : "string" }
      },
      "required": [ "schema" ],
      "additionalProperties": false
    }
  }
}
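For what it's worth, the sub-errors of a failed anyOf are not discarded: jsonschema's ValidationError keeps them in its `context` attribute, alongside `message` and `absolute_path`. Below is a minimal sketch of walking that tree to print every nested error with its instance path. To stay runnable without the library it uses a hypothetical stand-in class, `Err`, that only mirrors those three attribute names; the error tree is hand-built to resemble what this instance might produce and was not captured from a real run.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Err:
    """Hypothetical stand-in mirroring three ValidationError attributes."""
    message: str
    absolute_path: Sequence = ()
    context: List["Err"] = field(default_factory=list)


def describe(error, indent=0):
    """Return one line per error, nesting sub-errors under their parent."""
    pointer = "/" + "/".join(str(part) for part in error.absolute_path)
    lines = [f"{'  ' * indent}{pointer}: {error.message}"]
    for child in error.context or ():
        lines.extend(describe(child, indent + 1))
    return lines


# Hand-built tree (an assumption, for illustration only).
report = Err(
    "instance is not valid under any of the given schemas",
    (),
    [
        Err("'B' was expected", ("schema",)),
        Err("Additional properties are not allowed ('kind' was unexpected)"),
        Err("2 is not of type 'string'", ("kind", "foo")),
    ],
)

lines = describe(report)
print("\n".join(lines))
```

With the real library, the same function should accept the errors yielded by `validator.iter_errors(doc)`, since it only reads those three attributes.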
@Julian Julian closed this as completed Jun 26, 2020
@palemieux
Author

Thanks for the feedback.

More detailed example below.

Configuration

"jsonschema": {
    "hashes": [
        "sha256:4e5b3cf8216f577bee9ce139cbe72eca3ea4f292ec60928ff24758ce626cd163",
        "sha256:c8a85b28d377cc7737e46e2d9f2b4f44ee3c0e1deac6bf46ddefc7187d30797a"
    ],
    "index": "pypi",
    "version": "==3.2.0"
}

Schema

{
  "$schema": "http://json-schema.org/schema#",
  "$ref": "#/definitions/A",
  "definitions": {
    "A": {
      "anyOf": [
        {
          "$ref": "#/definitions/B"
        },
        {
          "$ref": "#/definitions/C"
        }
      ]
    },
    "B": {
      "type": "object",
      "properties": {
        "schema": { "const": "B" },
        "name": { "type" : "string" },
        "foo": { "type" : "number" },
        "child" :  { "$ref": "#/definitions/A" }
      },
      "required": [ "schema" ],
      "additionalProperties": false
    },
    "C": {
      "type": "object",
      "properties": {
        "schema": { "const": "C" },
        "name": { "type" : "string" },
        "foo": { "type" : "string" },
        "child" :  { "$ref": "#/definitions/A" }
      },
      "required": [ "schema" ],
      "additionalProperties": false
    }
  }
}

Instance

{
  "schema": "C",
  "child" : {
    "schema": "B",
    "child" : {
      "schema": "C",
      "foo" : 2,
      "child" : {
          "schema": "B",
          "foo" : 2
        }
    }
  }
}

Code

import jsonschema
import json
import argparse

parser = argparse.ArgumentParser(description = 'Validate file against JSON Schema')

parser.add_argument(
    'schema',
    type = argparse.FileType('r'),
    help = 'Path of the Schema'
  )

parser.add_argument(
    'file',
    type = argparse.FileType('r'),
    help = 'Path of the file to validate'
  )

args = parser.parse_args()

schema = json.load(args.schema)

doc = json.load(args.file)

validator = jsonschema.Draft7Validator(schema)

# best_match returns None when the document has no validation errors.
print(jsonschema.exceptions.best_match(validator.iter_errors(doc)))

Actual output

'B' was expected

Failed validating 'const' in schema[0]['properties']['schema']:
    {'const': 'B'}

On instance['schema']:
    'C'

Potential output

2 is not of type 'string'

Failed validating 'type' in schema[0]['properties']['foo']:
    {'type': 'string'}

On instance['foo'] at "/child/child"

@Julian Julian reopened this Jun 26, 2020
@Julian
Member

Julian commented Jul 2, 2020

I haven't gotten a chance to look closely at your last message, but in that new example it seems more likely there's something we may be able to improve (unclear; best_match is always a bit of a heuristic, and one has to be careful that making one example simpler doesn't make another more complex).

From what I see, though, best_match could potentially prefer deeper errors over shallower ones (it already does something like that, which is why I'd have to look at why it didn't do so in this specific case).
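To make "prefer deeper errors" concrete, here is a rough sketch of one such rule: flatten the `context` tree to its leaf errors and pick the one with the longest instance path. This is not how best_match is implemented; it is only an illustration of the idea. `Err` is a hypothetical stand-in that mirrors the `message`, `absolute_path` and `context` attributes of ValidationError, and the sample tree is made up.

```python
from collections import namedtuple

# Hypothetical stand-in; the field names mirror ValidationError attributes.
Err = namedtuple("Err", "message absolute_path context")


def leaves(error):
    """Yield the errors at the bottom of the `context` tree."""
    if not error.context:
        yield error
    else:
        for child in error.context:
            yield from leaves(child)


def deepest(errors):
    """Return the leaf error with the longest instance path, or None."""
    candidates = [leaf for error in errors for leaf in leaves(error)]
    return max(candidates, key=lambda e: len(e.absolute_path), default=None)


# Made-up tree: a shallow `const` failure and a deeper `type` failure.
top = Err(
    "not valid under any of the given schemas",
    (),
    [
        Err("'B' was expected", ("schema",), []),
        Err("2 is not of type 'string'", ("child", "foo"), []),
    ],
)

picked = deepest([top])
print(picked.message)
```

Note that `max` keeps the first candidate when several leaves share the same depth, so a rule like this cannot distinguish between errors at equal depth.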

@Julian Julian added the Enhancement label Jul 16, 2020
@Julian Julian changed the title from "Validation error when using anyOf" to "Enhance jsonschema.exceptions.best_match to prefer deeper errors" Jul 16, 2020
@Julian
Member

Julian commented Oct 31, 2023

A mostly simplified version of this example is:

schema = {
    "anyOf": [
        {"$ref": "#/definitions/B"},
        {"$ref": "#/definitions/C"}
    ],
    "definitions": {
        "B": {
            "properties": {
                "schema": { "const": "B" },
                "child" :  { "$ref": "#" }
            },
        },
        "C": {
            "properties": {
                "schema": { "const": "C" },
                "foo": { "type" : "string" },
                "child" :  { "$ref": "#" }
            },
        }
    }
}

instance = {
    "schema": "B",
    "child" : {
        "schema": "B",
        "child" : {"schema": "C", "foo" : 2}
    }
}

from pprint import pprint
from jsonschema import Draft7Validator, exceptions

Draft7Validator.check_schema(schema)
validator = Draft7Validator(schema)

errors = list(validator.iter_errors(instance))
best = exceptions.best_match(errors)

pprint(
    [
        (
            each.message,
            "#/" + "/".join(map(str, each.absolute_path)),
            "#/" + "/".join(map(str, each.absolute_schema_path)),
        )
        # Hard-coded indices into the nested anyOf contexts for this instance.
        for each in best.context[2].context[0].context
    ]
)

The issue is essentially that neither error really is deeper. They're both at the same level -- specifically the above produces:

[("'B' was expected",
  '#/child/child/schema',
  '#/anyOf/1/properties/child/anyOf/0/properties/child/anyOf/0/properties/schema/const'),
 ("2 is not of type 'string'",
  '#/child/child/foo',
  '#/anyOf/1/properties/child/anyOf/0/properties/child/anyOf/1/properties/foo/type')]

i.e. they're at the same level both in the instance and schema.
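As a quick sanity check of that claim, the sketch below counts path segments in the two reported pairs of pointers. It only does string arithmetic on the output quoted above; `depth` is a hypothetical helper, not part of jsonschema.

```python
def depth(pointer: str) -> int:
    """Count the path segments in a '#/a/b/c'-style pointer."""
    return len([seg for seg in pointer.lstrip("#").split("/") if seg])


# (message, instance pointer, schema pointer), copied from the output above.
reported = [
    (
        "'B' was expected",
        "#/child/child/schema",
        "#/anyOf/1/properties/child/anyOf/0/properties/child"
        "/anyOf/0/properties/schema/const",
    ),
    (
        "2 is not of type 'string'",
        "#/child/child/foo",
        "#/anyOf/1/properties/child/anyOf/0/properties/child"
        "/anyOf/1/properties/foo/type",
    ),
]

depths = [(depth(inst), depth(schema)) for _, inst, schema in reported]
print(depths)  # [(3, 13), (3, 13)]
```

Both errors sit 3 segments deep in the instance and 13 segments deep in the schema, so a ranking based purely on depth has nothing to separate them.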

I'm inclined to close this. As much as I agree the other message would be better for this case, I can't think of a heuristic that would actually get us there. I'm open to suggestions, but what's in the title here isn't it, and we already do take depth into account.

@Julian Julian closed this as not planned Oct 31, 2023