Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

order of oneOf subschemas changes type coercion result #437

Closed
ryanmeador opened this issue Mar 14, 2017 · 9 comments
Closed

order of oneOf subschemas changes type coercion result #437

ryanmeador opened this issue Mar 14, 2017 · 9 comments
Labels

Comments

@ryanmeador
Copy link
Contributor

What version of Ajv are you using? Does the issue happen if you use the latest version?
5.0.3-beta.0

Ajv options object (see https://github.com/epoberezkin/ajv#options):

{
  coerceTypes: true
}

JSON Schema (please make it as small as possible to reproduce the issue):

{
  type: "object",
  additionalProperties: false,
  properties: {
    "test": {
      oneOf: [{
        type: "number"
      },
      {
        type: "string",
        format: "email"
      }]
    }
  }
}

Data (please make it as small as posssible to reproduce the issue):

{
  test: "10"
}

Your code (please use options, schema and data as variables):

const Ajv = require('ajv');

const options = {
  coerceTypes: true
};

const schema = {
  type: "object",
  additionalProperties: false,
  properties: {
    "test": {
      oneOf: [{
        type: "number"
      },
      {
        type: "string",
        format: "email"
      }]
    }
  }
};

const data = {
  test: "10"
};

const ajv = new Ajv(options);

const validated = ajv.validate(schema, data);

console.log(`Validated? ${validated}`);
console.log(data);

Validation result, data AFTER validation, error messages:

Validated? true
{ test: '10' }

What results did you expect?
I expected the type of the test property to have been coerced to a number, since it was coercion to that type that allowed the validation to pass (note that if you set coerceTypes to false it fails). If you rearrange the two items in the oneOf section of the schema so that the one requiring it to be a number is last, it will correctly coerce the type to a number. The docs explicitly mention anyOf and to my reading indicate this should work, so I'm assuming that oneOf should work too. I haven't tested a variant of this with anyOf because I am having difficulty of reformulating the problem to use that keyword.

Are you going to resolve the issue?
I am willing to put some time into fixin this, but I will need a big hint as to how. This codebase is very complex.

@epoberezkin epoberezkin added the OK label Mar 14, 2017
@epoberezkin
Copy link
Member

epoberezkin commented Mar 14, 2017

@ryanmeador it is expected. oneOf to be valid has to have only one subschema valid, so it has to always validate all of them, and if there is coercion that can happen to make type in the second one valid this coercion will happen, as in your case. Instead you should be using anyOf that is valid if any of the subschemas valid and it always stops validation as soon as it finds such subschema.

In general it is always better to use anyOf (an equivalent of boolean OR) unless you need the additional restriction that oneOf (an equivalent of boolean XOR) imposes and more than one of your schemas can be valid at the same time.

@ryanmeador
Copy link
Contributor Author

ryanmeador commented Mar 14, 2017

Thank you for your swift reply. I'm not sure I understand why this is expected. "10" is not a valid email address, so only one of the oneOf items should succeed. Thus, according to the docs, it is the number entry that passes, and the coercion to number should happen. Is it not fair to restate the method of applying coercion as this: after being successfully validated with coerceTypes == true, data will be left in a state such that it would successfully validate even if coerceTypes == false? As it stands, the data I'm left with is invalid per the schema.

@epoberezkin
Copy link
Member

epoberezkin commented Mar 14, 2017

Is it not fair to restate the method of applying coercion as this: after being successfully validated with coerceTypes == true, data will be left in a state such that it would successfully validate even if coerceTypes == false? As it stands, the data I'm left with is invalid per the schema.

In most cases it is correct, but such guarantee in ALL cases is out of scope of this feature - it requires a complex rollback logic based on remembering in which state the data was before coercion and also understanding what should/shouldn't be rolled back. At the moment coercion is very simple and local and it has no memory of the original state. Given that the problem is easily solved by switching from oneOf to anyOf, I don't think some edge cases when the requirement you state is not met are important enough to justify this complexity.

@ryanmeador
Copy link
Contributor Author

Now that I have a better understanding of the complexity, I agree it isn't a good use of time to prioritize fixing this :) It probably would be worth a quick blurb in the docs to explain the limitation and the workaround, though. If I can come up with something concise I'll submit a PR. Thanks again.

@epoberezkin
Copy link
Member

Related issues:
#395 (justification for coercion matrix and some discussion on rollback logic)
#399 (using custom keywords for coercion)

@epoberezkin
Copy link
Member

+1 for PR updating docs - docs is a huge time-sink

@epoberezkin
Copy link
Member

Is it not fair to restate the method of applying coercion as this: after being successfully validated with coerceTypes == true, data will be left in a state such that it would successfully validate even if coerceTypes == false? As it stands, the data I'm left with is invalid per the schema.

@ryanmeador Another simple example when this statement cannot be true, regardless the level of extra complexity we introduce:

schema:

{
  "allOf": [
    { "type": "boolean" },
    { "type": "number" }
  ]
}

data: true

The validation will succeed with coerceTypes: true, but the data won't be left in the state when it is valid without coercion (and for this schema no such data exists at all).

In general, using this option, we mix two concerns - validation and transformation and that's where the challenges comes from. While for pure validation the order of subschemas shouldn't matter (and it doesn't) for transformation that mutates data it does, so some deeper understanding of the process is required when writing the schemas: "oneOf" should generally be avoided; instead "anyOf" or "if/then/else" (proposed for draft-07, available as part of ajv-keywords package), should be used, that have more predictable outcomes and better error reporting.

@ryanmeador
Copy link
Contributor Author

Hmm, interesting example. In an ideal world, I would the think the procedure would be to try validating without any coercion and only coerce if that fails. Then it should pick a coercion and try to validate the entire thing again with that coercion, if that fails, try the next possible coercion, etc. This avoids the problem of the same value being interpreted multiple ways where convenient, and I think the output is guaranteed to validate without type coercion. It's certainly a lot more complex.

@epoberezkin
Copy link
Member

It's certainly a lot more complex.

exactly. And don't forget it should all work with references, including recursive references, plus there can be multiple coercions, so a problem that you describe is really "finding one set of all coercions for the data instance that would make the data valid according to the schema" - a much more complex problem than the one currently solved. I think solving this problem is out of scope of what Ajv does (validation according to spec), and it's rather theoretic than practical problem - most real world issues can be solved by modifying schemas (from this point of view the sample schema above doesn't make much sense). Same considerations apply to using defaults in schema and here as well Ajv is opting to implement some relatively simple logic instead of complex rollback scenarios (as some validators do, but then they fail some tests in the standard suite and many Ajv tests...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants