Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Schema Generation Yields Undesired Allowed Values #240

Closed
aj-stein-nist opened this issue Sep 22, 2022 · 16 comments · Fixed by #253
Closed

JSON Schema Generation Yields Undesired Allowed Values #240

aj-stein-nist opened this issue Sep 22, 2022 · 16 comments · Fixed by #253
Assignees
Labels
bug Something isn't working XSLT Implementation Issue relates to the XSLT implementation of Metaschema.

Comments

@aj-stein-nist
Copy link
Collaborator

aj-stein-nist commented Sep 22, 2022

Describe the bug

Currently, the JSON Schema generator, for metaschema definitions with allowed-values where allow-other="no", where its target does not have a predicate (target=".") or the target is not the default OSCAL namespace (for any target that is not target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]"), the JSON generator should not emit the allowed values as part of the JSON Schema. If allow-other="yes", then the current enum style of JSON Schema validation, when needed in such cases like it, is not correctly implemented.

It currently gets emitted like this:

              { "type" : "string" } },
            "required" : 
            [ "type" ],
            "additionalProperties" : false,
            "enum" : 
            [ "equivalent-to",
              "equal-to",
              "subset-of",
              "superset-of",
              "intersects-with" ] }

We need it to be adapted to emit JSON Schema like this:

              "type" : 
              { "anyOf":
              [
                { "type": "string"},
                { 
                  "enum" : [
                    "equivalent-to",
                    "equal-to",
                    "subset-of",
                    "superset-of",
                    "intersects-with" 
                  ]
                }
              ] } }

Who is the bug affecting?

OSCAL tool developers want to convert mappings from OSCAL XML to OSCAL JSON and must have the latter validated with JSON Schema after the fact.

What is affected by this bug?

Schema validation with JSON schemas generated from the scheme generator(s) in this repo.

When does this occur?

Consistently.

How do we replicate the issue?

When debugging issues with use of NIST OSCAL automation from community stakeholders in CIS, we were asked to troubleshoot and debug issues validating the emitted JSON file via ajv -c ajv-formats -s oscal_mapping_schema.json -d CCM_CISControlsv8_Mapping-min.json after converting the mapping in OSCAL XML to OSCAL JSON in CI/CD with GitHub Actions. An example run can be found here with current commit aj-stein-nist/CISControls_OSCAL@cd22c23#diff-b7c9b89f382154d95ec71f7a04dd1b62a11804cd8a3ed0c2d908ad97fba7e66e.

The following errors for this seemingly valid OSCAL JSON document instance are below, despite all the correct values being defined for the given fields and assemblies in the document.

/home/runner/work/CISControls_OSCAL/CISControls_OSCAL/git-content/mappings/json/CCM_CISControlsv8_Mapping-min.json invalid
[
  {
    keyword: 'enum',
    dataPath: '/mapping-collection/mappings/maps/0/relationship',
    schemaPath: '#/properties/relationship/enum',
    params: { allowedValues: [Array] },
    message: 'should be equal to one of the allowed values'
  },
  {
    keyword: 'type',
    dataPath: '/mapping-collection/mappings',
    schemaPath: '#/properties/mappings/anyOf/1/type',
    params: { type: 'array' },
    message: 'should be array'
  },
  {
    keyword: 'anyOf',
    dataPath: '/mapping-collection/mappings',
    schemaPath: '#/properties/mappings/anyOf',
    params: {},
    message: 'should match some schema in anyOf'
  }
]
/home/runner/work/CISControls_OSCAL/CISControls_OSCAL/git-content/mappings/json/CCM_CISControlsv8_Mapping-min.json is invalid.

Expected behavior (i.e. solution)

JSON Schema validates properly.

Other Comments

@aj-stein-nist
Copy link
Collaborator Author

@wendellpiez asked that @aj-stein-nist and others help with testing a potential fix after .5 (approximated 4 hours) of developing a bug to mitigate the fix.

@aj-stein-nist
Copy link
Collaborator Author

Similar behavior in this GHA job run when validating and converting content and this Gitter chat indicate there is a high likelihood this reflects the same bug. I am looking into this to potentially unblock myself of usnistgov/oscal-content#139.

@aj-stein-nist aj-stein-nist self-assigned this Oct 6, 2022
@aj-stein-nist
Copy link
Collaborator Author

I likely will have to sync up with Wendell as it turns out some of the <json-value-key> stuff is more difficult than I thought and this isn't going to be a small cosmetic change.

@aj-stein-nist
Copy link
Collaborator Author

@wendellpiez, the serialized XML form of the example as it should look based on transforming with the basic JSON->XML serializer in XPath 3.0 (as we discussed on the call):

                            <map key="type">
                                <array key="anyOf">
                                    <map>
                                        <string key="type">string</string>
                                    </map>
                                    <map>
                                        <array key="enum">
                                            <string>equivalent-to</string>
                                            <string>equal-to</string>
                                            <string>subset-of</string>
                                            <string>superset-of</string>
                                            <string>intersects-with</string>
                                        </array>
                                    </map>
                                </array>
                            </map>

@wendellpiez
Copy link
Collaborator

Noting that even a value key is not required for data types allowing an empty string, the XML representation is underspecified: <relationship type="x"/> is schema valid since the token` data type supports an empty string. (In 2.0 we can make token require a character but that's not backward compatible for 1.x.)

At a minimum we need to relax the requirement for the nominal field value property in the JSON to avoid properties with no values (or empty string values).

@aj-stein-nist
Copy link
Collaborator Author

Wendell and I paired for this afternoon. We agreed that I would follow up with a minimal Metaschema definition and help stub out tests outside of the more complex definition examples that make it hard to exercise the possible bug, different solutions, and how certain JSON Schema fields should and should not be handled (additionalProperties and required, specifically).

@aj-stein-nist
Copy link
Collaborator Author

aj-stein-nist commented Oct 13, 2022

Wendell and I had met a few times and paired and he agreed it is best he work on moving forward the Metaschema "mini" (minimally viable model) using the current version to exposed different allowed values sets and write relevant XSpec tests. I got a little ahead of myself and attempted last night to figure out how I can revive the Metaschema test-suite but it is pretty badly broken for now, as I learned in aj-stein-nist@2e67d27 last night. I will unassign myself for now, but sync up and assist accordingly and support Wendell as-is.

@aj-stein-nist aj-stein-nist removed their assignment Oct 13, 2022
wendellpiez added a commit to wendellpiez/metaschema that referenced this issue Oct 13, 2022
aj-stein-nist added a commit to usnistgov/OSCAL that referenced this issue Oct 14, 2022
Per request from @wendellpiez for debugging, I am going to set this up
to add to a test submodule and see emmitted JSON Schema result and
validation against catalogs in usnistgov/oscal-content examples.
aj-stein-nist added a commit to aj-stein-nist/oscal-content-forked that referenced this issue Oct 14, 2022
Per request from @wendellpiez for debugging, I am going to set this up to add to a test submodule and see emmitted JSON Schema result and validation against catalogs in usnistgov/oscal-content examples.
@wendellpiez
Copy link
Collaborator

Now in a working branch we have the beginnings of XSpec, with analysis:

https://github.com/wendellpiez/metaschema/tree/issue240-jsonschemaBugA/test-suite/metaschema-xspec/json-schema-gen

The schema marked CORRECTED3 is both correct and complete, testing both the token datatype and applicable enumerations (i.e., any enumerations not to be suppressed by virtue of a narrowing @target or by allow-other='yes'). Instance documents marked 'valid' or 'invalid' demonstrate validations over data for testing. The draft XSpec casts two of the four examples into their intended form. Running the XSpec shows the bugs in the current process.

Next up:

  • discuss: do we have the correct target model for JSON Schema (does the corrected 3 version look okay);
  • complete XSpec; repair bugs;
  • test out
  • return to functional validations for generated schemas (do they validate the correct cross sections of instances)

@aj-stein-nist
Copy link
Collaborator Author

Looking good so far, just some quick feedback because it is a branch and not a PR yet (note to self: this was reviewed in the above branch at commit wendellpiez@d045cc8):

  • Most look good. I call out the ones that could use improvements below.
  • Sub-schema with "$id" of "#field_oscal-value-testing-mini_constrained-closed" is close but likely needs a slight modification: allOf should be switched to oneOf or you would need to a token-value to match all four enum values of "one", "two", "three", "four" at the same time (because of presumed allow-other being set to no in Metaschema).
  • Sub-schema for "#field_oscal-value-testing-mini_constrained-open" can work with anyOf so long as the source Metaschema implicitly or explicitly resolves to true() for has-oscal-namespace('http://csrc.nist.gov/ns/oscal') does seem to work. I took some time to re-read the JSON Schema spec this evening and the MVP does appear to work correctly.

image

https://gist.github.com/aj-stein-nist/cff37e5580da4cce0d8b689ddd28f486
https://gist.github.com/aj-stein-nist/1b9a14c074980128b9bcb26dc8e070df

I will review again in the morning to make sure I didn't miss anything.

@aj-stein-nist
Copy link
Collaborator Author

aj-stein-nist commented Nov 1, 2022

Looking good so far, just some quick feedback because it is a branch and not a PR yet (note to self: this was reviewed in the above branch at commit wendellpiez@d045cc8):

* Most look good. I call out the ones that could use improvements below.

* Sub-schema with `"$id"` of `"#field_oscal-value-testing-mini_constrained-closed"` is close but likely needs a slight modification: `allOf` should be switched to `oneOf` or you would need to a `token-value` to match all four `enum` values of `"one"`, `"two"`, `"three"`, `"four"` at the same time (because of presumed `allow-other` being set to `no` in Metaschema).

* Sub-schema for `"#field_oscal-value-testing-mini_constrained-open"` can work with `anyOf` so long as the source Metaschema implicitly or explicitly resolves to `true()` for `has-oscal-namespace('http://csrc.nist.gov/ns/oscal')` does seem to work. I took some time to re-read the JSON Schema spec this evening and the MVP does appear to work correctly.

I will review again in the morning to make sure I didn't miss anything.

@wendellpiez, ping ping. I know we discussed via Gitter some limited conversation with Dave, but had you looked at my feedback? It appears the branch (at least remote in github.com/wendellpiez) is still as-is, and I want to make sure we get to this eventually. Lemme know!

@aj-stein-nist
Copy link
Collaborator Author

Talked with Dave and Wendell. Will move forward with stabilizing the unit tests and then push the fix soonish.

* Sub-schema with `"$id"` of `"#field_oscal-value-testing-mini_constrained-closed"` is close but likely needs a slight modification: `allOf` should be switched to `oneOf` or you would need to a `token-value` to match all four `enum` values of `"one"`, `"two"`, `"three"`, `"four"` at the same time (because of presumed `allow-other` being set to `no` in Metaschema).

This should be fixed in Wendell's branch code as discussed.

* Sub-schema for `"#field_oscal-value-testing-mini_constrained-open"` can work with `anyOf` so long as the source Metaschema implicitly or explicitly resolves to `true()` for `has-oscal-namespace('http://csrc.nist.gov/ns/oscal')` does seem to work. I took some time to re-read the JSON Schema spec this evening and the MVP does appear to work correctly.

We will drop constraints in this case, as agreed. (To A.J.: the thing you were proposing you will not do, just drop them, the GitHub Gists are illustrative but it ends here.)

aj-stein-nist added a commit to aj-stein-nist/oscal-content-forked that referenced this issue Nov 1, 2022
Per request from @wendellpiez for debugging, I am going to set this up to add to a test submodule and see emmitted JSON Schema result and validation against catalogs in usnistgov/oscal-content examples.
@aj-stein-nist
Copy link
Collaborator Author

Talked with Dave and Wendell. Will move forward with stabilizing the unit tests and then push the fix soonish.

* Sub-schema with `"$id"` of `"#field_oscal-value-testing-mini_constrained-closed"` is close but likely needs a slight modification: `allOf` should be switched to `oneOf` or you would need to a `token-value` to match all four `enum` values of `"one"`, `"two"`, `"three"`, `"four"` at the same time (because of presumed `allow-other` being set to `no` in Metaschema).

This should be fixed in Wendell's branch code as discussed.

So this turns out not be the case and a minor detail but caused a good deal of miscommunication on my part. I think I more fully understand the schemas in question now. I will finish double-checking the schemas and my take, but @wendellpiez, the example model, the valid JSON examples conforming to those schemas, and the intentionally invalid examples that will not conform do look correct.

I will continue working on aligning the transform and the tests to send to you for review.

@aj-stein-nist
Copy link
Collaborator Author

I talked with @wendellpiez yesterday and we agreed he can focus on the actual fixes once he works down his work queue and will sync up with me, if not today, when we are back to work (so Monday). I will continue on Friday to attempt a fix and structure the tests, but I will keep him the primary assignee as he will lead the fixes on the transform, unless I can present a solution that works before he comes back to me on this bit.

@aj-stein-nist
Copy link
Collaborator Author

Moving this to Sprint 61, still there is some more work to do. :-(

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Dec 27, 2022
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Dec 28, 2022
@aj-stein-nist aj-stein-nist linked a pull request Dec 28, 2022 that will close this issue
7 tasks
@david-waltermire
Copy link
Collaborator

david-waltermire commented Jan 5, 2023

Based on our conversation today, the following should guide the use of anyOf or allOf.

  • Use anyOf if the allowed values are open. i.e. allow-other=yes
  • Use allOf if the allowed values are closed. i.e. allow-other=no

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 5, 2023
Reflect Dave's feedback and adjust code per the comment in:
usnistgov#240 (comment)
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 6, 2023
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 17, 2023
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 19, 2023
@aj-stein-nist
Copy link
Collaborator Author

Hopefully wrapping this up before the end of the week.

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 24, 2023
aj-stein-nist added a commit that referenced this issue Jan 25, 2023
* Fix JSON Schema generation for #240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
aj-stein-nist added a commit to aj-stein-nist/OSCAL-forked that referenced this issue Jan 25, 2023
aj-stein-nist added a commit to aj-stein-nist/oscal-content-forked that referenced this issue Jan 30, 2023
aj-stein-nist added a commit to aj-stein-nist/OSCAL-forked that referenced this issue Jan 31, 2023
 issue.

Determine if this submodule update fixes CI/CD breakage with JSON schema
or if other bugs and development issues leave it broken.
david-waltermire pushed a commit that referenced this issue Mar 9, 2023
* Fix JSON Schema generation for #240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
nikitawootten-nist pushed a commit to nikitawootten-nist/metaschema-xslt that referenced this issue Jul 21, 2023
* Fix JSON Schema generation for usnistgov/metaschema#240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working XSLT Implementation Issue relates to the XSLT implementation of Metaschema.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants