-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core(jsonld): Structured data validation updates #8137
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hurray for having some real usage-driven changes here now! :D 🎉
@@ -41,7 +41,7 @@ module.exports = function validateJsonLD(json) { | |||
if (name.startsWith('@') && !VALID_KEYWORDS.has(name)) { | |||
errors.push({ | |||
path: path.join('/'), | |||
message: 'Unknown keyword', | |||
message: 'Unknown keyword ' + name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we put this in quotes or after a colon or something? this also makes we wonder what our i18n story is here 😬
@@ -17,7 +17,7 @@ describe('JSON validation', () => { | |||
`); | |||
|
|||
assert.equal(errors.length, 1); | |||
assert.equal(errors[0].path, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably use strict equal here then, these would have been failing before I guess
* @returns null | number - line number of the path value in the prettified JSON | ||
*/ | ||
function getLineNumberFromJsonLDPath(obj, path) { | ||
// To avoid having an extra dependency on a JSON parser we set a unique key in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has that shipped sailed with jsonlint-mod
is there any way to leverage that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also deja vu, I feel like I've reviewed this before so sorry if you've answered this in the previous PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has that shipped sailed with jsonlint-mod is there any way to leverage that?
I don't see a direct way to leverage jsonlint-mod here. It just uses Douglas Crockford's JSON parser.
I guess we could parse it ourselves or find a small parser that we can include?
also deja vu, I feel like I've reviewed this before so sorry if you've answered this in the previous PR
Yup, I sent you a pre-PR to get some initial feedback.
// but key provided by validator is "author" | ||
const keyParts = key.split('/'); | ||
const relativeKey = keyParts[keyParts.length - 1]; | ||
if (relativeKey === pathPart && currentObj[key] !== undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like if we have currentObj[key]
as undefined
we could just return before this whole loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I don't see why we'd need it at all actually...
} else { | ||
currentObj = currentObj[key]; | ||
} | ||
keyFound = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like keyFound
is unnecessary, we could just return here and throw if we ever make it through the loop without returning
@@ -169,7 +171,9 @@ describe('schema.org validation', () => { | |||
}`); | |||
|
|||
assert.equal(errors.length, 1); | |||
assert.equal(errors[0].invalidTypes[0], 'http://schema.org/Article'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this name is a bit confusing to me for how it's used. the type isn't invalid, right? it's just the type of the entity that has some other invalid stuff going on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entity is not a valid type
instance because it has a property that doesn't exist on type
.
I originally just called that array types
, but then it wasn't clear that having that array means invalid. (And we use that array to generate the Invalid Event: unexpected property asdf
message.)
Maybe we can call the array invalidForTypes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh gotcha, I actually like the direction you were headed with .types
then. I think it's clear something is invalid by virtue of the fact that we are giving an error. Maybe typeOfInvalidEntity
or something similar if we're going the more explicit route?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm at validTypes
right now, since we only include the types in the validation error if they are valid. (And we don't want to show invalid types in the UI, because in that case they'll be part of the error message.)
errors.push({ | ||
invalidTypes: error.invalidTypes, | ||
message: error.message, | ||
// get rid of the first chunk (/@type) as it's the same for all errors | ||
path: | ||
'/' + | ||
path | ||
.slice(0, -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait a second, the path is backwards? I think we're missing some tests that have paths of length >2 then 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look into this more tomorrow, but what makes you think the path is backwards? Here's an example of the path:
[ 'http://schema.org/author', '0', 'http://schema.org/colleague', '0', '@type' ]
For this:
{
"@context": "https://schema.org",
"@type": "Article",
"author": {
"@type": "Person",
"displayNaame": "Sally",
"colleague": {
"@type": "Person",
"displayNaame": "Sally"
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh the comment confused me. It says it's getting rid of the "first chunk" but it's removing the last chunk. That made me think it is backwards and all paths started with /@type
, but I see now :)
maybe update the comment? a test for longer paths still couldn't hurt though sometime :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh, yeah good catch, that's a very confusing comment! Added a test case for deeper nesting too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is looking good so far to me, just some surface level comments
types/structured-data.d.ts
Outdated
| "json" | ||
| "json-ld" | ||
| "json-ld-expand" | ||
| "schema-org"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, where is this leading |
coming from :P Not even prettier does this, I don't think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my prettier definitely does this in other projects, not sure about here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my prettier definitely does this in other projects, not sure about here :)
maybe it's a typescript prettier thing? I couldn't get it to repro in their playground
@@ -49,7 +49,7 @@ function findType(type) { | |||
* | |||
* @param {string|Array<string>} typeOrTypes | |||
* @param {Array<string>} keys | |||
* @returns {Array<string>} | |||
* @returns {Array<{message: string, key?: string, typesOfInvalidEntity?: Array<string>, path?: string}>} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe pull into a typedef so these properties can have docs?
* @param {string} path | ||
* @returns {null | number} - line number of the path value in the prettified JSON | ||
*/ | ||
function getLineNumberFromJsonLDPath(obj, path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these two are utility-y enough, maybe they're worth exposing and having explicit tests for them?
They're getting test coverage right now but I doubt it's much, and it would have the benefit of documenting them a bit more. e.g. these aren't called with literals anywhere, so it's not possible to know what the path
string generally looks like without also looking at the code that generates paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, adding some tests is a good idea 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@patrickhulce LGTY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, yes! LGTM :)
Summary
Updates to the SD validation logic:
We use the entity type to include them in the error messages and link to their schema.org URL so the user can see what properties are valid:
From the audit's perspective the validator would ideally generate the full error message including the markdown link, but that logic is in the audit to make the sd validator more re-usable.
Related Issues/PRs
Part of #4359