Work in progress - alpha release!
I needed an HTML test suite and wasn't able to find a good one that doesn't rely on third-party services, so I rewrote npm/structured-data-testing-tool and npm/web-auto-extractor into a less resource-hungry package that also works in a web browser.
It is essentially a rewrite of the structured-data-testing-tool by Iain Collins and is not fully implemented yet.
- extract meta data from a website
  - metatags
    - grouped by prefix (twitter, og, og:image, ...)
  - jsonld
  - microdata
  - rdfa
  - possibly more
- validate data
  - for SEO best practices
  - valid properties
  - match schema.org
  - valid data types
- multiple usage options
  - browser-based UI
  - limited UI via bookmarklet
  - CLI
  - JS/Node API
- connect to a (self-hosted) Nu Validator instance
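As a rough illustration of the "grouped by prefix" idea (this is not the package's actual API — the function below is made up for the example), meta tags can be bucketed by the part of their name before the colon:

```javascript
// Hypothetical sketch: group <meta> tags by their prefix (og, twitter, ...).
// The real parsers work on a DOM; the regex here is only for this demo.
function groupMetaTags(html) {
  const groups = {};
  const re = /<meta\s+(?:property|name)=["']([^"']+)["']\s+content=["']([^"']*)["']/g;
  let m;
  while ((m = re.exec(html)) !== null) {
    const [, key, content] = m;
    // "og:title" -> group "og"; plain names like "description" -> group "meta"
    const prefix = key.includes(':') ? key.split(':')[0] : 'meta';
    (groups[prefix] = groups[prefix] || {})[key] = content;
  }
  return groups;
}

const html = `
  <meta property="og:title" content="Hello">
  <meta property="og:image" content="/a.png">
  <meta name="twitter:card" content="summary">
  <meta name="description" content="Demo page">
`;
console.log(groupMetaTags(html));
// e.g. { og: { 'og:title': 'Hello', 'og:image': '/a.png' },
//        twitter: { 'twitter:card': 'summary' },
//        meta: { description: 'Demo page' } }
```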
Save this repo in `/path/to/xampp/htdocs/seo-meta-validator`, run Apache and open `http://localhost/seo-meta-validator` in your browser.

It works for testing websites on localhost, but not with external URLs, due to CORS.
You have to install Cockpit CMS manually; I don't want to redistribute the whole thing (>10 MB) with this repo. Navigate in your shell to `/path/to/seo-meta-validator` and clone Cockpit with

```shell
git clone https://github.com/agentejo/cockpit.git ui/lib/cockpit
```

or download Cockpit CMS and unzip it into `/path/to/seo-meta-validator/ui/lib/cockpit`.
If you downloaded Cockpit and want to log in, open `http://localhost/seo-meta-validator/ui` in your browser (user: admin, password: admin).
Now you have to set up the schema API. Navigate to `/path/to/seo-meta-validator/ui` and run:

```shell
# download the latest all-layers.jsonld file from the schema.org GitHub repo
# it will be stored in /ui/cpdata/storage/uploads/schemas
./cp schemaorgapi/download

# import the jsonld file
./cp schemaorgapi/import
```
Create a browser bookmark and copy the content of `dist/bookmarklet.js` into the location field.

Open an external website and click the bookmark; it loads the script `metavalidator.min.js` with a simple overlay to run tests. It may fail due to CORS.

The built bookmarklet has a hard-coded URL, `http://localhost/seo-meta-validator/dist/metavalidator.min.js`. Change the URL if your host URL differs.
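A bookmarklet of this kind is typically just a `javascript:` URL that injects a script tag into the current page. A minimal sketch of how such a string could be built (the real `dist/bookmarklet.js` may differ; the URL is the default host mentioned above):

```javascript
// Sketch only: construct a bookmarklet string that injects the validator
// script into the current page. Not the actual build output of this repo.
const scriptUrl = 'http://localhost/seo-meta-validator/dist/metavalidator.min.js';

const bookmarklet =
  'javascript:(function(){' +
  'var s=document.createElement("script");' +
  `s.src=${JSON.stringify(scriptUrl)};` +
  'document.body.appendChild(s);' +
  '})();';

console.log(bookmarklet);
```

Pasting the resulting string into a bookmark's location field and clicking it runs the self-invoking function on the open page, which is also why a strict Content Security Policy on the target site can block it.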
Not fully implemented yet. Run `node bin/cli.js --url https://example.com` to see a very limited CLI output.
- `npm run build` and `npm run watch` to rebuild the JS and CSS files
- `npm run build:js:dev` and `npm run watch:js:dev` to rebuild the JS files without minifying (for debugging)
- `npm run build:bookmarklet` ...
- see `package.json` for more build scripts
- How to deal with mixed-license content? I publish everything under MIT, but sdtt is ISC-licensed and the browserified jmespath is Apache-2.0-licensed.
- collaboration or a new project?
  - collaborate with Iain Collins and merge my rewrite into his tool, or
  - create my own project
- split the project into multiple packages/repos, e.g. "test suite" and "meta-extractor"?
- SEO preset
  - length checks for title, description
  - length checks for og:title, og:description
  - length checks for twitter:title, twitter:description
  - ...
- validate schemas
- allowed properties (fails for strings that should be parsed as object, e. g.
"author": "A. Smith"
should be parsed as"author": {"@type":"Thing","name":"A. Smith"}
) - invalid properties (data type validation) - partially implemented
- ...
- allowed properties (fails for strings that should be parsed as object, e. g.
- microdata/rdfa parsers should match the spec
- check for valid URLs
- avoid running the same preset tests multiple times (e.g. presets "Default" & "Google")
- entry point with different DOMParser for node/cli usage
- babel, polyfills...
- metatags should be grouped (e. g. og with multiple og images)
- performance tests/optimization
- catch error if url is not parsable
- self tests
- single tests should have conditionals (currently only presets have conditionals)
- presets should have multiple conditionals (currently only one conditional test is possible)
- some data has multiple fallbacks, e.g. if no `twitter:title` is present, Twitter will fall back to `og:title`
- Cross-Origin Request Blocked: the Same Origin Policy disallows reading the remote resource --> use an optional API proxy with server-side URL fetching
- the bookmarklet sometimes fails due to Content Security Policy --> use a full bundle that also includes the schemas; this should work because then only one script is added via the browser
- more explicit styles, because the overlay shares its styles with the tested website
- cli
  - groups
  - multiple schema instances
  - number of tests doesn't match
  - load schemas
  - auto-detected tests are disabled
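The `author` normalization mentioned in the schema-validation todo above can be sketched like this (a hypothetical helper, not part of the current code base):

```javascript
// Hypothetical sketch: normalize a JSON-LD string value into the object
// form that schema.org validation expects, e.g.
//   "author": "A. Smith"  ->  "author": {"@type": "Thing", "name": "A. Smith"}
// Not part of the current code base.
function normalizeAuthor(value) {
  if (typeof value === 'string') {
    return { '@type': 'Thing', name: value };
  }
  // already an object (or array) - leave it untouched
  return value;
}

console.log(normalizeAuthor('A. Smith'));
// { '@type': 'Thing', name: 'A. Smith' }
```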
Copyright 2020 Raffael Jesche under the MIT license.
See LICENSE for more information.
- A lot of the code is copied and modified from structured-data-testing-tool, (c) Iain Collins, ISC License.
- The parsers are rewritten variants from web-auto-extractor, (c) Dayan Adeeb, MIT License.
- Some internal JS helpers are from Cockpit CMS (App.js), (c) Artur Heinze, MIT License.
- The built script for browser usage contains the browserified/minified
- jmespath, (c) James Saryerwinnie, Apache 2.0 License
- validator.js, (c) Chris O'Hara, MIT License