Output structured data in {{ghost_head}} #3900

Closed
6 tasks done
JohnONolan opened this issue Aug 30, 2014 · 19 comments
Labels
feature [triage] New features we're planning or working on P2 - High [triage] High priority for immediate patch release

@JohnONolan
Member

NB: This issue is currently a WIP specification document.

In order to make Ghost more useful as a blogging platform, it should be outputting structured data which allows published content to be more easily machine-readable. This allows content to be easily discoverable in search engines as well as popular social networks where blog posts are typically shared.

There are a couple of contexts for this, the most important of which are listed below - along with a proposed implementation structure.

One additional consideration is that it should be possible for apps to modify and extend this output. For example, a Disqus app might want to add a commentCount property, and social apps might want to add interactionCount values.

As a part of this issue, Privacy.md should be updated with details of what data is being output and why. There should also be a flag in config.js which disables the automatic structured data output entirely.

Comments, additions or suggestions on the proposed structure below are welcome.

Tasks

  • Update Privacy.md (@JohnONolan)
  • Add flag to disable output in config.js
  • Implement Schema.org output
  • Implement Open Graph output
  • Implement Twitter Cards output
  • Implement hooks to allow apps to modify output

Schema.org

With the news that Google is ending support for Authorship came the follow-up of a renewed focus on Schema.org microdata. Until now we've been shy to adopt this despite multiple PRs to add it to Casper, mainly because it makes a horrendous mess of the markup and there was no clear indication that it was a focus of major search engines.

Now there's a clear focus from a major search engine, and I've also recently discovered the JSON Linked Data (JSON-LD) format which means that not only can we have uncluttered markup - but we can also keep the structured data abstracted from the theme. Meaning Ghost (not the theme) does the implementation. Good news all round.

The following is a proposed sample output in {{ghost_head}} on post.hbs following the schema.org Article and BlogPosting specifications.

Should be tested against http://www.google.com/webmasters/tools/richsnippets

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Article",
  "publisher": "{{@blog.title}}",
  "author": {
    "@type": "Person",
    "name": "{{author.name}}",
    "image": "{{author.image}}",
    "url": "{{author.url}}",
    "sameAs": "{{author.website}}"
  },
  "headline": "{{title}}",
  "url": "{{url absolute='true'}}",
  "datePublished": "{{date format='YYYY-MM-DDTHH:mm:ssZ'}}",
  "dateModified": "{{date updated_at format='YYYY-MM-DDTHH:mm:ssZ'}}",
  "image": "{{image}}",
  "keywords": "{{tags}}",
  "description": "{{meta_description}}"
}
</script>
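
A minimal sketch of how the block above could be generated from data rather than by string templating, so that quotes in a title or description cannot break the JSON. The function name buildJsonLd and the shape of the blog and post objects are assumptions made for illustration; they are not Ghost's actual API.

```javascript
// Hypothetical helper: builds the JSON-LD script block from plain objects.
// Field names mirror the proposed template; the object shapes are assumed.
function buildJsonLd(blog, post) {
    const author = post.author || {};
    const data = {
        '@context': 'http://schema.org',
        '@type': 'Article',
        publisher: blog.title,
        author: {
            '@type': 'Person',
            name: author.name,
            image: author.image,
            url: author.url,
            sameAs: author.website
        },
        headline: post.title,
        url: post.url,
        datePublished: post.published_at,
        dateModified: post.updated_at,
        image: post.image,
        keywords: (post.tags || []).map((tag) => tag.name).join(', '),
        description: post.meta_description
    };

    // JSON.stringify omits keys whose value is undefined, so missing
    // optional fields (such as image) simply do not appear in the output.
    // "<" is escaped so a value containing "</script>" cannot end the block.
    const json = JSON.stringify(data, null, 2).replace(/</g, '\\u003c');

    return '<script type="application/ld+json">\n' + json + '\n</script>';
}
```

Building the object first also gives apps a natural place to add properties such as commentCount before serialisation.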

Open Graph

The below represents proposed output for post.hbs in {{ghost_head}} based on the Open Graph protocol specification.

NB: The og:author value has been deliberately omitted due to conflicting information on how it should be correctly used. Based on this it is probably more useful not to specify it explicitly, rather than provide invalid data to one or more services.

Should be tested against https://developers.facebook.com/tools/debug and https://developers.pinterest.com/rich_pins/validator/

<meta property="og:site_name" content="{{@blog.title}}" />
<meta property="og:type" content="article" />
<meta property="og:title" content="{{title}}" />
<meta property="og:description" content="{{meta_description}}..." />
<meta property="og:url" content="{{url absolute='true'}}" />
<meta property="og:image" content="{{image}}" />
<meta property="article:published_time" content="{{date format='YYYY-MM-DDTHH:mm:ssZ'}}" />
<meta property="article:modified_time" content="{{date updated_at format='YYYY-MM-DDTHH:mm:ssZ'}}" />
{{#foreach tags}}
  <meta property="article:tag" content="{{name}}" />
{{/foreach}}

Twitter Cards

The below represents proposed output for post.hbs in {{ghost_head}} based on the Twitter Cards specification.

Should be tested against https://dev.twitter.com/docs/cards/validation/validator

<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="{{title}}" />
<meta name="twitter:description" content="{{meta_description}}" />
<meta name="twitter:url" content="{{url absolute='true'}}" />
<meta name="twitter:image:src" content="{{image}}" />

NB: all image meta tags should be conditional on whether the post has an image
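
The conditional image rule can be sketched as follows for a subset of the Open Graph tags. The helper names openGraphTags and escapeAttr, and the shape of the post object, are hypothetical and only serve to illustrate the idea.

```javascript
// Minimal attribute escaping so values with quotes or ampersands stay valid HTML.
function escapeAttr(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: returns Open Graph tags for a post as an array of strings.
function openGraphTags(blog, post) {
    const tags = [
        ['og:site_name', blog.title],
        ['og:type', 'article'],
        ['og:title', post.title],
        ['og:description', post.meta_description],
        ['og:url', post.url]
    ];

    // Only emit og:image when the post actually has an image.
    if (post.image) {
        tags.push(['og:image', post.image]);
    }

    // One article:tag entry per tag; none if the post is untagged.
    (post.tags || []).forEach((tag) => tags.push(['article:tag', tag.name]));

    return tags
        // Drop any tag whose content is missing rather than printing an empty value.
        .filter(([, content]) => content !== undefined && content !== null && content !== '')
        .map(([property, content]) =>
            '<meta property="' + property + '" content="' + escapeAttr(content) + '" />');
}
```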

@JohnONolan JohnONolan self-assigned this Aug 30, 2014
@JohnONolan JohnONolan added this to the 0.5.x Feature Release Backlog milestone Aug 30, 2014
@ErisDS
Member

ErisDS commented Aug 30, 2014

I think we can mark the hook task as done, as ghost_head is already extensible by apps.

An app can grab the ghost_head content as an array of strings containing each piece of markup with all the data in place and process it.

We may however want to change this from an array of strings to some other structure: perhaps an object with sub arrays for each section, or even a template and a JSON object containing the data to be merged in. I'm kinda thinking out loud, and wondering if an array of templates and a JSON object might be awesome. Templates, as opposed to strings, meaning the data like URL, title etc is still separate.

What you would get in the current ghost_head filter implementation:

[
... loads more stuff...
'<meta property="twitter:card" content="summary" />',
'<meta property="twitter:title" content="My Blog Post" />',
'<meta property="twitter:description" content="A description" /> ',
'<meta property="twitter:url" content="http://something.com" />',
]

What might be better:

[
... loads more stuff...
'<meta property="twitter:card" content="{{twitter_card}}" />',
'<meta property="twitter:title" content="{{twitter_title}}" />',
'<meta property="twitter:description" content="{{twitter_description}}" /> ',
'<meta property="twitter:url" content="{{twitter_url}}" />',
],
{
   ... loads more stuff...
    twitter_card: "summary",
    twitter_title: "My Blog Post",
    twitter_description: "A description",
    twitter_url: "http://something.com"
}
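
A rough sketch of the "templates plus data" idea, assuming a simple {{key}} placeholder syntax. The names renderHead and useLargeImageCard are made up for illustration and do not reflect how Ghost's ghost_head filter is implemented.

```javascript
// Minimal attribute escaping for values merged into the templates.
function escapeAttr(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical renderer: replaces {{key}} placeholders with escaped values.
// Unknown keys render as an empty string.
function renderHead(templates, data) {
    return templates.map((template) =>
        template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
            Object.prototype.hasOwnProperty.call(data, key) ? escapeAttr(data[key]) : ''));
}

// Hypothetical app filter: changes the data only and never touches markup.
function useLargeImageCard(data) {
    return Object.assign({}, data, { twitter_card: 'summary_large_image' });
}

const templates = [
    '<meta property="twitter:card" content="{{twitter_card}}" />',
    '<meta property="twitter:title" content="{{twitter_title}}" />'
];
const data = {
    twitter_card: 'summary',
    twitter_title: 'My Blog Post'
};

const head = renderHead(templates, useLargeImageCard(data));
```

Because the app only edits the data object, it does not need to parse or rewrite HTML strings to change a value.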

@jguerin

jguerin commented Sep 5, 2014

This is a MUCH better way to do metadata. I've been modifying Casper (very poorly). Looking forward to reverting those changes :)

@novaugust
Contributor

I thought the ld-json was so cool that I tested it out on novaugust.net. Sadly, looks like google isn't interested yet: http://www.google.com/webmasters/tools/richsnippets?q=novaugust.net

@JohnONolan
Member Author

@novaugust Very lame. But I'm tempted to suggest that it's still the right approach, based on the fact that the testing tool may in fact be more behind than the algorithm - and either way it should be future-proof...

@Tutsumi

Tutsumi commented Sep 11, 2014

@novaugust it worked, you just need to point it at an article instead of a page with little to no info. http://www.google.com/webmasters/tools/richsnippets?q=http%3A%2F%2Fwww.novaugust.net%2Fblog%2F2014%2F09%2Fi-went-into-the-cirque%2F

@novaugust
Contributor

Thanks @Tutsumi, but my inner pages (/blog) are running a different application that uses standard <meta> tags to put that information out there. In short, they're irrelevant to the ld+json discussion.
The homepage, however, is using ld+json as a test for this issue.

(This is about meta info, which isn't the content you actually see in a page; it's all hidden away in the head. Even if my homepage is just five lines and three words, it still has plenty going on in the <head> tag.)

@JohnONolan
Member Author

@novaugust After a bit of further research, I believe quite strongly that this is "the future" - http://blog.heppresearch.com/2014/03/24/json-ld-finally-google-honors-invisible-data-for-seo/ - and the correct way forward

@novaugust
Contributor

I'm 100% with you there senior. I was just saying, "looks like it's not there yet". But, it looks like this webmaster tool is ready to go with json-ld: https://www.google.com/webmasters/markup-tester/u/0/events I put my markup in it and it correctly pulled things out. I'm betting (well, hoping) they'll have everything hooked up in no time :D

@jguerin

jguerin commented Sep 13, 2014

It looks like ld-json's Person object doesn't have a 'location' property:

    The property http://schema.org/location is invalid for this type of object.

@JohnONolan
Member Author

Location is clearly used in the 2nd example here: http://schema.org/Person

@jguerin

jguerin commented Sep 14, 2014

Location isn't a valid property of Person. That second example's Location property is for the MusicEvent object.

chorrell added a commit to chorrell/ghost-old.horrell.ca that referenced this issue Sep 24, 2014
This uses JSON-LD. See TryGhost/Ghost#3900

This may become a built in feature of Ghost.
@JohnONolan
Member Author

Updated to correct some meta tags and remove Location from Person in og tags

@cobbspur
Member

cobbspur commented Oct 2, 2014

I was wondering how configurable we want this to be. Do we want it to be an all-or-nothing thing, or a bit more configurable? We could have a single flag in the config, just structuredData: true, or flags for individual components like openGraph, or possibly structuredData { openGraph: true, twitter: false, ... } and so on.
Thoughts?

@JohnONolan
Member Author

Simple to start: Initially all or nothing - fine grained config if there's demand for it.
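
As a sketch only, an all-or-nothing switch could look like the following. The structuredData key is taken from the suggestion above, while the function names and the "enabled unless explicitly false" default are assumptions for illustration.

```javascript
// Hypothetical config shape: one flag that turns all structured data off.
const config = {
    structuredData: false
};

// Assumed default: output is enabled unless the flag is explicitly false.
function structuredDataEnabled(config) {
    return !(config && config.structuredData === false);
}

// Hypothetical head builder: appends structured data tags only when enabled.
// Always returns a new array so callers cannot mutate baseTags by accident.
function buildHead(config, baseTags, structuredTags) {
    return structuredDataEnabled(config) ? baseTags.concat(structuredTags) : baseTags.slice();
}
```

Treating a missing flag as "enabled" keeps existing config files working without changes.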

@cobbspur
Member

cobbspur commented Oct 2, 2014

k

cobbspur added a commit to cobbspur/Ghost that referenced this issue Oct 8, 2014
issue TryGhost#3900
- uses isPrivacyDisabled helper to see if useStructuredData has been disabled in config.js
- adds an array of promises to deal with asynchronous data
- resolves asynchronous data then adds open graph tags after canonical link
- featured image and tags are only added if present
- open graph tags only added on post and page
- adds unit test to check correct data is returned
- updates other unit tests to reflect changes
@cobbspur
Member

Note: twitter:card should be content="summary" by default, or content="summary_large_image" if there is a post cover image.
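
One way to read this rule, sketched with hypothetical helper names and an assumed post object shape: pick the card type from whether the post has an image, and skip the image tag when it does not.

```javascript
// Picks the card type: large image card only when the post has a cover image.
function twitterCardType(post) {
    return post.image ? 'summary_large_image' : 'summary';
}

// Hypothetical helper: returns Twitter Card entries as { name, content } pairs.
function twitterTags(post) {
    const tags = [
        { name: 'twitter:card', content: twitterCardType(post) },
        { name: 'twitter:title', content: post.title },
        { name: 'twitter:url', content: post.url }
    ];

    // The image tag is conditional on the post having an image.
    if (post.image) {
        tags.push({ name: 'twitter:image:src', content: post.image });
    }

    return tags;
}
```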

@ErisDS ErisDS added the P2 - High [triage] High priority for immediate patch release label Oct 14, 2014
@ErisDS ErisDS closed this as completed in 23e98aa Oct 17, 2014
@JohnONolan
Member Author

Just wanted to follow up and confirm that our implementation of JSON-LD is indeed being recognised and utilised by Google.

image


image

@JohnONolan
Member Author

Google Structured Data testing tool has now also been updated and correctly detects everything: https://www.google.com/webmasters/tools/richsnippets?url=http://john.onolan.org/open-source-culture/

@ErisDS
Member

ErisDS commented Dec 16, 2014

This is awesome 👍
