SEO 2022 #2888

Closed
6 tasks done
rviscomi opened this issue Apr 12, 2022 · 56 comments · Fixed by #3119
Labels: 2022 chapter, ASAP

@rviscomi
Member

rviscomi commented Apr 12, 2022

SEO 2022

SEO illustration

If you're interested in contributing to the SEO chapter of the 2022 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor.

Content team

Lead Authors Reviewers Analysts Editors Coordinator
@SophieBrannon @SophieBrannon @itamarblauer @mobeenali97 @dwsmart @patrickstox @johnmurch @TusharPol @derekperkins @csliva @jroakes @MichaelLewittes @foxdavidj
Expand for more information about each role 👀
  • The content team lead is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
  • Authors are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
  • Reviewers are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
  • Analysts are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
  • Editors are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
  • The section coordinator is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.

Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors.

For an overview of how the roles work together at each phase of the project, see the Chapter Lifecycle doc.

Milestone checklist

0. Form the content team

  • May 1: The content team has at least one author, reviewer, and analyst

1. Plan content

  • May 15: The content team has completed the chapter outline in the draft doc

2. Gather data

  • June 1: Analysts have added all necessary custom metrics and drafted a PR (example) to track query progress
  • June 1 - 15: HTTP Archive runs the June crawl

3. Validate results

  • August 1: Analysts have queried all metrics and saved the output to the results sheet

4. Draft content

  • September 1: The content team has written, reviewed, and edited the chapter in the doc

5. Publication

  • September 15: The completed chapter and all required metadata and figures are converted to markdown and submitted to GitHub
  • September 26: Target launch date 🚀

Chapter resources

Refer to these 2022 SEO resources throughout the content creation process:

📄 Google Docs for outlining and drafting content
🔍 SQL files for committing the queries used during analysis
📊 Google Sheets for saving the results of queries
📝 Markdown file for publishing content and managing public metadata
💬 #web-almanac-seo on Slack for team coordination

@rviscomi rviscomi added the "2022 chapter" and "help wanted" labels Apr 12, 2022
@mobeenali97

Interested in contributing to the SEO chapter.

@rviscomi rviscomi added this to the 2022 Content Planning milestone Apr 12, 2022
@MichaelLewittes

I would like to be part of "Publication" and edit the SEO chapter. I am a long-time editor and journalist, with 10 years of experience in SEO as well.

@daviddenn

Hi! I'd like to throw my hat in for section coordinator!

@daviddenn

On second thought, I think I'd be better as a content lead!

@dwsmart
Contributor

dwsmart commented Apr 13, 2022

Happy to be a reviewer if there's need!

@patrickstox
Contributor

I'm happy to come back as a reviewer this year and also provide any guidance I can based on my experience last year.

@SophieBrannon
Contributor

I'd love to get involved as an author for the SEO section!

@rviscomi
Member Author

Comment from @Tiggerito in #2911:

The SEO chapter could check the use of the new indexifembedded meta tag option if possible. The custom metrics code would have to look into iframe content.

The article also uses 'value' in place of 'content' for the meta tag. I wonder how many people make this mistake on their meta tags? Googlebot supports 'value', but the custom metrics code does not.
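To make the 'value' vs. 'content' question concrete, here is a minimal Python sketch of a check that reads robots directives from 'content' and falls back to 'value'. It is an illustration only, not the Almanac's custom metrics code: the names RobotsMetaParser and parse_robots_meta are hypothetical, the sample markup in the usage note is made up, and it does not look inside iframes.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []      # directives found in either attribute
        self.used_value_attr = 0  # tags that used `value` instead of `content`

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # html.parser lowercases attribute names; valueless attributes are None.
        attr_map = {name: (val or "") for name, val in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        raw = attr_map.get("content")
        if raw is None and "value" in attr_map:
            # Assumption taken from the comment above: treat `value` as a fallback.
            raw = attr_map["value"]
            self.used_value_attr += 1
        if raw:
            self.directives.extend(
                part.strip().lower() for part in raw.split(",") if part.strip()
            )


def parse_robots_meta(html):
    """Return (directives, number of robots meta tags that relied on `value`)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives, parser.used_value_attr
```

Calling parse_robots_meta on a page with one tag using value="noindex, indexifembedded" and another using content="nofollow" would return all three directives and a count of 1 for the misused attribute.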

@MichaelLewittes

Great topic, @Tiggerito. I read the documentation in January, along with all the subsequent takes on it from various industry trades (Search Engine Land, Search Engine Roundtable, et al.).

@jroakes
Contributor

jroakes commented Apr 18, 2022

I will throw my hat in for Analyst again this year. I have a lot of cleanup I want to do to make the queries more concise and readable.

@csliva
Contributor

csliva commented Apr 19, 2022

I’d be interested in helping as an analyst if extra support is needed.

@derekperkins

I'll also help as an analyst

@johnmurch

Happy to help :)

@derekperkins

Not sure how much we're wanting to rock the boat, but at a quick glance, a lot of queries use custom JavaScript functions that could be better expressed with some of the newer BigQuery JSON functions.
https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions

@rviscomi
Member Author

@derekperkins yeah that sounds great! That seems like it would make the queries much more readable and may even be a bit faster to run.

If you're up for it as a proof of concept, could you take a couple of last year's queries, copy them over to this year's SQL dir, and open a PR using the newer functions?

DM me on Slack if you need to get set up to use our BigQuery account.

@foxdavidj
Contributor

Hey @SophieBrannon, would you be interested in taking the Chapter Lead role for the SEO chapter?

As the Chapter lead you'd be the primary author and the key person responsible for pulling the entire chapter together. Details on the role and commitment here

We'd love to have you 🎉 🎉

@itamarblauer

Happy to author alongside @SophieBrannon on the SEO chapter 😄

@TusharPol

Happy to contribute as an author for the SEO chapter :)

@foxdavidj
Contributor

Hey @SophieBrannon, friendly ping. Would you be interested in being the Chapter Lead for SEO this year?

See my above comment for more info

@SophieBrannon
Contributor

SophieBrannon commented Apr 30, 2022 via email

@rviscomi rviscomi removed the "help wanted" label May 2, 2022
@patrickstox
Contributor

patrickstox commented May 11, 2022 via email

@SophieBrannon
Contributor

The meeting link was posted above this morning, and also in Slack - https://meet.google.com/aqh-cpyk-zez

I don't have everyone's email addresses, so if they can be shared I can send round a calendar invite. (Sorry, I'm new to GitHub, so if there's a way to get this from here then do let me know!)

@SophieBrannon
Contributor

Just confirming that this is everyone who it would be great to meet at 4pm UK time today using the following link: https://meet.google.com/aqh-cpyk-zez

@itamarblauer @mobeenali97 @dwsmart @patrickstox @johnmurch @TusharPol @derekperkins @csliva @jroakes @MichaelLewittes @foxdavidj

For anyone that can't attend, we can run through the agenda (see link below) and record the call for reference unless there are any objections to this: https://docs.google.com/document/d/17L74ytEJWnmq4aQkZOMMonh9wFOAqu3rWAcCtyUoQ44/edit

If anyone has any questions, please don't hesitate to let me know.

@TusharPol

Hey everyone,

Just finished reading the 2021 version. (I'm a first-time contributor so really excited for the 2022 edition!!)

I have a few thoughts/ideas on new metrics we could introduce this year. For example, the percentage of websites utilizing server-side rendering (SSR) vs. client-side rendering (CSR). I think this will be an interesting angle to cover.

In my experience, the former is preferable to the latter for a variety of SEO-specific reasons. I know there are workarounds for the added SEO complexities that come with CSR websites, and I also know Google is constantly enhancing its JS rendering capabilities. But what about other search engines that are not up to par with Google’s current JS rendering capabilities (cough Bing cough)?

Let me know what you all think :)

@patrickstox
Contributor

As mentioned on the call, I figured it would be useful to share documents from the previous year for inspiration.

2021 Outline/Draft: https://docs.google.com/document/d/14uW8-6F1-AWxf1grr8ExyJs4YBPyMFWs4iGJ2ESZqDI/edit#heading=h.jkvloh50dr7
2021 Data/Spreadsheet: https://docs.google.com/spreadsheets/d/11hw7zg4dpIY8XbQR5bNp5LvwbaQF0TjV0X5cK0ng8Bg/edit#gid=1936997045
2020 Outline/Draft: https://docs.google.com/document/d/1q83OZAEd_oYtqwCQ1qkkpZ52T9BLPwkIMDEhXb--Xv8/edit#

@foxdavidj
Contributor

@SophieBrannon Looks like the outline is coming along nicely! Just wanted to give a heads-up that we are fast approaching the date where any new custom metrics need to be written, tested, and merged into the web crawler (May 27).

Would be a good idea to work with your team on identifying if your chapter has need of new custom metrics so those can be written by the deadline.

@SophieBrannon
Contributor

Thanks @patrickstox @rviscomi @TusharPol for your feedback and comments yesterday, I've made some slight tweaks based on this.

@jroakes @csliva @derekperkins the outline is complete, so over to you for the data. There are a couple of custom metrics in there, so let us know if there's still time to get this data and we can adjust accordingly:
https://docs.google.com/document/d/15udJOrPhwV0yWP8cviwRyapSssCUrLxkX29KbJNJH4Q/edit#

Any questions let me know.

@foxdavidj
Contributor

foxdavidj commented May 17, 2022

@SophieBrannon Since the outline is done, can you check off the #1 checkbox in the top comment? Looking good.

Also, it may be helpful to add another section to the planning doc that specifically states which custom metrics you're looking for, along with any documentation about them, so analysts can quickly grasp what you need. For example, for "invalid <head> elements", it'd be great if you linked to the articles Google has published detailing them.

@foxdavidj
Contributor

@SophieBrannon How are the custom metrics coming along? I want to check whether the analysts have made progress on them, since they need to be written, approved, and merged by EOD Friday.

@SophieBrannon
Contributor

Hey @foxdavidj quick update on the custom metrics in case you haven't seen in Slack:

We have three new custom metrics this year!!! Which is exactly three more than we managed last year.

  1. Valid head: This is a very, very hard one to do based on how Chrome fixes things, but Colt took it on and it is now merged.
  2. Robots meta metric rewrite, including iframe parsing for robots HTML and headers.
  3. Most disallowed/allowed user agents in robots.txt. This was written last year, but this is the first Almanac to use the data.
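
For anyone unfamiliar with what the third metric counts, a rough Python sketch of the idea follows. It is only an illustration under stated assumptions, not the actual custom metric: the function names agents_by_rule and tally are hypothetical, consecutive User-agent lines are treated as one group, and empty Allow/Disallow values are skipped.

```python
from collections import Counter


def agents_by_rule(robots_txt):
    """Return (allowed, disallowed): sets of user agents with at least one such rule."""
    allowed, disallowed = set(), set()
    group = []        # user agents the current rules apply to
    in_rules = False  # True once the current group has seen a rule line
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                group, in_rules = [], False
            group.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if not value:  # assumption: ignore empty rules in this sketch
                continue
            target = allowed if field == "allow" else disallowed
            target.update(group)
    return allowed, disallowed


def tally(robots_files):
    """Count, across many robots.txt bodies, how many files allow or disallow each agent."""
    allow_counts, disallow_counts = Counter(), Counter()
    for text in robots_files:
        allowed, disallowed = agents_by_rule(text)
        allow_counts.update(allowed)
        disallow_counts.update(disallowed)
    return allow_counts, disallow_counts
```

Each agent is counted at most once per file, so disallow_counts.most_common() ranks agents by the number of sites that disallow them rather than by the number of rules.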

@foxdavidj
Contributor

Just saw in Slack. Very exciting :)

@foxdavidj
Contributor

@derekperkins @csliva @jroakes Now that the crawl has started, please create a PR (example) to track the progress of writing the queries needed for the chapter.

Heads up to @SophieBrannon, as you'll likely be needed to confirm what needs to be queried

@foxdavidj
Contributor

@derekperkins @csliva @jroakes Any update on the draft PR for the chapter queries (example)?

@csliva csliva mentioned this issue Jun 20, 2022
51 tasks
@foxdavidj
Contributor

@SophieBrannon @derekperkins @csliva @jroakes The June crawl has completed and all of the data is available to start being queried.

How are you all feeling about having all of the queries written by the end of the month? If there's anything you have questions about, just let me know

@SophieBrannon
Contributor

Just a quick update: we're a little behind on the content at the moment.

@mobeenali97 @dwsmart @patrickstox @johnmurch @TusharPol @MichaelLewittes

I'll let you know as soon as it's ready for reviewing & editing and what a realistic ETA is for it (I'm hoping by EOD Sunday latest).

@rviscomi rviscomi added the "ASAP" label Sep 3, 2022
@SophieBrannon
Contributor

@SophieBrannon
Contributor

@rviscomi can you confirm how many homepages were crawled as part of this project? Is it ~8 million or more?

@rviscomi
Member Author

rviscomi commented Sep 6, 2022

There were 8.4M websites surveyed in this year's dataset. The 2022 Methodology page isn't published yet, but you could plan to link there to avoid having to get the number exactly right, for example: "See the [Methodology](../methodology) page for more info on the sample size."

@SophieBrannon
Contributor

That's perfect, thanks @rviscomi. We've already worked in an internal link to the Methodology page; I just wanted to confirm a rough number to reference. Thanks!

@SophieBrannon
Contributor

Hey @tunetheweb / @rviscomi can we get Itamar's website and Twitter links added to the Contributors section?

https://twitter.com/ItamarBlauer
https://www.itamarblauer.com/

Thanks so much!

@rviscomi
Member Author

@SophieBrannon @itamarblauer feel free to open a PR with any suggested changes. We'll merge and release any updates in batches as needed. Here's the config file with Itamar's info.
