Luke Davis


Some thoughts on Seirdy's takeaways from the Google content doc leak

Filed under: SEO | tech

While looking through a webring that one of my other sites is on, I noticed Seirdy’s name. I’m familiar with their work, having featured their review of search engines on LOGiCFACE in 2023. So I had another look at their site and noticed an article on the Google Content Warehouse API documentation leak.

If you don’t know or don’t remember, there was an accidental leak of internal documents last year that revealed 14,000+ of Google’s search ranking factors, along with a lot of API references and variable names, and it sent the SEO industry into a foaming frenzy (more than usual, anyway). And yes, I’m a year late, but some people are probably still referring back to this, especially with recent Google lawsuits that could affect the search landscape.

Thoughts on thoughts on facts

Seirdy did a good job of going through some of the key takeaways and quelling the hysteria generated around it all. I’ll be honest: despite having been an SEO for 6 years next month (and a technical SEO for 3½), I tried to steer clear of all the blog posts and hot takes on the leak. I didn’t even read the Mike King post that Seirdy referred to when it was doing the rounds, and that’s because I got really tired of having to engage with industry thinkpieces.

They tend to follow a script of “you need to throw away everything you know about this and start focusing on that, otherwise you/your clients are dead in the dirt”, or general snark against others. Everyone’s making big money, either for themselves or for multi-billion dollar corporations that don’t care about what any of us say unless it affects their profit margins. I don’t have space for feelings of betrayal towards Google employees just because they didn’t divulge information about search algorithms.

It’s not that deep

But back to the point of this. I agree with Seirdy that this isn’t as significant as some SEOs claim. We don’t know nearly enough to make any statistically significant judgements and, when this was hot off the press, we certainly couldn’t have reworked client strategies based on a leak. There are some interesting variable names, and connotations for what they could mean if used in production, but that’s it. Even if Google reps did “lie” about a domain authority metric or the use of click data, so what? If they had said “can’t say, sorry”, I’m sure that would have been spun for engagement, just as much as “no, we don’t use that” is when people run tests and suggest the opposite.

And it comes down to something else that Seirdy said that I agree with:

I still despise how the SEO industry and Google have started an arms race to incentivize making websites worse for actual users, selecting against small independent websites. I do maintain that we can carve out a non-toxic sliver of SEO: “search engine compatibility”. Few features uniquely belong in search engine, browser, reading mode, feed reader, social media link-preview, etc. compatibility. If you specifically ignore search engine compatibility but target everything else, you’ll implement it regardless. I call this principle “agent optimization”. I prefer the idea of optimizing for generic agents to optimizing for search engines, let alone one (1) search engine, in isolation. Naturally, user-agents (including browsers) come first; nothing should have significant conflict with them.

Search engine something?

The term “search engine optimisation” has felt off for a while now, and none of the alternatives work for me either. It’s naive to think that “helpful content” will always get you ranking highest. Do users take the time to read below-the-fold content on every e-commerce category page? Or is it there so search engines can understand the page and rank it accordingly? We all know what’s going on, but perhaps we’re the liars here. We should focus less on gaming systems to help companies get richer and more on things like user experience: making a site easy to navigate, improving session quality, reducing friction, and making those journeys worthwhile. If it sounds like I’m biased towards web performance, that’s because I am. What I won’t waste time on is one API variable amongst 14,000 others from a leaked document just because it’s called RankSiteHigherIfDestroyedStillTrueTimesInfinityPlusOne.
