Some thoughts on search engines
Filed under: AI | SEO | the Internet | tech
I wrote some thoughts on Bluesky about search engines and why we can’t build a better one than Google without eclipsing its power and magnitude, and I decided to elaborate on those points here:
People keep saying “someone needs to build a search engine that gives you what you’re looking for” and forget/don’t know that to get the best results, you’d need an index and crawlers as good as Google’s.
There will be trade-offs no matter which way you look, and as soon as you bring AI into the mix, you’re basically doing what Google does (though arguably with different motives).
It’s also worth noting that Google has been using AI in search for a very long time—it’s just not been generative until recently. And AI isn’t the root of the problem here—greed and profit above anything is, which is why we have all these layoffs, as opposed to “AI taking your job” (which it isn’t).
I think we also underestimate the size of the Web we’re wanting to crawl, index, and return results for. Common Crawl, the data archive used by most popular large language models, is only a portion of the Web, and not a very big one, despite how widely it’s used. But people treat it as if it’s “the Web”, or a very good representation of it, good enough to use as a search engine via chatbots and AI software. Big mistake. That false use case is part of why we get so much misinformation from it, which is unfortunately finding its way onto actual search engines in the form of terrible AI-generated articles written for the sake of SEO.
It’s taken Google decades to get to this level of power. You’re not going to get anyone to build bigger and better any time soon.
Update - 23rd Dec 2024: I had some more thoughts!
As people flock to other search engines and extol their virtues, it’s important to understand the trade-offs. There’s no doubt that Google Search has fallen off. You get your viewport filled with ads, knowledge panels, and AI Overviews. And when you find the organic links, the quality is questionable. Is it AI, is it a list of sponsored ads masquerading as an “ultimate” guide (written by AI), or is it Reddit? Either way, you probably won’t find what you’re looking for, or you’ll think you’ve found it and then discover that the information is incorrect.
There’s a lot of misinformation out there, and I think it’s less that Google doesn’t know how to deal with it and more that they’re not that bothered about dealing with it as long as people keep using their platform and the ad revenue keeps flowing in.
So with all that, you decide to use a different search engine. Bing is OpenAI-pilled so if AI turned you off, that’s a no.
Or perhaps you want to try Perplexity… oh wait, that’s AI too and they plagiarise.
Or SearchGPT… whoops, AI again.
How about the smaller search engines like DuckDuckGo, Ecosia, Brave Search, Mojeek, Yep, Kagi? Now, some of those do use AI (even though they didn’t at the start) but they’re free from the clutches of Google.
But there’s the rub. If they rely on their own index, it’s going to be minuscule compared to Google’s, and with a smaller index, your usual queries won’t generate the results you expect. Where you might expect a Wikipedia page and the brand’s website after searching for a brand, you might get a less relevant blog post and a listicle. That’s down to the index, the search algorithms, and how they rank results by relevance.
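To make that concrete, here’s a toy sketch of why index coverage matters. It is not how Google or any real search engine ranks pages: the scoring is plain term counting, and the brand (“acme”), the page names, and the page text are all made up for illustration. The only point is that the same query and the same ranking code give worse results when the crawler never reached the best pages.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, query, limit=2):
    """Rank documents by summed term frequency (ties broken by doc id)."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:limit]]


# Hypothetical pages about a made-up brand called "acme".
big_crawl = {
    "acme.example (brand site)": "acme official site acme products acme support",
    "wiki.example/Acme": "acme is a company acme was founded acme history",
    "blog.example/acme-review": "my review of acme",
    "listicle.example/top-10": "top 10 brands like acme",
}

# Assumption: a smaller crawler that never reached the brand site or the wiki page.
small_crawl = {
    doc_id: text
    for doc_id, text in big_crawl.items()
    if doc_id.startswith(("blog", "listicle"))
}

big_results = search(build_index(big_crawl), "acme")
small_results = search(build_index(small_crawl), "acme")

print(big_results)
print(small_results)
```

With the bigger crawl, the brand site and the wiki page come out on top. With the smaller one, the best the engine can offer is the blog post and the listicle, because you can’t rank what you never indexed.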
Suddenly, you find yourself in the 90s where SEO involved keyword stuffing and handcrafted web directories ruled the roost. It’s nice for the nostalgia factor but frustrating when you need a proper answer to a query.
I haven’t used Kagi but I’ve heard good things. The only downside is the subscription. I think it’s a good idea, but it’s a new commitment for people who’ve spent decades using Google for free, and that’ll be a barrier to entry (which again, I don’t disagree with!).
I’ve written all of this down because a lot of people come with the notion that “if you don’t like Google, you should simply use a different search engine” without understanding the trade-offs or the compromises you have to make.