Some brief thoughts on search engines

I wrote some thoughts on Bluesky about search engines, and about why we can’t build a better search engine than Google without eclipsing its power and magnitude. I decided to elaborate on those points here:


People keep saying “someone needs to build a search engine that gives you what you’re looking for” and forget, or don’t know, that to get the best results you’d need an index and crawlers as good as Google’s.

There will be trade-offs no matter which way you look, and as soon as you bring AI into the mix, you’re basically doing what Google does (though arguably with different motives).

It’s also worth noting that Google has been using AI in search for a very long time; it’s just not been generative until recently. And AI isn’t the root of the problem here. Greed and putting profit above everything else are, which is why we have all these layoffs, as opposed to “AI taking your job” (which it isn’t).

I think we also underestimate the size of the Web we want to crawl, index, and return results for. Common Crawl, the data archive used by most popular large language models, is only a portion of the Web, and not a very big one despite how widely it is used. But people treat it as if it’s “the Web”, or a very good representation of it, good enough to use as a search engine via chatbots and AI software. Big mistake. That misuse is part of why we get so much misinformation from it, which is unfortunately finding its way onto actual search engines in the form of terrible AI-generated articles written for the sake of SEO.

It’s taken Google decades to get to this level of power. You’re not going to get anyone to build something bigger and better any time soon.
