Infinite Spam Cannons And Their Antidotes

The advent of AI-generated content has created a new challenge for search engines: organizing the infinite new glut of information in a useful way. I discuss the need for us content producers to find ways to stand out from the 'infinite spam cannon' and offer solutions for optimizing our content to be recognized and rewarded by search engines.

By Michael Levin

Sunday, May 7, 2023

I’ve had difficulty producing one particular deliverable for work because of the deep thought-work that had to go into supporting it. But now I have. I’m what you would call long in the tooth in the field of SEO. If you don’t know the expression long in the tooth, it comes from horses. As horses age, their gums recede, making their teeth look longer. Or was that rabbits who have teeth that grow continuously?

And so it is with this long-in-the-tooth view that I look at the long-tail strategies of yore in SEO. Hold on for a wacky few paragraphs as I delve into the new world realities. We start out with something called an O-scale problem. Or more specifically, O to the power of O. Follow along!

Whether or not cosmic inflation was real in the birth of our universe, the concept of exponentially growing “O-scale” problems is a thing. You’ve heard of linear vs. exponential? Well, exponential isn’t anything compared to an exponential raised to the power of an exponential. The hockey-stick-shaped curve you get from plotting, say, the traveling salesperson problem of every traveling salesperson is… uh, quite big. It’s not infinity, but it’s a pretty big haystack in which to find a needle, and that’s Google’s new problem:
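To put rough numbers on that, here is a minimal Python sketch comparing linear, exponential and exponential-of-an-exponential growth. Reading "O to the power of O" as 2 raised to the power of 2^n is my own assumption for illustration, and the function names are made up for this example.

```python
# Toy comparison of growth rates. Base 2 and the function names are
# illustrative assumptions, not a formal complexity analysis.

def linear(n: int) -> int:
    return n

def exponential(n: int) -> int:
    return 2 ** n

def double_exponential(n: int) -> int:
    # An exponential raised to the power of an exponential: 2^(2^n)
    return 2 ** (2 ** n)

def growth_table(max_n: int = 5) -> list[tuple[int, int, int, int]]:
    """Rows of (n, linear, exponential, double exponential) for n = 1..max_n."""
    return [
        (n, linear(n), exponential(n), double_exponential(n))
        for n in range(1, max_n + 1)
    ]

for n, lin, exp, dexp in growth_table():
    print(f"n={n}  linear={lin}  exponential={exp}  double_exponential={dexp}")
```

By n=5 the exponential column has only reached 32 while the double exponential is already past four billion, which is the kind of haystack I'm talking about.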

Organizing all the world’s information so that it is useful, satisfying and “makes happy” the user. Otherwise, you, the user, will not continue to be the product Google serves up to its actual customers, the advertisers. You must remain happy for Google to get and keep those customers, who are the lifeblood, main mission and indeed actual purpose of any organization, for without them it ceases to exist.

Starting with the GPT-3 API made available to developers in 2020, the next-generation spam cannon became possible. If it wasn’t clear then, it certainly was by 2022 with generative AI tools like Midjourney, ChatGPT and whatnot. This amounts to a new exponential O-scale problem that breaks the original Google problem statement. Few are acknowledging how this breaks the dumb old crawl-and-index model. It would simply be too stressful and expensive to crawl and index it all.

The arms race has changed, and it’s up to us content producers to figure out how not merely to survive in this new landscape, but to be findable at all. I mean, how do you not completely disappear when a script kiddie less expert than you can appear 1000x more expert than you just by prompting an AI to make it look that way? You’re toast.

Google is encountering practically infinite new content overnight and every night. The new tech can keep 2 virtual philosophers who don’t really exist talking ad nauseam to each other on a website with a steady flow of conversation modeled on smart folks, with the output being passable enough to make sense and fool folks. https://infiniteconversation.com/

Even non-AI-generated content like this, which I’m typing into a SimpleNote that isn’t even Copilot-assisted, can’t find its audience anymore simply by virtue of existing and genuinely being the best match for people interested in such content. Those days are over, because of the signal-to-noise ratio. I’m in competition with so much noise that the signal is unlikely to ever be recognized on its merits, offered up, and over time promoted and stabilized to receive the share of searcher attention that could/should be coming its way.
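The signal-to-noise point is simple arithmetic, and a tiny Python sketch makes it concrete. The page counts below are invented purely for illustration; they are not measurements of the real web, and `signal_share` is a hypothetical helper name.

```python
# Share of "signal" pages in an index as generated "noise" pages pile up.
# All numbers here are hypothetical, chosen only to show the trend.

def signal_share(signal_pages: int, noise_pages: int) -> float:
    """Fraction of all pages that are genuine signal."""
    total = signal_pages + noise_pages
    if total == 0:
        raise ValueError("need at least one page")
    return signal_pages / total

genuine = 1_000  # assumed fixed pool of genuinely original pages
for generated in (0, 1_000, 1_000_000, 1_000_000_000):
    share = signal_share(genuine, generated)
    print(f"{generated:>13,} generated pages -> signal share {share:.6%}")
```

The genuine content never got worse in this toy model. It just became a vanishing fraction of everything competing for the same attention.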

The world is as different now, after the arrival of the infinite spam cannon of content, as it was 25 years ago before the rise of AltaVista and Google, when crawl-and-index worked. There are plenty of bandaids that can be applied as the infinite spam cannon goes through its power-on warmup phase. That’s what we’re experiencing now. But give it 2 more years or so and it’s going to be out of hand. So what are Google, Microsoft, Baidu and every other search engine company to do? And what does it mean for each of us, the individual users of search who need and rely upon it every day?

Well, the first thing is to make your own version of the infinite spam cannon better than anyone else’s. The main problem with ChatGPT and Midjourney is that they are trained on the same base of content for everyone using them. It doesn’t matter what your prompting skills are, the possibilities of output are the same for you as for anyone else using the system. The range of all possible outputs from a system is called its phase space. The phase space for you using ChatGPT or Midjourney is the same phase space as for anyone else using those systems, no matter your prompts.

So? Change your phase space! Train your infinite spam cannon generator on paywalled proprietary data that nobody else has access to. Your own creative output might be a good start, but then you need to have the genuine skills in the first place, and nobody using the infinite spam cannon wants that. They’re all the same shortcut takers and cheaters who used doorway page generators in the early days of SEO, and who already had the ability to infinitely “remix” source content into hundreds of thousands or millions of low-quality pages.
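Here is a toy Python sketch of the phase-space argument. It uses a tiny bigram (word-follows-word) model, which is nothing like how ChatGPT or Midjourney actually work; the corpora, the function names and the seed word are all assumptions made up for this illustration. The point it shows is only that a generator can never emit what its training data does not contain, so adding data nobody else has enlarges the set of reachable outputs.

```python
from collections import defaultdict

def train_bigrams(corpus: list[str]) -> dict[str, set[str]]:
    """Map each word to the set of words seen directly after it."""
    model: dict[str, set[str]] = defaultdict(set)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            model[current].add(following)
    return dict(model)

def reachable_words(model: dict[str, set[str]], start: str) -> set[str]:
    """Every word the model could ever emit when seeded with `start`."""
    seen = {start}
    frontier = [start]
    while frontier:
        word = frontier.pop()
        for nxt in model.get(word, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Hypothetical corpora: SHARED stands in for the public web everyone trains on,
# PRIVATE stands in for proprietary data only you have.
SHARED = ["search engines crawl pages", "search engines rank pages"]
PRIVATE = ["search logs reveal intent"]

shared_space = reachable_words(train_bigrams(SHARED), "search")
own_space = reachable_words(train_bigrams(SHARED + PRIVATE), "search")

# Words reachable only because of the private data.
print(sorted(own_space - shared_space))
```

No amount of clever seeding gets the shared model to the words that only appear in the private corpus, which is the toy version of why prompting alone can't change your phase space.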

You need not create an infinite spam cannon of your own, but whatever you do needs to be able to cut through it like a knife. Of all the business problems you’re solving, foremost among them must always be: why will your content not get lost in the noise? You are in the infinite spam cannon game, like it or not, if the reason for generating content is for it to find an audience. That means either exchanges between human beings, in the style of link-trading, social media and word of mouth, or it means search. If there’s some 3rd way other than publishing in a way where things can be found (search) or being explicitly told by someone you know and trust (referral), let me know.

So if you’re not in the spam cannon-making game, you’re contending with it. And contending with it through better and better prompting within the same phase space has diminishing returns, because that’s precisely what everyone else is doing.

You must expand your phase space to be different from theirs. You must train your own models with content-in that is qualitatively different from everyone else’s, and that means going beyond the free web crawl that OpenAI did in 2021 and still leans on. It means training on paid-for, proprietary or otherwise original content. And it still means publishing that content in a traditionally SEO-optimized fashion, so that whatever digests it can find the originality and newness that will come to characterize high E-E-A-T, or whatever amorphous marketing acronym Google uses next.

In summary, use the same generative transformer models as your competitors to optimize your own unique content. Do what you can to produce your content at sufficient scale so that the generative trick of GPT actually works for you as well as it does on the common pre-trained models out there like ChatGPT. This might mean acquiring exclusive training rights to large paywalled content collections or other data that will improve and differentiate output. This improved output must still be search-optimized and made available as input to Google and other search systems, which should be intelligent enough to recognize and reward this fact.