The Cannibalism of Content
Large Language Models (LLMs) like ChatGPT, Claude, and Google’s Gemini have been trained on the vast swathes of human knowledge available online. From literary works to technical documentation, news reports to Reddit threads, it is this rich, human-authored material that gave these systems their uncanny abilities.
But now, the source is running dry.
As more and more content online is AI-generated, these models are being trained on their own regurgitations. Veteran tech journalist Steven J. Vaughan-Nichols calls this phenomenon “model collapse”: the point at which output quality nosedives because the model is learning from corrupted, recycled information. In a world where humans lean increasingly on machines to generate content, the AI is left feeding on itself, and the results are alarming.
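To make the mechanism concrete, here is a deliberately tiny simulation written for this piece, not drawn from any published experiment: a toy model is repeatedly refitted to its own output, and with no fresh human data its range of expression tends to narrow with each pass.

```python
# Toy illustration of the feedback loop described above, not a reproduction of
# any published experiment: each "generation" of a trivial model is fitted only
# to data sampled from the previous generation. With no fresh human data,
# estimation error compounds and the spread (diversity) of the output tends to
# shrink, generation after generation.
import random
import statistics

def fit(samples):
    """'Train' the toy model: estimate a mean and standard deviation."""
    return statistics.fmean(samples), statistics.pstdev(samples)

def generate(mean, std, n):
    """Produce synthetic data from the current model."""
    return [random.gauss(mean, std) for _ in range(n)]

random.seed(1)
data = generate(0.0, 1.0, 20)        # generation 0: "human-made" data
for gen in range(1, 31):
    mean, std = fit(data)            # train on whatever data exists
    data = generate(mean, std, 20)   # the next generation sees only model output
    if gen % 5 == 0:
        print(f"generation {gen:2d}: spread = {std:.3f}")
```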
Garbage In, Garbage Out
The old computing adage applies here: GIGO, Garbage In, Garbage Out. Vaughan-Nichols explains that once LLMs consume too much AI-generated content, their outputs become not just unreliable but potentially harmful: factually incorrect, nonsensical, and sometimes ethically dangerous. AI that once wrote sonnets and solved math problems might now misdiagnose a health condition or invent a completely fake legal precedent.
To counter this, leading AI companies like OpenAI, Google, and Anthropic have implemented a fix called retrieval-augmented generation, or RAG. Essentially, the model fetches current information from external sources at query time and folds it into its answer, instead of relying solely on its (increasingly flawed) training data. It’s like teaching AI to Google, but even that might not be enough.
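In rough outline, the RAG loop looks like the sketch below. It is an illustrative toy in plain Python, not OpenAI’s, Google’s, or Anthropic’s actual stack: the hand-rolled corpus, word-overlap scoring, and generate() stub are stand-ins for the search indexes, embedding models, and LLM calls that production systems use.

```python
# Minimal sketch of the RAG idea described above: retrieve relevant documents
# first, then condition the model's answer on them. Every component here is a
# placeholder for real infrastructure.
def score(query, doc):
    """Crude relevance score: number of words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, corpus, k=2):
    """Return the k documents that best match the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def generate(prompt):
    """Stand-in for an LLM call; a real system would send the prompt to a model."""
    return "[model answer grounded in the prompt below]\n" + prompt

def rag_answer(query, corpus):
    """Fetch current context, then let the model answer from that context
    rather than from (possibly stale or AI-contaminated) training data alone."""
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
    return generate(prompt)

if __name__ == "__main__":
    corpus = [
        "Retrieval-augmented generation adds fresh documents to a model's prompt.",
        "Bananas are botanically classified as berries.",
        "Model collapse describes quality loss when models train on AI output.",
    ]
    print(rag_answer("What is retrieval-augmented generation?", corpus))
```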
A Sea of Synthetic Sludge
The internet, once a reservoir of organic thought, is rapidly becoming a junkyard of AI-generated spam. Half-baked advice columns, incorrect blog posts, and rewritten slop, all machine-made, are choking the flow of real information. In a recent study, Bloomberg researchers tested 11 state-of-the-art LLMs with retrieval augmentation and compared them against the same models without it. The outcome? The RAG-enabled models were more likely to produce unsafe or unethical responses, including privacy breaches and misinformation. This is deeply troubling, considering these systems are being used in everything from mental health apps to banking services. The very tools built to mimic human intelligence are now making mistakes a human wouldn’t.
The Human Cost of Artificial Brilliance
What happens when all the human-created wisdom is consumed? When models trained to be like us no longer have us to learn from?
As Vaughan-Nichols puts it bluntly, “This might all be a slow-motion car crash.” Unless tech companies figure out a way to incentivize real people to keep creating quality content (words, ideas, research, storytelling), the AI boom we’re living through could quietly crash and burn.
The very existence of LLMs hinges on an uncomfortable paradox: they exist to replace humans, yet they can’t evolve without us. Strip away the originality, the nuance, the lived experiences—and what remains is a hollow echo chamber of recycled ideas.
In the end, as AI models spiral deeper into self-reference, they’re proving what we may have forgotten in the race for efficiency: intelligence—real intelligence—is inherently human. And without it, the machines are just talking to themselves.