The co-founder of Anthropic, the AI safety startup behind the Claude model, has delivered a sobering message to the tech community: we must halt the unchecked development of artificial intelligence before it spirals beyond human oversight. Speaking at a closed-door summit in London, the executive painted a picture of a future where autonomous systems could make decisions with irreversible consequences, from financial market crashes to autonomous warfare.
This is not the first warning from the inner circle of AI development. But it carries weight because Anthropic has positioned itself as the ‘ethical’ alternative to giants like OpenAI and Google DeepMind. Its entire business model hinges on building ‘safe’ AI. Yet even the company admits the genie is partially out of the bottle. The co-founder’s plea echoes a growing unease in Silicon Valley: that the race for artificial general intelligence (AGI) is a sprint without a finish line, and we are running blindfolded.
The core issue is what researchers call the ‘control problem’. How do we ensure that a system smarter than any human remains aligned with human values? Current methods, like reinforcement learning from human feedback (RLHF), are band-aids on a bullet wound. They work for today’s narrow AI but break down when systems become sufficiently complex. Imagine trying to explain ‘don’t cause suffering’ to a network that never experiences pain. It’s like teaching colour to a man born blind.
The warning comes amid a regulatory vacuum. The EU’s AI Act is still in draft. The UK’s summit at Bletchley Park produced non-binding agreements. Meanwhile, the US Congress cannot even agree on what AI is. This leaves companies to self-regulate, which is like asking the fox to guard the henhouse. Anthropic’s co-founder suggests a temporary moratorium on training models beyond a certain threshold, akin to the voluntary pause on recombinant DNA research that scientists observed until the 1975 Asilomar Conference agreed on safety protocols.
But the technology has already spread. Competitors like Meta have open-sourced models that anyone can fine-tune. Nations like China are pouring state resources into AI dominance. A voluntary pause would hold back the well-intentioned while the ruthless race ahead. The digital sovereignty of entire nations is at stake. If we lose control of AI, we lose control of our information ecosystems, our economies, and our democratic processes.
The solution lies not in stopping progress but in rethinking our approach. We need new architectures that embed interpretability and verifiability from the ground up. We need global treaties that criminalise the deployment of unaligned systems. And we need a cultural shift in the engineering community: from ‘move fast and break things’ to ‘move deliberately and fix things first’.
The Anthropic co-founder’s warning is a canary in the coal mine. The question is whether we will heed it or ignore it until the mine collapses.









