A furious row has erupted between US-based AI firm Anthropic and Chinese tech giant Alibaba over allegations of unauthorised data extraction from Anthropic’s language models. The dispute, which threatens to undermine cross-border AI collaboration, has prompted the UK’s Digital Minister, who fears a “digital sovereignty” crisis, to call publicly for intervention.
The conflict began when Anthropic accused Alibaba of systematically scraping proprietary training data from its Claude models, a move it claims violates both intellectual property rights and data protection laws. Alibaba has denied the accusations, countering that its own large language model (LLM) development independently achieved similar benchmark results. The company’s spokespeople have not yet provided evidence to support that claim, leaving industry observers sceptical.
This is more than a he-said-she-said dispute. At its centre is a fundamental question: how do we maintain trust in AI development? The technology relies heavily on vast datasets, often scraped from the public internet, and the line between legitimate research and theft has never been thinner. Anthropic has released data suggesting that Alibaba’s server logs show repeated access to Anthropic’s API in a pattern consistent with bulk extraction. The company warns that such actions could lead to catastrophic model mirroring, where a rival replicates core capabilities without the costly safety testing.
Alibaba has responded with its own technical analysis, arguing that the traffic in question came from automated red-teaming exercises designed to test model robustness. But analysts note that this explanation is unusual given that such tests are typically conducted offline or with permission. The silence from Beijing has only amplified tensions.
UK Digital Minister Baroness Jones has now stepped in, calling for an urgent meeting with both companies and the OECD’s AI Governance Working Group. In a statement, she stressed that “the era of AI extraction without consent must end. The UK will not become a battleground for corporate espionage in the digital realm.” Her intervention reflects growing concern in Whitehall that the UK’s status as a neutral AI hub could be compromised. The government is reportedly considering new legal frameworks to regulate cross-border AI data flows, including mandatory disclosure of extraction activities and real-time monitoring of API usage.
Tech watchers in London note that this case highlights the tension between the open science ethos of AI research and the commercial realities of a fiercely competitive market. The race to achieve artificial general intelligence has created incentives for shortcuts. But as the economist Carl Frey argued recently, such behaviour risks triggering a “race to the bottom” where safety and fairness are sacrificed for speed.
For the average user, this dispute might seem abstract. But its consequences are direct. If AI models are trained on stolen data, their reliability suffers. We could see unexpected biases or errors if training data is not carefully curated. Worse, if companies retreat behind firewalls, we lose the benefits of open collaboration that have driven progress in healthcare, climate modelling, and education.
What happens next? The Digital Minister has given both companies 14 days to produce evidence for their claims. If no resolution is reached, she will ask the Competition and Markets Authority to investigate. The implications for the sector are huge. A precedent of aggressive litigation could stifle innovation, while inaction would legitimise a culture of taking without consent.
One thing is clear: the existing norms of AI development are no longer adequate. We need a new social contract for AI data, one that respects property while enabling progress. The UK has a chance to lead here, not by picking sides, but by establishing principles that protect the integrity of the technology we all depend on.







