What Anthropic Is Alleging Against Alibaba
In one of the most consequential IP disputes to emerge from the global AI arms race, Anthropic has publicly accused Alibaba of conducting what it describes as "industrial-scale AI extraction" — systematically scraping, reverse-engineering, or otherwise harvesting outputs from Anthropic's Claude models to train competing systems. The allegations, which surfaced in reporting by Cybernews, arrive at a particularly charged moment: China has simultaneously unveiled a new large language model described as a direct rival to Claude, reportedly dubbed Mythos.
For developers, IT decision-makers, and policy professionals watching the AI landscape, the timing and nature of these claims carry significant weight. This is not simply a corporate dispute. It cuts to the heart of how proprietary AI systems can be protected in an era of API access, prompt injection, and model distillation — and it raises uncomfortable questions about whether existing legal frameworks are remotely adequate to address AI-native forms of intellectual property theft.
Anthropic has not disclosed the full technical details of its allegations publicly, but the language used — "industrial-scale" — suggests a sustained, systematic operation rather than opportunistic scraping. According to reporting from Reuters Technology, AI companies have grown increasingly concerned about the use of model outputs to bootstrap competing systems, a technique sometimes called "model distillation" or "knowledge distillation," which can dramatically accelerate the development of rival models by leveraging the capabilities already embedded in frontier systems.
How Industrial-Scale AI Extraction Actually Works — and Why It's So Hard to Stop

To understand why this accusation is so significant, it helps to understand the mechanics of what Anthropic is alleging. Model distillation, which uses the outputs of a large, high-quality model to train a smaller or competing model, is a well-documented technique in machine learning research. At scale, a well-resourced actor can query an API millions of times across diverse prompts, collect the responses, and use the resulting synthetic dataset to fine-tune or even pre-train a rival model. The result can be a system that approximates the capabilities of the original at a fraction of the true development cost.
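The query, collect, and train loop described above can be illustrated with a deliberately tiny toy. This is a sketch under stated assumptions, not a description of any real system: `teacher` is a hypothetical stand-in for a model behind an API, the "student" is a simple lookup table rather than a neural network, and the last-digit feature is chosen only so that the toy is learnable.

```python
import random
from collections import Counter, defaultdict


def teacher(x: int) -> str:
    """Hypothetical black box standing in for a proprietary model behind an API."""
    return "high" if (x * 7 + 3) % 10 >= 5 else "low"


def collect(n_queries: int, rng: random.Random) -> list[tuple[int, str]]:
    """Step 1: query the teacher on varied inputs and record its responses."""
    prompts = [rng.randrange(1_000_000) for _ in range(n_queries)]
    return [(p, teacher(p)) for p in prompts]


def fit_student(pairs: list[tuple[int, str]]) -> dict[int, str]:
    """Step 2: fit a student on the synthetic dataset.

    The student here is just the majority teacher label per last digit.
    """
    votes: dict[int, Counter] = defaultdict(Counter)
    for x, label in pairs:
        votes[x % 10][label] += 1
    return {digit: counts.most_common(1)[0][0] for digit, counts in votes.items()}


def agreement(student: dict[int, str], rng: random.Random, n: int = 1000) -> float:
    """Step 3: fraction of fresh inputs on which the student matches the teacher."""
    xs = [rng.randrange(1_000_000) for _ in range(n)]
    return sum(student.get(x % 10) == teacher(x) for x in xs) / n


if __name__ == "__main__":
    rng = random.Random(0)
    student = fit_student(collect(500, rng))
    print(f"student/teacher agreement: {agreement(student, rng):.2f}")
```

The point of the toy is only that the student never sees the teacher's internals: it is fitted entirely on recorded input and output pairs, which is why the technique works through an ordinary API.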
As research published on arXiv has demonstrated, distillation attacks on proprietary models are technically feasible even through standard API endpoints. The challenge for model providers is that while they can detect anomalous querying patterns — unusual volumes, systematic prompt structures, repetitive queries — definitively proving that the outputs were used for competitive model training is far more difficult in a legal context.
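The three signals mentioned above (unusual volumes, systematic prompt structures, repetitive queries) can be sketched as simple per-client heuristics over a query log. Everything here is an illustrative assumption: the `Query` record, the `flag_clients` function, and every threshold are hypothetical, and nothing is known publicly about how Anthropic or any other provider actually detects this behavior.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Query:
    client_id: str
    timestamp: float  # seconds since an arbitrary epoch
    prompt: str


def template_of(prompt: str) -> str:
    """Collapse quoted strings and numbers so templated prompts share one key."""
    text = re.sub(r'"[^"]*"', "<S>", prompt.lower())
    text = re.sub(r"\d+", "<N>", text)
    return " ".join(text.split())


def flag_clients(queries, max_per_hour=1000, max_template_share=0.5,
                 max_duplicate_share=0.3, min_queries=20):
    """Return {client_id: [reasons]} for clients that trip any heuristic.

    All thresholds are placeholder values chosen for illustration.
    """
    by_client = defaultdict(list)
    for q in queries:
        by_client[q.client_id].append(q)

    flags = {}
    for client, qs in by_client.items():
        reasons = []
        # Signal 1: volume, measured as the busiest one-hour bucket.
        per_hour = Counter(int(q.timestamp // 3600) for q in qs)
        if max(per_hour.values()) > max_per_hour:
            reasons.append("unusual volume")
        if len(qs) >= min_queries:
            # Signal 2: one prompt template dominates the client's traffic.
            templates = Counter(template_of(q.prompt) for q in qs)
            if templates.most_common(1)[0][1] / len(qs) > max_template_share:
                reasons.append("systematic prompt structure")
            # Signal 3: a large share of prompts are exact repeats.
            exact = Counter(q.prompt for q in qs)
            repeats = sum(n - 1 for n in exact.values())
            if repeats / len(qs) > max_duplicate_share:
                reasons.append("repetitive queries")
        if reasons:
            flags[client] = reasons
    return flags
```

Heuristics like these can surface suspicious traffic, but they say nothing about what happened to the responses afterwards, which is exactly the evidentiary gap the paragraph above describes.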
Anthropic's challenge is compounded by the fact that Alibaba, as a Chinese technology conglomerate, operates largely outside US legal jurisdiction. Even if Anthropic could establish a clear causal chain between its outputs and Alibaba's Mythos model, the enforcement mechanisms available to a US company against a Chinese competitor are limited — particularly in the current geopolitical climate.
"The question is no longer whether model distillation at scale constitutes intellectual property theft — the question is whether any existing legal framework can actually catch up with the speed at which this is happening."
— AI policy analyst commenting on the Anthropic-Alibaba dispute
Inside Mythos: What We Know About China's New Claude Rival
The simultaneous announcement of Mythos — a large language model positioned explicitly as a competitor to Western frontier AI systems — adds a provocative layer to the dispute. While technical benchmarks for Mythos have not been independently verified at the time of writing, the model's unveiling aligns with a broader pattern of Chinese AI development that has accelerated markedly since the international response to DeepSeek's R1 model earlier this year sent shockwaves through Western AI research communities.
According to Wired's AI coverage, China's top technology firms — Alibaba, Baidu, Huawei, and ByteDance — have all dramatically increased investments in foundation model development, driven partly by US chip export controls that have restricted access to Nvidia's most advanced GPUs. This hardware pressure has, counterintuitively, accelerated Chinese investment in model efficiency and alternative training methodologies — precisely the kind of environment where knowledge distillation from competitors becomes strategically attractive.
For European and global enterprises evaluating AI vendors, Mythos raises immediate questions about provenance and trust. If a model has been trained — in whole or in part — on extracted outputs from another system, what does that mean for the integrity of its outputs? What liability does an enterprise assume by deploying a system whose training lineage is disputed? These are not abstract concerns for privacy professionals and IT decision-makers; they are live due diligence questions.
| Model | Developer | Origin | Key Claim | Regulatory Status |
|---|---|---|---|---|
| Claude | Anthropic | USA | Safety-first frontier LLM | Subject to US AI EO; EU AI Act review |
| Mythos | Alibaba | China | Claude-class capabilities | Subject to Chinese AI regulations |
| Qwen | Alibaba | China | Open-weight multilingual model | Partial open source release |
| GPT-4o | OpenAI | USA | Multimodal flagship model | Subject to US AI EO; EU AI Act review |
| Mistral Large | Mistral AI | France/EU | European sovereign alternative | EU AI Act compliant pathway |
Why the US-China AI Rivalry Now Has a Legal and Regulatory Dimension

The Anthropic-Alibaba confrontation is best understood not as an isolated corporate dispute but as a flashpoint in what is increasingly a state-level contest over AI supremacy. The US government has imposed sweeping export controls on advanced semiconductors destined for China, and the Chinese government has responded with both indigenous chip development programs and a strategic push to build competitive AI models by any means available. In this context, Anthropic's accusations, if substantiated, would represent one of the first documented instances of a major Chinese technology company systematically harvesting a US AI company's proprietary model outputs.
For European stakeholders in particular, this dispute underscores the strategic value of AI sovereignty. The EU AI Act, which entered into force in 2024 and is being phased in through 2027, does not currently contain specific provisions addressing model distillation attacks or the extraction of proprietary training data through API abuse. This is a regulatory gap that policymakers, legal teams, and standards bodies will need to address urgently, as the technique is not unique to state-sponsored actors — it is accessible to any well-resourced commercial competitor.
According to MIT Technology Review, the broader question of AI intellectual property is one of the most contested frontiers in tech law. Courts in the US are still working through foundational cases about whether training a model on copyrighted data can constitute copyright infringement; whether model outputs are themselves protectable intellectual property is a harder and even less settled question. Without clearer frameworks, companies like Anthropic are essentially trying to protect a new class of asset with legal tools designed for an older world.
Frontier AI Model Development — Estimated Investment Scale