ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

Author: Kristen | Updated: Mar 19, 2025

OpenAI suspects that China's DeepSeek AI models, known for their remarkably low cost, may have been developed using data from OpenAI. DeepSeek's emergence has prompted strong reactions, including a statement from Donald Trump calling it a wake-up call for the U.S. tech industry. Nvidia, whose GPUs are crucial for AI, saw its share price fall 16.86% after DeepSeek's emergence, the largest single-day loss of market value in Wall Street history. Other tech giants, including Microsoft, Meta, Alphabet, and Dell, also saw their stock prices decline.

DeepSeek says its R1 model is a significantly cheaper alternative to Western counterparts such as ChatGPT. R1 is built on the open-source DeepSeek-V3, which reportedly requires substantially less computing power and cost an estimated $6 million to train. Although this claim has been contested, it has unsettled investors and raised questions about the billions that American tech companies are investing in AI. Amid discussion of its effectiveness, DeepSeek surged in popularity and reached the top of the U.S. chart of most downloaded free apps.

Bloomberg reported that OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to integrate OpenAI's AI models into its own. OpenAI said that Chinese companies, among others, are constantly trying to distill the models of leading U.S. AI companies. Distillation, in which one model is trained on the outputs of another, violates OpenAI's terms of service. OpenAI emphasized its commitment to protecting its intellectual property and said it is working with the U.S. government to safeguard advanced AI models.

David Sacks, President Donald Trump's AI czar, said there is substantial evidence that DeepSeek used distillation to draw on OpenAI's models. He expects leading AI companies to take steps to prevent this from happening again.

DeepSeek is accused of using distillation to train its competing model on OpenAI's model. Image credit: Andrey Rudakov/Bloomberg via Getty Images.

The situation is further complicated by the irony of OpenAI's position, given earlier accusations about its own data acquisition practices. Critics have highlighted OpenAI's reliance on vast amounts of internet data, including copyrighted material, to build ChatGPT, and various observers were quick to point out the hypocrisy.

OpenAI previously stated that it is impossible to create AI tools like ChatGPT without copyrighted material. In a submission to the UK's House of Lords, OpenAI argued that the breadth of copyright protection makes it practically impossible to train leading AI models without using copyrighted works. It added that limiting training data to public domain materials would be insufficient for the needs of modern AI.

The use of copyrighted materials to train AI models has become a major point of contention. The New York Times sued OpenAI and Microsoft, alleging unlawful use of its work, and OpenAI countered that its training practices constitute "fair use." That suit followed one filed by 17 authors, including George R. R. Martin, alleging widespread copyright infringement. Adding another layer of complexity, a 2018 U.S. Copyright Office finding, later upheld by a district judge, stated that AI-generated art cannot be copyrighted because it lacks a human creative element.