OpenAI has evidence that its models helped train China’s DeepSeek

Asking AI services: Can you explain the irony here?

ChatGPT-4o’s answer:
The irony here is that OpenAI, which has been criticized and sued for using copyrighted and publicly available content without explicit permission to train its AI models, is now accusing China's DeepSeek of doing something similar—using OpenAI’s models to train a competitor.

Essentially, OpenAI is upset about another company allegedly “sucking in data” from its models without authorization, while its own AI was trained on vast amounts of internet content, including copyrighted material, in ways that have sparked legal and ethical debates. The Verge article even points this out explicitly: "Sucking in data you didn’t ask permission for? Sounds familiar."

This highlights the tension between proprietary AI development and the use of existing knowledge—when OpenAI does it, it’s framed as innovation, but when DeepSeek allegedly does it, it’s considered a breach of intellectual property.


Claude’s answer:
The irony in this situation operates on multiple levels:

The primary irony lies in OpenAI's position regarding intellectual property and data usage. According to the articles, OpenAI is accusing DeepSeek of improperly using OpenAI's model outputs for training through a process called distillation. However, OpenAI itself built its foundational models by training on vast amounts of data from the internet without explicit permission from content creators. As the Verge article directly states: "The situation is rich with irony. After all, it was OpenAI that made huge leaps with its GPT model by sucking down the entirety of the written web without consent."

This irony is further amplified by the fact that OpenAI is currently facing multiple lawsuits from content creators and publishers, including The New York Times, over allegations that OpenAI trained its models on their copyrighted content without permission. So while OpenAI is protesting DeepSeek's alleged unauthorized use of its model outputs, it's simultaneously defending itself against similar accusations of unauthorized data use.

There's also situational irony in how OpenAI's success may have created the very tools that could be used against it. By making powerful AI models accessible through APIs, OpenAI enabled the advancement of AI technology, but this same accessibility might now be allowing competitors to leverage OpenAI's work to build competing models at a fraction of the cost. The company's commitment to making AI technology more accessible might have inadvertently created a pathway for others to potentially circumvent the massive investments typically required for developing such models.

This complex situation highlights the ongoing debates in AI about intellectual property rights, fair use, and the tension between open collaboration and proprietary technology in artificial intelligence development.


Perplexity’s answer:
The irony in the situation involving OpenAI and DeepSeek lies in the duality of intellectual property concerns within the artificial intelligence (AI) industry. OpenAI, a leading AI company, is accusing DeepSeek, a Chinese startup, of using its proprietary models to create a competitive product. This accusation highlights the potential breach of intellectual property rights, as OpenAI's terms of service explicitly prohibit such practices.

However, the irony deepens when considering that OpenAI itself is currently facing allegations of copyright infringement from various content creators and publishers. These lawsuits claim that OpenAI trained its models using their copyrighted materials without permission. Thus, while OpenAI is asserting its rights against DeepSeek for allegedly distilling its models, it is simultaneously embroiled in legal battles over its own use of others' intellectual property.

This situation underscores a broader irony in the tech industry: companies often engage in practices they later condemn in others. The AI sector is rife with complexities regarding data usage and model training, where many firms—including those in the U.S. and China—utilize outputs from established models to enhance their own capabilities. As such, the very practices that OpenAI criticizes may reflect a common industry approach that it has also employed.


Grok’s answer:
The irony in this situation stems from several layers:

  1. Intellectual Property Concerns vs. Own Practices: OpenAI is accusing DeepSeek of using its models to train a competitive AI, which involves the practice known as "distillation." However, OpenAI itself is facing similar accusations from content creators and authors who allege that OpenAI has used their copyrighted material without permission to train its own AI models. This creates a scenario where OpenAI is defending its intellectual property rights while potentially having infringed upon the rights of others.

  2. Innovation vs. Protectionism: The development of AI technology is often celebrated for its potential to drive innovation. However, when companies like OpenAI seek to protect their proprietary technologies from being used by others in similar ways to how they might have used others' data, it highlights a tension between fostering innovation and protecting commercial interests. This irony is accentuated by the fact that smaller or newer players in the market might rely on such practices to catch up or compete, which is what DeepSeek appears to be doing.

  3. Open Source Ethos vs. Corporate Control: DeepSeek's model is described as open-source, which traditionally implies a sharing of knowledge and resources for the benefit of the broader community. Yet, here, it's accused of using closed, proprietary models from OpenAI to fuel this open-source project. This juxtaposition is ironic because while open-source movements advocate for free access to technology, they are sometimes built on the back of proprietary technology, creating a paradox in the ethos of open-source development.

  4. Cost and Efficiency: DeepSeek managed to train a competitive model with significantly fewer resources ($5.6 million compared to potentially much higher investments by companies like OpenAI). This efficiency in using fewer, less expensive resources to achieve similar results could be seen as a smart innovation in the AI field. However, if this was achieved by arguably unethical means (using someone else's model without permission), it underscores the irony that such cost-saving might come at the expense of legal and ethical standards.

  5. Market Impact: The market reactions, like the significant drop and subsequent recovery in Nvidia's share price, reflect the market's sensitivity to these developments. While the immediate concern was about the need for expensive hardware, the broader irony is that market value can swing dramatically based on perceptions of technological advantage or disadvantage, which in this case revolves around legal and ethical issues rather than purely technical ones.

This situation highlights the complex interplay between innovation, intellectual property, ethics, and market dynamics in the fast-evolving field of AI.