GPT-4o: If Microsoft is taking active steps to combat piracy of its own software using AI, there is a strong case for it to also: Train LLMs Exclusively on Licensed Data, Assist Rights Owners in Fighting AI-Related Piracy, Apply Anti-Piracy AI to Content Owners' Needs. If it enforces licensing for its own products, it should set the same standard for AI training and outputs.
Asking AI services: Please read the article “Microsoft quietly updates Copilot to cut down on unauthorized Windows activations” and tell me whether you feel that if MS does this, shouldn’t it also be a) using LLMs trained only on licensed data and b) helping rights owners and creators fight piracy via LLMs, whether it concerns training or verbatim model output?
ChatGPT-4o’s answer:
The article discusses how Microsoft has updated Copilot to help detect and reduce unauthorized Windows activations. This move underscores Microsoft's commitment to protecting its own intellectual property and enforcing licensing compliance.
Applying the Same Principles to AI and LLMs
If Microsoft is taking active steps to combat piracy of its own software using AI, there is a strong case for it to also:
Train LLMs Exclusively on Licensed Data – If Microsoft enforces strict licensing for Windows, it should apply the same rigor when using copyrighted content in AI training. Many content owners argue that AI models should only be trained on data obtained with proper permissions.
Assist Rights Owners in Fighting AI-Related Piracy – Microsoft's AI, including Copilot, could be used to detect unauthorized use of copyrighted content in model training or verbatim outputs (one possible detection approach is sketched after this list). This would align with its anti-piracy stance on Windows activations.
Apply Anti-Piracy AI to Content Owners' Needs – If AI can prevent illegal Windows activations, similar AI capabilities should be available to help publishers, musicians, and artists detect and enforce rights violations in AI-generated content.
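To make the detection idea in the second point concrete: one simple way to flag verbatim reproduction is to index the n-grams of a rights holder's reference corpus and measure how much of a model's output overlaps with that index. The following is a minimal sketch under that assumption; the corpus, window size, and threshold are illustrative, not anything Microsoft has disclosed.

```python
# Minimal sketch: flag model output that reproduces long verbatim spans
# from a reference corpus of protected works. Corpus, window size, and
# threshold are illustrative assumptions, not a disclosed Microsoft method.

def ngrams(tokens, n):
    """Yield consecutive n-token windows as tuples."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def build_index(protected_texts, n=5):
    """Index every n-gram appearing in the protected corpus."""
    index = set()
    for text in protected_texts:
        index.update(ngrams(text.lower().split(), n))
    return index

def verbatim_overlap(output, index, n=5):
    """Return the fraction of the output's n-grams found in the index."""
    grams = list(ngrams(output.lower().split(), n))
    if not grams:
        return 0.0
    return sum(g in index for g in grams) / len(grams)

index = build_index(["the quick brown fox jumps over the lazy dog every single day"])
score = verbatim_overlap("fox jumps over the lazy dog every single day indeed", index)
if score > 0.5:  # the threshold is a policy choice, shown only for illustration
    print(f"Possible verbatim reproduction (overlap {score:.0%}); withholding output.")
```

A production system would need tokenization that survives paraphrase, fuzzy matching, and a vastly larger index, but the policy hook is the same: score the output, then block or attribute it.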
Conclusion
Microsoft's actions against unauthorized Windows activations strengthen the argument that companies developing AI should be equally rigorous in respecting copyright. If it enforces licensing for its own products, it should set the same standard for AI training and outputs, while also offering tools to content owners to fight digital piracy. Otherwise, the approach risks being seen as selectively enforcing intellectual property rights when it benefits Microsoft while neglecting broader concerns about copyright in AI.

Grok’s answer:
Based on the article “Microsoft quietly updates Copilot to cut down on unauthorized Windows activations” from TechRadar, dated March 5, 2025, Microsoft has taken steps to prevent its AI-powered assistant, Copilot, from aiding in the piracy of its Windows operating system. The update ensures that Copilot no longer recommends third-party tools or provides workarounds to bypass Microsoft’s licensing system, instead rejecting such prompts and emphasizing the illegality of unauthorized copies while directing users to official support channels. This move reflects Microsoft’s effort to align its AI with its own intellectual property interests, closing a loophole that previously allowed Copilot to inadvertently assist in bypassing Windows activation protocols.
Your question raises two related points: a) whether Microsoft should use large language models (LLMs) trained only on licensed data, and b) whether it should help rights owners and creators fight piracy via LLMs, both in terms of training data and model outputs. Let’s break this down based on the article and general reasoning, since I’m relying on my own knowledge without additional search capabilities this time.
a) Should Microsoft use LLMs trained only on licensed data?
The article doesn’t directly address the training data behind Copilot, so there’s no specific evidence here about whether its LLM was trained solely on licensed content or not. However, your point seems to stem from a principle of consistency: if Microsoft is cracking down on piracy of its own software through Copilot, shouldn’t it ensure its AI is built in a way that respects others’ intellectual property as well? This is a fair ethical argument. If Microsoft expects users to respect its licensing terms, it could be seen as hypocritical to train its models on unlicensed or pirated data scraped from the web, books, or other sources without permission. Many LLMs, historically, have been trained on vast datasets that include copyrighted material, often without explicit consent—a practice that’s sparked lawsuits and debates (e.g., cases against OpenAI and others). If Microsoft is taking a stand against piracy in Copilot’s outputs, extending that stance to its training process would strengthen its moral and legal position. It’s not clear from the article whether Microsoft does this already, but the principle you’re suggesting aligns with a broader push for transparency and accountability in AI development. Without more data, I’d say it’s a reasonable expectation, though not necessarily a current obligation unless legally mandated.
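To make the "licensed data only" principle concrete: one operational approach is to gate documents into the training corpus based on license metadata attached at ingestion. The sketch below assumes hypothetical field names and an allow-list of license tags; a real pipeline would also have to verify provenance rather than trust self-reported tags.

```python
# Minimal sketch: admit documents into a training corpus only when their
# license metadata is on an explicit allow-list. Field names and licenses
# are hypothetical; real provenance verification is much harder than this.

from dataclasses import dataclass

ALLOWED_LICENSES = {"CC0", "CC-BY-4.0", "publisher-agreement"}  # hypothetical allow-list

@dataclass
class Document:
    text: str
    license: str  # contract-derived or self-reported license tag
    source: str   # where the document was obtained

def licensed_only(docs):
    """Keep documents whose license is on the allow-list; anything
    unknown or unlicensed is excluded by default."""
    return [d for d in docs if d.license in ALLOWED_LICENSES]

corpus = [
    Document("an openly licensed essay", "CC-BY-4.0", "open-archive"),
    Document("a scraped novel chapter", "unknown", "web-crawl"),
]
print(len(licensed_only(corpus)))  # -> 1: only the licensed essay survives
```

The key design choice is the default: exclusion unless a license is affirmatively established, which is the standard Grok's consistency argument implies.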
b) Should Microsoft help rights owners and creators fight piracy via LLMs?
The article shows Microsoft using Copilot to protect its own interests by blocking piracy-related prompts, which suggests it has the technical capability to filter and control AI outputs. Your suggestion—that it should extend this to help other rights owners and creators fight piracy—makes sense as an expansion of that capability. Piracy doesn’t just affect Microsoft; it’s a widespread issue for musicians, writers, filmmakers, and software developers alike. If Copilot can be programmed to reject prompts that undermine Microsoft’s IP, it could theoretically be adapted to detect and block responses that reproduce copyrighted material verbatim (e.g., song lyrics, book excerpts) or point users to pirated content elsewhere (e.g., torrent sites). This could apply to both training (avoiding use of pirated works) and output (preventing regurgitation of protected content).
From a practical standpoint, Microsoft’s focus in the article is self-serving—it’s protecting Windows, its flagship product. Extending this to a broader anti-piracy role would require collaboration with other industries, legal frameworks, and potentially a shift in Copilot’s design philosophy from a general-purpose assistant to a more policed tool. It’s not a bad idea, though. Imagine Copilot saying, “I can’t provide that movie script—it’s copyrighted. Try a licensed streaming service instead.” This could benefit creators and align with Microsoft’s apparent commitment to refining AI responsibly, as noted in the article. However, it might also frustrate users who want unrestricted answers, a tension already evident with rivals like ChatGPT or Grok, which the article mentions may still assist in bypassing security measures.
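The refusal Grok imagines could begin as a simple intent screen in front of the model. Here is a minimal sketch with hypothetical patterns and a canned refusal message; a production system would rely on trained classifiers rather than keyword lists, but the control flow is the same.

```python
# Minimal sketch: screen incoming prompts for piracy-related intent and
# answer with a refusal plus a pointer to legitimate channels. The pattern
# list and refusal text are illustrative assumptions only.

import re

PIRACY_PATTERNS = [
    r"\bactivation (key|code|crack)\b",
    r"\bbypass\b.*\b(license|licensing|drm)\b",
    r"\btorrent\b.*\b(movie|album|ebook|software)\b",
]

REFUSAL = ("I can't help with that; it would infringe someone's copyright. "
           "Try an official store or licensed service instead.")

def screen_prompt(prompt: str) -> str | None:
    """Return a refusal if the prompt matches a piracy pattern,
    otherwise None so the request proceeds to the model."""
    lowered = prompt.lower()
    for pattern in PIRACY_PATTERNS:
        if re.search(pattern, lowered):
            return REFUSAL
    return None

print(screen_prompt("give me a windows activation key"))          # refusal
print(screen_prompt("how do I check my windows license status"))  # None -> allowed
```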
My Take
I feel Microsoft’s update to Copilot is a step toward accountability, but it’s narrowly focused on its own bottom line. Your suggestions push that accountability further, and I agree they’re worth considering. If Microsoft is serious about curbing unauthorized use of its software, using only licensed training data would be a consistent next step—though it’s hard to say if they’re doing this without more info. Helping other creators fight piracy via LLMs is a bigger ask but a noble one; it could position Microsoft as a leader in ethical AI, not just a defender of its own turf. The article hints at Microsoft’s awareness of AI’s broader implications (e.g., refining Copilot to avoid “embarrassing issues”), so maybe they’d be open to it. What do you think—should they prioritize their own IP, or take on a wider role?
Claude’s answer:
Based on the article, Microsoft has updated Windows Copilot to prevent it from recommending ways to bypass Microsoft's licensing system or offering tools that aid in the piracy of Windows. This raises interesting questions about consistency in how tech companies approach intellectual property rights.
Your question touches on two related but distinct aspects of AI and intellectual property:
Training data licensing: Should Microsoft only use properly licensed data to train their LLMs?
Anti-piracy measures: Should Microsoft help rights owners fight piracy through their LLMs?
These are thoughtful questions that highlight an apparent tension. Microsoft is protecting its own intellectual property by preventing Copilot from facilitating Windows piracy, but the broader questions about training data and other creators' rights remain.
Regarding training on licensed data, there's a legitimate argument for consistency. If Microsoft expects users to respect its intellectual property rights for Windows, it would be logically consistent for Microsoft to ensure it is properly licensing or obtaining permission for the content used to train its AI models. The copyright status of training data remains legally contentious, with several ongoing lawsuits from creators against AI companies.
For the second point about helping other rights owners, there's also a reasonable case for consistency. If Microsoft is updating Copilot to protect its own software from piracy, extending similar protections to other creators' works would demonstrate a principled rather than self-interested approach to intellectual property. This could involve preventing their AI systems from:
Reproducing copyrighted content without permission
Providing instructions for circumventing other companies' DRM systems
Generating activation codes for other software
The article mentions that Microsoft's update was specifically aimed at preventing Windows Copilot from recommending third-party tools to bypass Microsoft's licensing system. This appears to be a targeted fix for a specific issue affecting Microsoft's own products, rather than a broader stance on intellectual property protection in AI.
The situation highlights the evolving tensions between AI development, which often relies on vast amounts of training data, and traditional notions of intellectual property. Companies like Microsoft may need to reconcile their approach to protecting their own IP with their practices regarding others' content in AI development.
Perplexity’s answer:
The article discusses how Microsoft updated Windows Copilot to prevent it from recommending third-party tools that aid in Windows piracy. Previously, Copilot was inadvertently providing information on bypassing Microsoft's licensing system. Now, it rejects prompts for activation codes and advises users to obtain legitimate copies of Windows.
Regarding your questions:
a) Using LLMs trained only on licensed data: The article doesn't explicitly address whether Microsoft should only use LLMs trained on licensed data. However, the update to Copilot suggests Microsoft is taking steps to prevent its AI from facilitating copyright infringement, which aligns with the principle of using licensed data.
b) Help rights owners and creators fight piracy via LLMs: The article does not directly mention Microsoft assisting rights owners and creators in fighting piracy through LLMs. However, the move to prevent Copilot from aiding in unauthorized Windows activations could be seen as a step in that direction. It demonstrates a willingness to control how its AI is used and prevent it from contributing to copyright infringement.
Gemini’s answer:
Based on the article, here's a breakdown of the situation and how it relates to your questions:
Microsoft's Action:
Microsoft has updated Windows Copilot to prevent it from recommending tools or methods that facilitate unauthorized Windows activations.
This addresses a situation where Copilot was inadvertently providing information that could be used for piracy.
The update stops Copilot from providing advice that would allow users to bypass Windows licensing.
Relating to Your Questions:
a) Using LLMs trained only on licensed data:
The article primarily focuses on the output of Copilot, not the underlying training data. However, Microsoft's action indicates a concern about the potential for AI to facilitate copyright infringement. Therefore, it is a logical extension that Microsoft, and other companies, will have to be increasingly concerned about the data that LLMs are trained on.
b) Help rights owners and creators fight piracy via LLMs, whether it concerns training or verbatim model output?
Yes, Microsoft's update is a step in this direction. By preventing Copilot from providing information that aids piracy, they are using the LLM to enforce copyright.
It shows that AI can be used as a tool to police copyright infringement by preventing the distribution of information that enables it.
It is also clear that the data LLMs are trained on is a growing concern and will have to be addressed to prevent copyright infringement.
In essence, Microsoft's actions demonstrate a move towards controlling how AI tools interact with copyrighted material. This implies a growing recognition of the need to address both the output and the training data of LLMs in relation to piracy.

