• Pascal's Chatbot Q&As

Overall, the evidence presented appears to be robust, combining technical claims, specific examples of copyright infringement, economic analysis, and admissions from Perplexity’s own executives.

Perplexity should have prioritized securing licenses, building a more transparent and cooperative business model, and addressing the risks of false attribution.

Question 1 of 4 for ChatGPT-4o: Please analyze the lawsuit of Dow Jones and New York Post against Perplexity and tell me what the main grievances are

ChatGPT-4o’s answer:
The main grievances in the lawsuit filed by Dow Jones & Company, Inc. and NYP Holdings, Inc. against Perplexity AI, Inc. revolve around several key allegations of copyright infringement and trademark misuse. Here's a summary of the primary grievances:

  1. Massive Copyright Infringement: Perplexity is accused of illegally copying vast amounts of copyrighted content from Dow Jones and NYP Holdings (which includes the content of The Wall Street Journal and New York Post) to use in its "retrieval-augmented generation" (RAG) database. This content is then used to provide answers to users' queries without proper authorization or licensing, thereby infringing the plaintiffs' copyrights.

  2. Unlicensed Use of Content: The plaintiffs claim that Perplexity has been using their content without licensing agreements, while other AI companies have secured such licenses. This unlicensed use undermines the plaintiffs' revenue streams, which depend on licensing, subscriptions, and advertising.

  3. Substitution for Original Sources: Perplexity encourages its users to "Skip the Links," effectively diverting traffic and revenue from the plaintiffs' websites. This harms the publishers by substituting their original content with summaries or paraphrased outputs provided by Perplexity's AI, causing a loss of advertising and subscription revenues.

  4. False Attribution and Hallucinations: Perplexity is accused of generating outputs that falsely attribute fabricated or incorrect content (referred to as "hallucinations") to The Wall Street Journal and New York Post. This dilutes the value of the plaintiffs' trademarks and harms their reputation for accuracy, potentially confusing consumers into believing these hallucinations are authentic content.

  5. Trademark Dilution: The plaintiffs also claim that Perplexity’s use of their trademarks in association with false or misleading outputs causes harm to their famous trademarks by blurring or tarnishing their distinctiveness. This is especially problematic when hallucinated content is attributed to the plaintiffs' publications, damaging their reputation and consumer trust.

  6. Failure to Engage in Licensing: Perplexity is criticized for ignoring offers to discuss a potential licensing agreement, as highlighted by the plaintiffs' outreach in July 2024, which was met with no response from Perplexity. This refusal to engage in a legitimate business model is seen as further evidence of willful infringement.

In summary, the lawsuit targets Perplexity for willful copyright infringement, loss of revenue due to illegal use of content, and trademark dilution due to false attributions, all of which severely harm the plaintiffs’ business interests.


Question 2 of 4 for ChatGPT-4o: What are the most surprising, controversial and valuable statements made in the complaint?

ChatGPT-4o’s answer:
The complaint contains several surprising, controversial, and valuable statements that highlight the depth of the plaintiffs' grievances and the broader issues surrounding AI's use of copyrighted content. Here are some notable ones:

Surprising Statements:

  1. Perplexity’s Alleged Refusal to Engage in Licensing: The complaint mentions that the plaintiffs (Dow Jones and NYP Holdings) reached out to Perplexity in July 2024 to discuss licensing agreements, but Perplexity did not respond. This is surprising given that other AI companies have engaged with the plaintiffs, making licensing deals to use their content legally. This refusal shows an apparent disregard for finding a cooperative solution.

"Months ago, in July 2024, Plaintiffs sent a letter to Perplexity putting it on notice of the legal issues raised by Perplexity’s unauthorized use of Plaintiffs’ copyrighted works and offering to discuss a potential licensing deal. Perplexity did not bother to respond."

  2. Perplexity’s Business Model Based on "Freeriding": The lawsuit frames Perplexity’s core business model as fundamentally reliant on stealing content, implying that it’s built on unethical practices. The plaintiffs allege that Perplexity encourages users to “Skip the Links” to original sources and bypasses legitimate business models of licensing.

"Perplexity’s core business model involves engaging in massive freeriding on Plaintiffs’ protected content to compete against Plaintiffs for the engagement of the same news-consuming audience."

Controversial Statements:

  1. Accusation of Systematic Copyright Violation: The plaintiffs assert that Perplexity’s approach is not just a few incidental cases of copying but a "massive" and "systematic" violation of copyright law. This positions Perplexity as an intentional and persistent violator, not just an accidental or naive offender.

"Perplexity engages in this wrongful conduct in order to deliver on its promise that its users can 'Skip the Links' to the websites of original content creators."

  2. Allegation of False Attribution via “Hallucinations”: One of the most damaging claims is that Perplexity not only reproduces content without permission but also creates "hallucinations" (factually incorrect outputs) that are falsely attributed to the plaintiffs’ publications. These AI-generated errors, combined with the use of the plaintiffs' trademarks, could severely damage the reputation of The Wall Street Journal and New York Post.

"Perplexity’s hallucinations can falsely attribute facts and analysis to content producers like Plaintiffs, sometimes citing an incorrect source, and other times simply inventing and attributing to Plaintiffs fabricated news stories."

  3. Comparison to Other AI Companies: The lawsuit subtly contrasts Perplexity with other AI companies like OpenAI, which have entered into legitimate licensing agreements with the plaintiffs. This adds to the controversy by suggesting that Perplexity is deliberately choosing not to comply with industry norms.

"Other AI companies have engaged with Plaintiffs and other publishers, resulting in legitimate market-based licensing solutions."

Valuable Statements:

  1. Impact on News Publishers’ Revenue Models: The complaint emphasizes that Perplexity’s practices are not only about copyright infringement but directly attack the financial foundation of news publishing. By encouraging users to skip the links to original articles, Perplexity undercuts subscription-based and advertising-based revenue streams, which are crucial for the survival of high-quality journalism.

"Any company that copies news publishers’ original works to enable and encourage consumers to 'Skip the Links,' as Perplexity urges, purposefully diverts readers’ attention to itself and away from the original news publishers and siphons off for itself the revenue opportunities that support news and journalists."

  2. Generative AI as a Substitute for News: The complaint also argues that Perplexity’s AI generates direct substitutes for the plaintiffs' articles. This adds a new layer to copyright issues in AI, raising questions about the extent to which AI outputs, even when paraphrased, can act as unauthorized substitutes for the original content.

"Other times, Perplexity turns Plaintiffs’ copyrighted articles into paraphrases or summaries of those copyrighted works that similarly serve as substitutes for accessing Plaintiffs’ copyrighted works on Plaintiffs’ own websites and/or licensed websites."

  3. Long-Term Harm to Journalism: The lawsuit argues that Perplexity's practices, if left unchecked, could ultimately reduce the overall quality and quantity of original content. This could have a chilling effect on journalism as it undermines the financial incentives to produce high-quality news and analysis.

"Undermining the financial incentives to create original content will result in less content being generated and/or less quality content being generated, which will also reduce the amount of content available to power AI’s growth."

These statements collectively highlight the broader stakes involved in this lawsuit—beyond just monetary damages, the case touches on the future of journalism, ethical AI development, and the integrity of digital content distribution.


Question 3 of 4 for ChatGPT-4o: What kind of evidence is being presented and do you feel that the evidence is robust?

ChatGPT-4o’s answer:
The evidence presented in the complaint appears to be comprehensive and robust, aimed at demonstrating the systematic nature of Perplexity's alleged copyright infringement and trademark violations. The plaintiffs rely on multiple types of evidence, including technical practices, specific instances of content usage, and economic harm. Below is a breakdown of the key evidence presented and an evaluation of its robustness:

Types of Evidence Presented:

  1. Technical Evidence of Content Scraping and Indexing:

    • The complaint asserts that Perplexity engages in web crawling and scraping to obtain vast amounts of copyrighted content from The Wall Street Journal, New York Post, and other sources. This content is allegedly copied into Perplexity’s "retrieval-augmented generation" (RAG) index, which is used to generate answers to users’ queries.

    • Evidence supporting this includes descriptions of how Perplexity uses its RAG technology to store and retrieve the plaintiffs’ content, along with accusations that Perplexity ignores web protocols like robots.txt, which are designed to block unauthorized scraping.

    Evaluation: If plaintiffs can provide logs, technical data, or expert testimony proving that Perplexity's crawlers accessed their sites and circumvented protections like robots.txt, this would be strong evidence of unauthorized copying.

  2. Examples of Specific Copyright Violations:

    • The plaintiffs provide detailed examples of Perplexity reproducing their content verbatim or in paraphrased form. Specific instances are cited where Perplexity allegedly delivered entire articles or summaries of paywalled content to its users without permission.

    • Screenshots and direct comparisons between Perplexity outputs and the original articles are included to illustrate these infringements. For example, one output included the entire text of a New York Post article, and another summarized a Wall Street Journal article behind a paywall.

    Evaluation: These specific examples are critical, as they showcase the nature and extent of the infringement. Screenshots and textual comparisons provide clear, tangible evidence of unauthorized copying. If the plaintiffs can demonstrate that these outputs occurred frequently, it would strengthen their case.

  3. Economic Harm and Lost Revenue:

    • The complaint emphasizes that Perplexity’s practices of allowing users to “Skip the Links” result in lost revenue for the plaintiffs. This includes losses from subscriptions, advertising, and potential licensing fees.

    • The plaintiffs point to the virtually non-existent click-through traffic from Perplexity’s "cited sources" feature, despite Perplexity receiving millions of queries. This is presented as evidence that Perplexity’s model is designed to act as a substitute for direct engagement with the plaintiffs' content.

    Evaluation: Demonstrating economic harm is crucial for proving damages. If the plaintiffs can provide data on lost traffic and potential revenue from advertising and subscriptions, as well as missed licensing opportunities, this would provide a strong foundation for their claims of financial damage.

  4. Trademark Dilution and False Attribution (Hallucinations):

    • The plaintiffs provide examples of Perplexity’s outputs generating fabricated content (referred to as "hallucinations") that were attributed to The Wall Street Journal or New York Post. These include specific cases where Perplexity provided inaccurate quotes or fabricated articles, while attributing them to the plaintiffs’ publications.

    • The plaintiffs also argue that Perplexity’s use of their trademarks in conjunction with these hallucinations damages the integrity and value of their brands.

    Evaluation: If the plaintiffs can provide further instances of such false attributions, this could be particularly damaging to Perplexity’s case. Given that these fabrications mislead users into thinking they are reading legitimate content from reputable publications, the plaintiffs’ claims of brand dilution could resonate with the court.

  5. Perplexity’s Own Marketing Statements:

    • The plaintiffs reference Perplexity’s own marketing language, such as its claim that users can “Skip the Links” and still receive reliable content. They argue that this highlights Perplexity’s intent to replace original content sources, undermining the plaintiffs’ business models.

    • The lawsuit also cites public statements made by Perplexity’s executives, including CEO Aravind Srinivas, who has spoken about the platform’s reliance on trusted news outlets. These statements are used to show that Perplexity acknowledges the value of using high-quality content but has refused to engage in licensing deals.

    Evaluation: These public statements could be a double-edged sword for Perplexity. They could be seen as admissions that Perplexity understands the importance of copyrighted content, yet it failed to take appropriate steps to license it. These admissions add weight to the argument that Perplexity knowingly engaged in illegal practices.

Robustness of the Evidence:

Overall, the evidence presented appears to be robust, combining technical claims, specific examples of copyright infringement, economic analysis, and admissions from Perplexity’s own executives. Here’s why the evidence is likely strong:

  • Specificity: The detailed examples of verbatim reproduction and paraphrasing of content, along with screenshots and direct comparisons, offer concrete proof of how Perplexity allegedly infringed on the plaintiffs’ copyrights. This specificity is crucial in intellectual property cases, where demonstrating the exact nature of the infringement is necessary.

  • Multiple Evidence Types: The combination of technical evidence, economic arguments, and trademark dilution claims shows that the plaintiffs are taking a multifaceted approach. They are not relying on one type of harm but are addressing both the financial and reputational damage caused by Perplexity’s actions.

  • Intent and Public Admissions: The inclusion of Perplexity’s marketing claims and statements from its executives provides insight into the intent behind the company’s actions. Courts often look at intent when assessing damages, and Perplexity’s own language could be used against them to show that they knowingly engaged in practices that harmed the plaintiffs.

If the plaintiffs can substantiate their technical claims with evidence like server logs, expert testimony, and traffic analysis, their case will be particularly strong. Moreover, if Perplexity cannot provide a legitimate justification for its use of the plaintiffs' content, such as fair use, the plaintiffs' evidence should hold up well in court.
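For readers unfamiliar with the robots.txt protocol mentioned in the technical evidence, the following is a minimal Python sketch of how a compliant crawler would consult a site's rules before fetching a page. It is illustrative only: the robots.txt content, the crawler name `ExampleBot`, and the `news.example.com` URLs are hypothetical and are not taken from the complaint or from any real site. It does not describe how Perplexity's crawlers actually work.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example.com/articles/1"
    print(is_allowed(ROBOTS_TXT, "ExampleBot", article))  # blocked site-wide
    print(is_allowed(ROBOTS_TXT, "OtherBot", article))    # falls under the "*" rules
```

Because robots.txt is a voluntary convention, nothing in the protocol itself stops a crawler from ignoring it, which is why the evaluation above points to server logs and expert testimony as the way to show what a crawler actually did.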


Question 4 of 4 for ChatGPT-4o: What should Perplexity have done differently, how should they respond to this complaint and what does this mean for other AI makers?

ChatGPT-4o’s answer:
Perplexity could have taken several steps differently to avoid the legal troubles they now face with Dow Jones and NYP Holdings. Their actions highlight important lessons for AI companies that rely on copyrighted content for training and generating outputs. Here's a breakdown of what Perplexity should have done differently, how they should respond to the complaint, and what this case means for other AI makers.

What Perplexity Should Have Done Differently:

  1. Securing Proper Licenses from the Start:

    • Perplexity should have entered into licensing agreements with content owners like Dow Jones and NYP Holdings before using their copyrighted material. Many AI companies, including OpenAI, have negotiated content licenses to avoid infringing on intellectual property rights.

    • By failing to engage in licensing discussions, despite being contacted by the plaintiffs in July 2024, Perplexity missed an opportunity to legitimize their business model and prevent the current legal action.

  2. Building a Transparent Business Model:

    • Instead of relying on the "Skip the Links" model, which clearly undermined the publishers' ability to monetize their content through clicks and subscriptions, Perplexity could have designed their platform to complement original content sources. For example, instead of replacing the original articles, they could have integrated more effective linking back to the source, ensuring that content creators still benefited from traffic.

    • Other AI companies have found ways to work with content providers through partnerships and shared revenue streams. Perplexity could have explored such models instead of building their business on unlicensed content extraction.

  3. Addressing Attribution and Hallucinations:

    • One of the biggest grievances in the complaint is the issue of "hallucinations" (false or fabricated outputs attributed to reputable publications like The Wall Street Journal and New York Post). Perplexity should have implemented more stringent content verification mechanisms to prevent false attributions. Moreover, while they tout their "cited sources" feature, it clearly wasn’t enough to prevent confusion or provide proper credit to original authors.

    • Implementing better citation practices, ensuring users still visited original sources, and improving fact-checking capabilities within their AI would have mitigated the risk of tarnishing the plaintiffs' reputations.

  4. Better Engagement with Copyright Holders:

    • When the plaintiffs reached out with a letter in July 2024 offering to discuss a licensing deal, Perplexity failed to respond. A more responsible approach would have been to engage in dialogue and explore potential partnerships or licensing agreements. This would have demonstrated goodwill and a desire to comply with copyright law, potentially avoiding the current lawsuit.
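The "content verification mechanisms" suggested in item 3 above could take many forms. As one hedged illustration, the Python sketch below only attributes a quotation to a publication if the quoted text actually appears in the retrieved source article. The function names, the sample article text, and the publication name are all hypothetical. A verbatim check like this would catch fabricated quotes but not fabricated paraphrases or summaries, so it is a partial safeguard at best and not a description of any vendor's system.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_supported(quote: str, source_text: str) -> bool:
    """Return True only if the non-empty quote occurs verbatim in the source text."""
    normalized_quote = normalize(quote)
    return bool(normalized_quote) and normalized_quote in normalize(source_text)


def attribute_or_withhold(quote: str, publication: str, source_text: str) -> Optional[str]:
    """Return an attributed quote, or None when the source does not contain it."""
    if quote_is_supported(quote, source_text):
        return f'"{quote}" ({publication})'
    return None  # Withhold the attribution rather than risk a false one.


if __name__ == "__main__":
    # Hypothetical article text, invented for this example.
    article = "The city council voted on Tuesday to   approve the new transit budget."
    print(attribute_or_withhold("approve the new transit budget", "Example Gazette", article))
    print(attribute_or_withhold("reject the new transit budget", "Example Gazette", article))
```

Returning `None` instead of a best-effort attribution reflects the design choice that matters here: when the source cannot confirm a quote, the output should carry no publication name at all.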

How Perplexity Should Respond to This Complaint:

  1. Engage in Settlement Talks and Licensing Negotiations:

    • Given the weight of the evidence and the potential financial and reputational damage from a protracted legal battle, Perplexity should consider settling the case out of court. This would involve negotiating fair licensing agreements with Dow Jones and NYP Holdings, offering compensation for past infringements, and ensuring compliance moving forward.

    • A settlement could also involve shared revenue models from advertisements or subscription services on Perplexity’s platform, ensuring that content creators receive a portion of the revenue generated from their content.

  2. Acknowledge Mistakes and Commit to Compliance:

    • Perplexity should publicly acknowledge the concerns raised by the plaintiffs and commit to making changes that comply with copyright law. This could include updating its business model to better respect the rights of content creators, reinforcing that they will no longer allow users to bypass original publishers entirely.

    • They should also commit to improving their AI’s attribution and verification processes, which would help restore some goodwill and prevent similar cases in the future.

  3. Strengthen Legal Defenses While Evaluating Risk:

    • If Perplexity chooses to defend itself in court, they will need a strong legal strategy. They could attempt to argue fair use, though this defense is tenuous given the commercial nature of their platform and the direct substitution for original content. They may also try to focus on any non-infringing uses or argue that their "cited sources" feature mitigates some harm, though this will likely not be enough to overcome the plaintiffs' claims.

    • Evaluating the risks of continuing litigation versus the potential benefits of settlement will be key. If Perplexity believes that a loss in court could set a harmful precedent, they may opt to settle.

What This Case Means for Other AI Makers:

  1. The Importance of Licensing and Partnering with Content Owners:

    • This case underscores the necessity of licensing deals when AI models use copyrighted content, especially when that content generates revenue for the AI company. AI makers must proactively seek partnerships with content creators to avoid infringing on intellectual property rights.

    • Large news organizations and publishers are increasingly aware of the value of their content for AI applications, and the market for content licensing is growing. AI companies need to recognize that legitimate access to data often comes at a cost, but this also offers them protection from lawsuits.

  2. Transparency in Business Models:

    • AI companies need to be transparent in how they source and use data. Perplexity’s "Skip the Links" approach directly undermined the financial models of news organizations, drawing users away from original content. AI companies that rely on external content should aim to build models that complement, rather than compete with, content creators.

    • This case also highlights the dangers of building a business model that implicitly harms content creators without their participation. Moving forward, AI makers need to ensure their products contribute to the ecosystem in a sustainable way.

  3. Tackling AI Hallucinations and Attribution Issues:

    • The issue of "hallucinations" in AI-generated content is not unique to Perplexity. AI models sometimes produce inaccurate or fabricated information, which can cause legal and reputational damage if attributed to reputable sources. AI companies must implement stronger safeguards to verify content before it is presented to users, particularly when attribution to copyrighted or trademarked sources is involved.

    • This case highlights the need for stricter standards around content accuracy and attribution in AI-generated outputs. Other AI companies should invest in developing more accurate models and ensure they can correct or prevent false attributions to trusted sources.

  4. Broader Implications for AI and Copyright Law:

    • The Perplexity case adds to the growing conversation around how generative AI interacts with copyright law. It underscores that AI makers cannot claim ignorance or rely on outdated interpretations of fair use when engaging in massive content scraping and generating commercial outputs. Copyright holders are becoming more assertive, and AI companies need to navigate this legal landscape carefully.

    • This case could trigger broader discussions about the need for clearer regulatory frameworks for AI, especially around data usage and copyright infringement. Governments and lawmakers may push for more explicit regulations, making it crucial for AI companies to stay ahead of these developments.

Conclusion:

Perplexity should have prioritized securing licenses, building a more transparent and cooperative business model, and addressing the risks of false attribution. In response to the lawsuit, they should seek to settle, acknowledge their mistakes, and adjust their practices to comply with copyright law. For other AI makers, the lesson is clear: proactive licensing, transparency, and improved accuracy are essential to avoiding similar legal challenges and maintaining good relationships with content creators.