Pascal's Chatbot Q&As
Archive (Page 18)
GPT-4o: LLMs (especially larger ones) memorize more than we thought: not just long, obvious passages, but obscure, high-surprisal tokens too. Post-training filters are not enough.
Even with safety filters added by companies, memorization persists at the token level, which suggests post-training defenses are leaky.
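The post itself contains no code, but the token-level claim suggests a simple audit. Below is a minimal, hypothetical Python sketch (not the researchers' method) using Hugging Face transformers: it scores each token's surprisal under a causal LM and flags spans where a high-surprisal continuation is nonetheless regenerated verbatim from its prefix. The model name (gpt2), the prefix length, and the 15-bit rarity threshold are illustrative assumptions.

```python
# Hypothetical sketch of a token-level memorization check; gpt2, prefix_len,
# and rare_bits are illustrative assumptions, not values from the post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumption: stand-in for whatever model is audited
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def token_surprisals(text: str) -> list[tuple[str, float]]:
    """Return (token, surprisal-in-bits) pairs: -log2 p(token | prefix)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position i predict token i+1, so shift targets by one.
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    nats = -logp[torch.arange(targets.numel()), targets]
    bits = nats / torch.log(torch.tensor(2.0))
    return list(zip(tok.convert_ids_to_tokens(targets.tolist()), bits.tolist()))

def looks_memorized(text: str, prefix_len: int = 20, rare_bits: float = 15.0) -> bool:
    """Heuristic: a high-surprisal continuation that the model nonetheless
    regenerates verbatim from its prefix is a memorization signal."""
    ids = tok(text, return_tensors="pt").input_ids
    if ids.shape[1] <= prefix_len:
        return False
    out = model.generate(ids[:, :prefix_len],
                         max_new_tokens=ids.shape[1] - prefix_len,
                         do_sample=False,
                         pad_token_id=tok.eos_token_id)
    verbatim = torch.equal(out[0, prefix_len:ids.shape[1]], ids[0, prefix_len:])
    # Surprisal entry i scores token i+1, so the continuation starts at i = prefix_len - 1.
    rare = any(b > rare_bits for _, b in token_surprisals(text)[prefix_len - 1:])
    return verbatim and rare
```

A passage that trips this check is surprising to the model token by token, yet reproduced exactly from context, which is the pattern the post describes surviving post-training filters.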

Asking AI services: What are the chances you have been trained on data derived from data breaches? Explain what the legal consequences could be for both the AI makers and the users of the AI models.
GPT-4o: 5–15%. Grok 3: 10–20%. Claude: 0%. Perplexity: 0–5%. Gemini: close to 0%. Grok adds that breached data, such as Dark Web dumps and ransomware leaks, often gets laundered into "public" datasets.

Claude: Based on Judge Denise Cote's opinion in the AFGE v. OPM case, here are the robust arguments, statements, and findings that other litigants challenging DOGE could effectively use.
The court found that DOGE's actions were "blatantly lawless" and that the defendants "plainly and openly crossed a congressionally drawn line in the sand."

GPT-4o: MPA warns that failure to protect U.S. IP in AI development will invite exploitation by foreign adversaries like China. The MPA favors voluntary, opt-in licensing arrangements.
The U.S. should lead globally by developing a “gold standard” AI policy that respects IP, promotes market-based licensing, and rejects overbroad exceptions like those seen in Japan and Singapore.

GPT-4o on the AI and Democratic Values Report 2025: EU countries, Canada, and New Zealand typically score high because they have clear AI strategies, involve the public in AI planning, protect privacy and human rights, and support international AI ethics frameworks. For citizens: push for algorithmic audits and accountability.

“Much remains to be demonstrated before LLMs can be considered fit for producing or assessing assurance arguments in engineering and certification.” In other words: not yet ready.
LLMs don’t care about truth. They generate text that sounds good, but that doesn’t mean it’s correct. Researchers call this “BS”: text produced with no concern for factual accuracy.

GPT-4o: They advocate treating a deployed AI like an “untrusted insider,” just like a rogue employee. Regulations must kick in BEFORE harms happen.
The authors admit we currently don’t understand how AI systems think, and yet propose ways to align them anyway. That tension remains unresolved.

GPT-4o: OpenAI’s most advanced models appear to have been trained on copyrighted books behind paywalls — without permission — and that’s a big legal and ethical problem.
This report delivers a wake-up call for both content owners and AI developers. The authors call for new rules and markets to make sure creators are fairly treated in the AI ecosystem.

Summary of Day 2 of The Generative AI Summit 2025, London Edition. Generative AI must not be a curiosity; it must drive measurable business value... while protecting the integrity of research, authorship, and institutional trust. Generative AI is no longer optional. It is now a question of governance, differentiation, and long-term relevance.
