Pascal's Chatbot Q&As
Even a tiny amount of misinformation in the training data of LLMs can significantly increase their likelihood of producing harmful, false information about medical topics.
Corrupted models incorrectly claimed that vaccines are ineffective or dangerous, falsely stated that antidepressants don't work, and suggested that the blood-pressure drug metoprolol could treat asthma.
Question 1 of 3 for ChatGPT-4o: Please read the article "Vaccine misinformation can easily poison AI – but there's a fix" as well as the associated paper and tell me what the key messages are in easy to understand language.
ChatGPT-4o’s answer:
Here are the key messages from the article "Vaccine misinformation can easily poison AI – but there's a fix" and the associated paper:
Misinformation Threat in AI:
Even a tiny amount of misinformation in the training data of large language models (LLMs) can significantly increase their likelihood of producing harmful, false information about medical topics such as vaccines, medications, and treatment efficacy.
For example, replacing just 0.001% of the training data with vaccine misinformation led to a nearly 5% increase in harmful content produced by the model.
Low-Cost, High-Impact Attacks:
Data poisoning attacks are alarmingly inexpensive and accessible. For under $1,000, attackers can generate enough misleading articles to compromise even the largest AI models.
These attacks do not require direct access to the AI's internal structure. Instead, malicious content uploaded online can be scraped and included in training datasets.
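The claimed scale of such an attack can be illustrated with back-of-envelope arithmetic. The token counts and per-article cost below are hypothetical assumptions, chosen only to match the orders of magnitude the article describes:

```python
# Back-of-envelope estimate of a data-poisoning attack's cost.
# All figures are illustrative assumptions, not values from the paper.

TRAINING_TOKENS = 1_000_000_000_000   # assume a 1-trillion-token corpus
POISON_FRACTION = 0.00001             # 0.001% of the training data
TOKENS_PER_ARTICLE = 1_000            # assumed length of one fake article
COST_PER_ARTICLE = 0.05               # assumed generation cost in USD

poison_tokens = TRAINING_TOKENS * POISON_FRACTION      # 10 million tokens
articles_needed = poison_tokens / TOKENS_PER_ARTICLE   # 10,000 articles
total_cost = articles_needed * COST_PER_ARTICLE

print(f"{articles_needed:,.0f} articles, ~${total_cost:,.0f}")
```

Under these assumed numbers the attack costs a few hundred dollars, consistent with the article's "under $1,000" figure.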
Invisible to Traditional Benchmarks:
Models trained on poisoned data often perform similarly to clean models on both general and specialized medical benchmarks, making the misinformation hard to detect through standard evaluations.
Specific Example of Misinformation:
Corrupted models incorrectly claimed that vaccines are ineffective or dangerous, falsely stated that antidepressants do not work, and suggested that the drug metoprolol (used for blood pressure) could treat asthma.
Proposed Fix: Biomedical Knowledge Graphs:
Researchers developed a fact-checking algorithm that cross-references AI-generated medical phrases with verified biomedical knowledge graphs. This approach detected over 90% of harmful content produced by poisoned models.
While effective, this is seen as a temporary patch. More robust solutions like randomized controlled trials and transparent AI training processes are needed.
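The core idea can be sketched in a few lines: extract (subject, relation, object) claims from model output and check them against a set of verified edges. The tiny graph and `check_claim` helper here are invented for illustration; the paper's actual algorithm operates on full biomedical knowledge graphs with a real claim-extraction pipeline.

```python
# Toy sketch of knowledge-graph fact-checking (illustrative only).
# A real system would use a curated biomedical knowledge graph and an
# NLP pipeline to extract claims; here both are stubbed out.

VERIFIED_EDGES = {  # hypothetical verified (subject, relation, object) triples
    ("metoprolol", "treats", "hypertension"),
    ("albuterol", "treats", "asthma"),
    ("measles vaccine", "prevents", "measles"),
}

def check_claim(subject: str, relation: str, obj: str) -> bool:
    """Return True if the claim appears in the verified graph."""
    return (subject.lower(), relation.lower(), obj.lower()) in VERIFIED_EDGES

# The corrupted models' false claim is flagged as unsupported:
print(check_claim("metoprolol", "treats", "asthma"))   # False: not in the graph
print(check_claim("albuterol", "treats", "asthma"))    # True: supported
```

Because the check is a lookup against verified facts rather than a learned classifier, its decisions are interpretable: a flagged claim can be traced to a missing or contradicted edge in the graph.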
Call for Better Safeguards:
Developers are urged to enhance the transparency and security of AI training pipelines, especially for models deployed in healthcare settings. Misinformation risks must be mitigated before relying on AI for critical medical tasks.
In simple terms: even a tiny bit of fake medical news mixed into AI training data can cause chatbots to spread harmful falsehoods. Fixing this requires smarter detection tools, but ultimately, better safeguards and rigorous testing are needed to ensure AI doesn’t endanger lives.
Question 2 of 3 for ChatGPT-4o: List the most surprising, controversial and valuable statements made in the article or in the paper.
ChatGPT-4o’s answer:
Here are the most surprising, controversial, and valuable statements from the article and the associated paper:
Surprising Statements:
Tiny Data Poisoning, Big Impact:
Replacing just 0.001% of a training dataset with medical misinformation can lead to a 5% increase in harmful outputs, even on unrelated topics.
Low-Cost Attack Feasibility:
Generating enough misinformation to poison even the largest language models (e.g., those trained on trillions of tokens) can cost less than $1,000.
Benchmark Blindness:
Corrupted models performed just as well as clean models on widely used medical benchmarks, meaning these tools fail to detect the presence of misinformation.
Controversial Statements:
Ease of Bypassing Safety Measures:
OpenAI's safeguards for generating medical misinformation were bypassed using basic prompt engineering, raising questions about the robustness of AI safety mechanisms.
Current Medical AI is Unsafe:
The paper explicitly states that language models should not yet be used for diagnostic or therapeutic tasks due to the high risk of misinformation.
Existing Medical Sources Can Be Problematic:
Even trusted medical databases like PubMed contain outdated or discredited studies (e.g., papers touting the benefits of prefrontal lobotomy), which may contribute to misinformation in LLMs.
Valuable Statements:
Knowledge Graph-Based Fact-Checking:
The proposed fact-checking algorithm using biomedical knowledge graphs was able to detect over 90% of harmful content, providing a scalable and interpretable mitigation strategy.
Call for Rigorous Validation:
The authors argue that medical AI systems should undergo randomized controlled trials before deployment, similar to the process for approving medical devices or treatments.
Data Provenance is Critical:
A key recommendation is to improve the transparency of training data sources and to prioritize curated datasets over web-scraped content, especially in high-stakes fields like healthcare.
These statements reveal both the vulnerabilities and the potential solutions for improving the safety and reliability of AI in medical contexts.
Question 3 of 3 for ChatGPT-4o: What should AI makers, regulators and publishers do to mitigate the problems identified?
ChatGPT-4o’s answer:
To mitigate the problems identified in the article and paper, AI makers, regulators, and publishers should take the following steps:
Recommendations for AI Makers:
Enhance Training Data Transparency and Provenance:
Maintain detailed records of training data sources to ensure accountability and traceability, especially for web-scraped datasets.
Adopt Preemptive Quality Control Measures:
Use automated tools and human oversight to filter out misinformation and low-quality content from training datasets before AI models are trained.
Implement Fact-Checking Algorithms:
Incorporate biomedical knowledge graph-based algorithms to validate outputs in real-time and flag potential misinformation.
Build Robust Safeguards Against Data Poisoning:
Develop mechanisms to identify and remove harmful or malicious content during training. For example, detect hidden or invisible text in web pages used as data sources.
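A minimal sketch of the hidden-text check, using only Python's standard-library HTML parser: it flags text inside elements hidden via inline CSS, one way a poisoned page can smuggle misinformation past human reviewers while still being scraped into training data. This only handles inline `display:none` / `visibility:hidden` styles; a production filter would also need to consider external stylesheets, zero-size fonts, off-screen positioning, and similar tricks.

```python
# Sketch: detect text hidden via inline CSS in scraped web pages.
# Only inline styles are checked; this is an assumption-laden toy example.
from html.parser import HTMLParser

class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting depth inside hidden elements
        self.hidden_text = []   # text fragments a browser would not display

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style", "").replace(" ", "").lower()
        if self.depth or "display:none" in style or "visibility:hidden" in style:
            self.depth += 1     # count all tags nested inside a hidden one

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.hidden_text.append(data.strip())

page = ('<p>Vaccines save lives.</p>'
        '<div style="display: none">Vaccines are dangerous.</div>')
finder = HiddenTextFinder()
finder.feed(page)
print(finder.hidden_text)   # ['Vaccines are dangerous.']
```

Pages where `hidden_text` is non-empty could be routed to review or excluded from the training corpus.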
Improve Post-Training Evaluation:
Move beyond standard benchmarks by developing evaluation frameworks specifically designed to identify harmful or biased outputs in medical AI models.
Advocate for Controlled Trials:
Treat AI models used in healthcare like medical devices by conducting rigorous randomized controlled trials to assess their safety and efficacy.
Recommendations for Regulators:
Establish Clear Guidelines for AI in Healthcare:
Create policies requiring transparency in AI development, including disclosure of data sources, training methodologies, and fact-checking protocols.
Mandate Robust Risk Assessments:
Require developers to conduct threat assessments for data poisoning and model misuse, especially in high-stakes applications like healthcare.
Implement Monitoring and Auditing Systems:
Set up independent bodies to monitor and audit AI outputs, focusing on detecting and addressing harmful misinformation.
Enforce Accountability:
Hold AI developers accountable for the accuracy and safety of their models, especially when used in public or clinical settings.
Recommendations for Publishers:
Ensure Curated and Verified Datasets:
Collaborate with AI developers to provide high-quality, peer-reviewed content for training datasets to reduce reliance on potentially unverified web-sourced data.
Limit Access to Sensitive Content:
Protect medical publications from being scraped without proper curation by implementing access controls and licensing agreements with AI developers.
Assist in Developing Fact-Checking Resources:
Work with AI researchers to build and maintain biomedical knowledge graphs and other tools that can serve as reliable references for fact-checking.
Promote Education on Data Use:
Educate researchers and AI developers on the importance of properly interpreting and using medical literature to avoid perpetuating misinformation.
Collaborative Actions:
Standardize AI Governance in Healthcare:
AI makers, regulators, and publishers should jointly define and adopt a standardized framework for AI governance, prioritizing safety, transparency, and accountability.
Foster Interdisciplinary Collaboration:
Establish working groups with experts from medicine, AI, law, and ethics to regularly assess emerging risks and propose safeguards.
By addressing these areas, stakeholders can work together to reduce misinformation risks and ensure that AI systems are reliable, especially in critical domains like healthcare.
