DeLLMa and the Human Problem of Uncertainty in AI Decision-Making
by ChatGPT-4o
In a world increasingly shaped by artificial intelligence, one of the most pressing questions facing developers, users, and policymakers is this: Can machines make decisions under uncertainty in a way that is trustworthy, auditable, and aligned with human values? The research behind DeLLMa (Decision-making Large Language Model assistant) and the accompanying discussion with USC’s Willie Neiswanger offers a groundbreaking response: yes — if we combine classical decision theory with the reasoning capabilities of large language models (LLMs).
1. What Is DeLLMa?
DeLLMa is a framework developed at USC that enhances the decision-making capabilities of LLMs under conditions of uncertainty. It improves on the standard prompting techniques used with LLMs by integrating four structured steps rooted in classical decision theory:
Identifying relevant unknowns
Forecasting those unknowns using context
Eliciting a utility function aligned with the user's goals
Maximizing expected utility to identify the best action
This process mirrors how humans make rational decisions but leverages the scale and speed of AI to simulate, evaluate, and rank options. In tests involving agricultural planning and stock market investing, DeLLMa improved decision-making accuracy by up to 40% over standard prompting techniques.
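To make the four steps concrete, here is a minimal, self-contained sketch of what a DeLLMa-style decision loop could look like in code. The three "llm_*" helpers are deterministic stand-ins for what would in practice be structured prompts to a language model; their names, signatures, and toy numbers are invented for this sketch and are not DeLLMa's actual API.

```python
# Hypothetical sketch of a DeLLMa-style decision loop, not the real implementation.

def llm_identify_unknowns(context: str) -> list[str]:
    # Step 1 stand-in: a real system would prompt the LLM to list
    # the latent factors that drive the decision's outcome.
    return ["market_demand"]

def llm_forecast(context: str, unknowns: list[str]) -> dict[str, float]:
    # Step 2 stand-in: a real system would prompt the LLM (plus
    # retrieved data) for a probability over realizations of the unknowns.
    return {"demand_high": 0.6, "demand_low": 0.4}

def llm_elicit_utility(context: str, action: str, state: str) -> float:
    # Step 3 stand-in: a real system would prompt the LLM to score
    # each (action, state) pair against the user's stated goals.
    toy_scores = {
        ("expand", "demand_high"): 10.0, ("expand", "demand_low"): -4.0,
        ("hold",   "demand_high"):  3.0, ("hold",   "demand_low"):  2.0,
    }
    return toy_scores[(action, state)]

def decide(context: str, actions: list[str]) -> str:
    unknowns = llm_identify_unknowns(context)
    state_probs = llm_forecast(context, unknowns)
    # Step 4: rank candidate actions by expected utility.
    def expected_utility(action: str) -> float:
        return sum(p * llm_elicit_utility(context, action, s)
                   for s, p in state_probs.items())
    return max(actions, key=expected_utility)

print(decide("capacity planning under uncertain demand", ["expand", "hold"]))
# expand: 0.6*10 + 0.4*(-4) = 4.4;  hold: 0.6*3 + 0.4*2 = 2.6  -> "expand"
```

The point of the structure is that each step produces an inspectable artifact (a list of unknowns, a forecast, a utility table), which is exactly what makes the final recommendation auditable.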
2. Why AI Struggles with Uncertainty
As Neiswanger explains in his USC Today interview, current LLMs tend to present outputs with an air of confidence, even when they’re uncertain. Unlike humans, they rarely express probabilistic thinking or acknowledge the limits of their knowledge — critical behaviors for responsible decision-making in high-stakes environments like finance, medicine, or law.
Moreover, LLMs often fail to distinguish between predictions drawn from strong evidence and speculative guesses. This brittleness can be dangerous: a confident but incorrect AI recommendation in a medical or legal setting may lead to significant harm.
3. What Makes DeLLMa Different?
DeLLMa addresses this gap through explicit modeling of uncertainty. It does not merely predict outcomes — it forecasts probabilities and ranks outcomes by expected utility. This allows for more nuanced recommendations that account for trade-offs, risks, and human preferences.
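In classical decision-theoretic notation (the standard textbook form, not notation quoted from the DeLLMa paper), the recommended action is the one that maximizes expected utility over the forecast states:

```latex
a^{*} = \arg\max_{a \in A} \sum_{s \in S} P(s \mid \text{context}) \, U(a, s)
```

where \(A\) is the set of candidate actions, \(S\) the set of forecast states, \(P\) the model's probability forecast, and \(U\) the utility function elicited from the user.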
For example, in agriculture, DeLLMa uses USDA data to forecast fruit yield and prices, recommending the crop with the highest expected return based on historical data and contextual analysis. In finance, it assesses historical stock data and estimates future performance with probabilistic models. In both cases, it generates not just a decision, but a rationale for that decision that can be audited and questioned.
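As a toy illustration of the agriculture case, consider the hedged sketch below; the crops, yields, prices, and probabilities are invented for the example and are not the USDA figures the paper actually uses.

```python
# Toy expected-return calculation in the spirit of DeLLMa's agriculture
# experiment. All numbers are invented for illustration; the real system
# forecasts them from USDA data and context.

# Forecast price scenarios per crop: (probability, price per ton).
price_scenarios = {
    "apples": [(0.5, 800), (0.5, 500)],   # volatile market
    "pears":  [(0.9, 600), (0.1, 550)],   # stable market
}
expected_yield_tons = {"apples": 10, "pears": 11}

def expected_revenue(crop: str) -> float:
    # E[revenue] = E[yield] * sum_i p_i * price_i
    return expected_yield_tons[crop] * sum(p * price for p, price in price_scenarios[crop])

best = max(price_scenarios, key=expected_revenue)
for crop in price_scenarios:
    print(f"{crop}: expected revenue ${expected_revenue(crop):,.0f}")
print(f"recommendation: plant {best}")
# apples: 10 * (0.5*800 + 0.5*500) = $6,500
# pears:  11 * (0.9*600 + 0.1*550) = $6,545  -> pears win despite the lower top price
```

Note that the less glamorous crop wins here precisely because the calculation weighs downside scenarios rather than chasing the single best-case price, which is the kind of trade-off reasoning a confident point prediction would miss.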
4. Why This Matters: Broader Implications
The ability of AI to handle uncertainty is not an academic abstraction — it is a core competency needed for real-world deployment. As AI systems are increasingly embedded in critical infrastructure, logistics, healthcare, education, and governance, they must be capable of navigating incomplete data, weighing competing values, and articulating their decision process.
Immediate applications of DeLLMa-like systems include:
Medical diagnostics and treatment planning: supporting doctors in weighing probabilistic outcomes.
Business strategy: market forecasting, risk assessment, and investment decisions.
Scientific discovery: optimizing experiments where data is costly or slow to obtain.
Public policy and climate modeling: decisions with far-reaching consequences and inherent uncertainty.
5. The Human Element: Trust and Auditability
DeLLMa is not just a technical advancement — it is a step toward responsible AI. By making uncertainty explicit, incorporating stakeholder preferences, and enabling audit trails, it supports the development of AI systems that can be trusted to assist (not replace) humans in consequential decisions. As Neiswanger notes, “letting computers find the mathematically ‘best’ solution might miss important human values or constraints.” DeLLMa respects both logic and ethics.
Conclusion: Why This Work Is Important
DeLLMa represents a new frontier in AI: one where decision-making is no longer about confident predictions, but about reasoned choices under uncertainty. It moves AI closer to how humans really think — not in absolutes, but in probabilities, trade-offs, and values.
In a time when LLMs are being deployed at scale in sensitive environments, the DeLLMa framework offers a crucial safeguard: ensuring that the systems we build are not only smart, but wise.
This matters because the future of AI is not just about capability, but judgment. DeLLMa gives us a blueprint for how to embed judgment into our most powerful machines. Every AI developer, regulator, and business deploying LLMs should take note.
Recommendations:
For AI developers: Adopt frameworks like DeLLMa to ensure better uncertainty modeling in products.
For enterprises: Demand human-auditable AI systems that offer transparent reasoning, especially in decision-critical applications.
For regulators: Encourage or mandate explainability and probabilistic reasoning in LLM deployments in healthcare, finance, and public services.
For researchers: Expand DeLLMa’s approach to real-time and multi-agent decision environments (e.g., autonomous systems, emergency response, legal assistance).
This is not just good AI — it’s responsible, human-centered AI.
