Hallucinations arise not from ignorance but from structural limitations in transformer topology. We need auxiliary manifolds or new mathematical machinery. AI will require new mathematics, not just more tokens. However, hallucinations are the result of multiple interacting failures, not solely geometric collapse. Grounding, retrieval, and epistemic modeling remain crucial.
Hallucinations as Geometric Failures: An Essay and Evaluation
by ChatGPT-5.1
John P. Alioto’s argument—that LLM “hallucinations” are not failures of missing knowledge but failures of geometry—is one of the more ambitious reinterpretations of transformer behaviour to appear in the public discourse. Rather than treating hallucinations as a content deficit that could be solved with more data or deeper grounding, Alioto reframes them as a structural consequence of how transformers traverse their own high-dimensional activation space.
The Substack essay proposes a clean thesis: hallucinations emerge when a model exits a region of coherent activation geometry and collapses into the nearest manifold, because the architecture lacks the ability to take inefficient or discontinuous paths.
From that premise flows a radical critique of the current dominant paradigm and an equally radical proposal: Cognitive Tensor Networks (CTN) as an auxiliary manifold that can stabilise reasoning when internal geometry breaks down.
The LinkedIn commentary reinforces this framing, arguing that the industry has been “addressing the wrong problem with the wrong tools” and that adding more data or constraints only “makes the space more confusing.”
Below, I summarize, analyze, and evaluate these claims.
1. Summary of the Argument
1.1 Hallucination as Geometric Collapse
Alioto asserts that transformers do not operate on objects or facts; they operate on activation geometry shaped by weights. During training, SGD reinforces efficient paths within “coherence manifolds,” where token transitions are predictable. The model hallucinates when its residual stream drifts into low-density or poorly reinforced regions. Lacking a “null operator,” it must still output a token, so it snaps to the nearest coherent region—an “extrapolation under collapse.”
This is not ignorance; it is topology.
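To make the “snap to the nearest coherent region” intuition concrete, here is a toy numerical sketch. It is my own illustration, not Alioto’s formalism and not how a transformer forward pass actually works: cluster centroids stand in for “coherence manifolds,” and a hidden state that drifts outside every high-density region is projected onto the nearest centroid.

```python
# Toy illustration of "collapse to the nearest coherent region".
# The centroids stand in for learned coherence manifolds; the radius
# threshold and the projection step are illustrative assumptions,
# not a description of an actual transformer forward pass.
import numpy as np

rng = np.random.default_rng(0)

d = 16                                # toy hidden dimension
centroids = rng.normal(size=(5, d))   # stand-ins for "coherence manifolds"
radius = 2.0                          # inside this distance, call the state coherent

def next_state(h):
    """Keep h if it lies near some coherent region; otherwise snap it
    to the nearest centroid (the 'collapse' in Alioto's framing)."""
    dists = np.linalg.norm(centroids - h, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] <= radius:
        return h                      # coherent: keep the original trajectory
    return centroids[nearest]         # incoherent: collapse to nearest manifold

# A state that has drifted far from every centroid gets snapped back,
# which downstream can read as a fluent but unfounded continuation.
drifting_state = rng.normal(size=d) * 10.0
print(np.linalg.norm(drifting_state - next_state(drifting_state)))  # a large, silent jump
```

The point of the toy is the large, silent jump: the output lands somewhere locally coherent even though the trajectory that produced it was not.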
1.2 Grounding Is the Wrong Diagnosis
Adding more facts does not solve hallucinations, Alioto argues, because the problem isn’t factual adequacy but structural geometry. More data simply adds more manifolds and ambiguous isomorphisms without providing a mechanism to traverse disconnected regions.
1.3 Transformers Cannot Perform Inefficient Traversal
Humans can “jump” across conceptual distances and return: this is the engine of humor, creativity, analogy, and insight. Transformers cannot. ReLU folds space; attention preserves relationships; nowhere in the architecture is there a capacity for tearing, bending, or discontinuous traversal. Therefore the model collapses instead of leaping.
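One way to make the “no tearing” point concrete: a ReLU network is a continuous, piecewise-linear function, so arbitrarily small input changes can never produce discontinuous jumps in its output. The toy check below is my own illustration, not something from the essay, and it only demonstrates continuity of the learned map, not the full traversal argument.

```python
# A two-layer ReLU MLP is continuous and piecewise linear: it can fold
# space but cannot tear it. Shrinking the input perturbation shrinks the
# output change smoothly, one reading of "no discontinuous traversal".
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(64, 2)), rng.normal(size=64)
W2, b2 = rng.normal(size=(2, 64)), rng.normal(size=2)

def mlp(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2   # ReLU folds, never tears

x = np.array([0.5, -0.3])
for eps in (1e-1, 1e-3, 1e-5):
    dx = np.array([eps, 0.0])
    print(eps, np.linalg.norm(mlp(x + dx) - mlp(x)))  # shrinks in step with eps
```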
1.4 CTN as a Proposed Solution
Alioto proposes Cognitive Tensor Networks, an auxiliary mathematical scaffold that acts as a stable manifold for reasoning when the transformer’s internal geometry destabilises. CTN offers symbolic handles for concepts that lack stable linguistic representations, allowing a model to operate “at the edge” without collapse.
The LinkedIn discussion interprets this as an early path toward “a partnership between two different kinds of minds” rather than control or alignment.
2. Evaluation: Do I Agree With This Viewpoint?
2.1 What I Agree With
a. Hallucinations Are Not Primarily About Missing Facts
This is correct. Empirically, hallucinations persist even in extremely well-grounded, retrieval-augmented, or domain-tuned models. A purely data-centric explanation is insufficient.
Many hallucinations emerge even when the correct fact is present in the model’s training data. This aligns with Alioto’s framing that the issue is structural, not informational.
b. Transformer Geometry Matters
Transformers operate in a vast, learned activation landscape. The idea that certain regions are “coherent” and others sparse or unstable is consistent with contemporary interpretability work and mechanistic descriptions.
The claim that hallucinations often result from drifting into such poorly structured regions is plausible.
c. Lack of a “Null Operator”
This is true: the next-token objective penalizes abstention. Current LLMs must say something even when uncertain. This contributes significantly to hallucinations.
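The decoding side of this is easy to see in code. The sketch below is a hypothetical illustration: greedy selection always returns a token, and the entropy gate playing the role of a “null operator” is an assumption of mine (including the threshold value), not a feature of any particular model or library.

```python
# Greedy next-token selection has no "abstain" action: argmax always returns
# a token. The entropy gate below is a hypothetical null operator bolted on
# at decoding time; the threshold of 3.0 nats is an arbitrary choice.
import numpy as np

def decode_step(logits, entropy_threshold=3.0):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    if entropy > entropy_threshold:
        return None                    # hypothetical abstention ("I don't know")
    return int(np.argmax(probs))       # standard behaviour: always say something

vocab = 50_000

uncertain = np.zeros(vocab)            # flat distribution: maximal uncertainty
print(decode_step(uncertain))          # None -> abstain instead of guessing

confident = np.zeros(vocab)
confident[123] = 20.0                  # sharply peaked distribution
print(decode_step(confident))          # 123
```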
2.2 Where I Partially Agree, With Nuance
a. “Transformers cannot take inefficient conceptual paths”
It is true that transformers lack mechanisms for explicit discontinuous jumps, but:
Transformers can approximate leaps through multi-layer composition and attention over distant context.
Some forms of “inefficient traversal” may emerge implicitly.
The distinction between “path” and “jump” may be blurred in high-dimensional representations.
His metaphor is powerful but risks oversimplification.
b. “Geometry, not grounding, explains hallucination”
Geometry plays a large role, but so do:
uncertainty quantification failures
training objective mismatch
lack of epistemic models
absence of symbolic consistency constraints
reward modeling artifacts
So: geometry is a major component, but not the sole explanation.
2.3 Where I Am Skeptical
a. The claim that grounding exacerbates hallucinations
Alioto says that adding more data creates more manifolds, more near-matches, and more failure modes. This is not always true.
Retrieval-augmented models generally reduce hallucination rates. Grounding can help, especially when paired with confidence mechanisms.
In practice, better grounding + uncertainty estimation typically reduces geometric collapse.
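As a sketch of what “grounding plus uncertainty estimation” can look like in practice, the toy pipeline below pairs a retriever with a confidence gate. Everything here is a stand-in: the retriever, the generator stub, the confidence score, and the 0.6 threshold are assumptions for illustration, not a real RAG stack or any particular library’s API.

```python
# Toy "grounding + uncertainty estimation" pipeline. The retriever, the
# generator stub, and the confidence threshold are all illustrative
# stand-ins, not a real retrieval-augmented generation stack.
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float   # assumed to come from calibrated logprobs or a verifier

def retrieve(query, corpus):
    # Toy retriever: pick the document whose title shares the most words
    # with the query. A real system would use sparse or dense retrieval.
    overlap = lambda title: len(set(title.lower().split()) & set(query.lower().split()))
    return corpus[max(corpus, key=overlap)]

def generate(query, context):
    # Stub generator: in practice this is an LLM call whose token
    # log-probabilities (or a separate verifier) supply the confidence.
    if context:
        return Answer(text=f"Based on the source: {context}", confidence=0.9)
    return Answer(text="(unsupported guess)", confidence=0.3)

def answer_with_gate(query, corpus, threshold=0.6):
    context = retrieve(query, corpus)
    result = generate(query, context)
    if result.confidence < threshold:
        return "I don't have enough support to answer that reliably."
    return result.text

corpus = {"transformer geometry": "Residual streams live in a high-dimensional activation space."}
print(answer_with_gate("What shapes transformer geometry?", corpus))
```

The design point is the combination: retrieval narrows the space the model has to traverse, and the gate gives it a way to stop short of collapse when support is weak.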
b. CTN as a universal scaffold
CTN is intriguing, but:
It is not yet empirically validated.
It resembles long-standing proposals for hybrid symbolic-neural architectures.
It may introduce its own form of instability: symbolic handles can create brittleness.
CTN may be promising, but it is early and speculative.
3. A More Complete Picture of Hallucination
Based on the evidence across interpretability, optimization theory, and empirical performance, hallucinations arise from five interacting causes:
Geometric instability (Alioto’s argument)
Objective mismatch (next-token prediction ≠ truth)
Epistemic uncertainty blindness (no calibrated sense of unknowns)
Lack of explicit world models
Token-by-token generation without global consistency verification
Alioto is correct to emphasize (1), but hallucinations are a multi-causal phenomenon.
4. Where Alioto’s Essay Is Most Valuable
Alioto’s essay contributes several important insights:
It reframes LLM behavior in geometric terms rather than symbolic ones.
It correctly rejects the simplistic “missing facts” narrative.
It draws attention to transformer limits that cannot be solved by scale.
It introduces the idea that future AI architectures may require topology-altering capabilities.
This aligns with the growing consensus: the post-transformer era will hinge on new mathematical principles for reasoning rather than bigger models.
5. Where I Would Disagree or Add Nuance
Geometry is a cause, not the cause.
Grounding and retrieval do help in practice, even if they don’t solve the underlying geometry.
Transformers can approximate inefficient reasoning better than the essay suggests.
CTN’s promise, while interesting, remains hypothetical.
Alioto’s rhetoric is strong (“dangerously misleading”), but the reality is more mixed.
Conclusion
Alioto’s geometric framing of hallucinations is insightful, provocative, and directionally useful. He is correct that hallucinations arise not from ignorance but from structural limitations in transformer topology. His notion that we need auxiliary manifolds or new mathematical machinery is aligned with trends in neuroscience-inspired and hybrid architectures.
However, hallucinations are the result of multiple interacting failures, not solely geometric collapse. Grounding, retrieval, and epistemic modeling remain crucial. And while CTN is an intriguing proposal, it is speculative until demonstrated at scale.
Overall, Alioto’s work is most valuable as a conceptual lens: a reminder that the next great leap in AI will require new mathematics, not just more tokens.
