Copilot: Most existing work on AI in neuro-oncology only conducted phase I algorithmic evaluation, which cannot reflect the clinical utility of AI in assisting clinical users in the clinical workflow

Rigorous clinical evaluation of AI is a prerequisite to clinical deployment; the paper proposes a four-phase evaluation framework analogous to the conventional clinical trials for drugs or medical devices

Question 1 of 4 for MS Copilot: Please read the paper "Evaluating the clinical utility of artificial intelligence assistance and its explanation on the glioma grading task" and tell me what it says in easy to understand language

MS Copilot’s answer: I’ll try to summarize the paper for you. Here are the main points:

The paper is about a study that evaluated how artificial intelligence (AI) can help doctors grade brain tumors (gliomas) based on MRI scans.

The study recruited 35 neurosurgeons who read 25 MRI scans of glioma patients and gave their judgment on the tumor grade, with or without the assistance of AI prediction and explanation.

The AI model was trained on a public dataset of brain MRI scans and tumor grades, and achieved an accuracy of 88.0%. The AI explanation was a color map that highlighted the important regions for the AI prediction.
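The paper’s exact model, dataset, and training details are not reproduced in this post. Purely as an illustration of what such a pipeline roughly looks like, here is a minimal sketch assuming a standard convolutional backbone fine-tuned to classify a preprocessed MRI slice as lower-grade (LGG) or high-grade (HGG) glioma; the backbone choice, hyperparameters, and data handling are assumptions for illustration, not the paper’s actual setup.

```python
# Illustrative sketch only -- not the paper's actual model, data, or hyperparameters.
# Assumes MRI slices have been preprocessed into 3-channel 224x224 tensors.
import torch
import torch.nn as nn
from torchvision import models

# A generic backbone fine-tuned for binary glioma grading (LGG vs. HGG).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two tumor grades

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_one_epoch(loader):
    """One pass over a DataLoader yielding (mri_slice, grade_label) batches."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

@torch.no_grad()
def accuracy(loader):
    """Fraction of correctly graded cases, i.e. a phase I algorithmic metric."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```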

The results showed that doctors’ accuracy improved from 82.5% to 87.7% when they used the AI prediction, and stayed at a similar level (88.5%) when they also used the AI explanation. However, the AI explanation did not provide enough information to help doctors discern potentially incorrect AI predictions.

The study also measured doctors’ trust in and willingness to use AI, and found that both increased after doctors saw the AI performance metrics, but did not change further after they actually used the AI prediction and explanation.

The study concluded that AI can assist doctors in the glioma grading task, but also identified the limitations and gaps of existing AI explanation techniques for future improvement.

Question 2 of 4 for MS Copilot: List all problems, challenges and concerns mentioned in the paper and the associated consequences for use of AI in the healthcare industry

MS Copilot’s answer: Here is a list of some problems, challenges and concerns mentioned in the paper and the associated consequences for the use of AI in the healthcare industry:

  • AI evaluation gap: The paper points out that most existing work on AI in neuro-oncology has only conducted phase I algorithmic evaluation, which cannot reflect the clinical utility of AI in assisting clinical users in the clinical workflow. The paper argues that rigorous clinical evaluation of AI is a prerequisite to clinical deployment, and proposes a four-phase framework analogous to the conventional clinical trials for drugs or medical devices. The consequence of this gap is that AI may not be validated for its safety, efficacy, and effectiveness in real-world clinical settings, and may face barriers to adoption and implementation by healthcare providers and regulators.

  • AI interpretability problem: The paper acknowledges that the state-of-the-art AI models, namely deep neural networks, are black-box models, and their decision processes are incomprehensible even to AI engineers. The paper asserts that this impedes the clinical use of AI, as clinical users will often require an explanation or justification from AI other than a mere prediction, due to the high-stakes nature of clinical decision-making. The paper leverages the latest technical advances in explainable AI (XAI) as a feature of the AI system in the clinical evaluation study. The consequence of this problem is that AI may not be trusted, accepted, or used by clinical users, and may pose ethical, legal, and social issues in clinical practice.

  • AI explanation quality: The paper reveals the limitations and possible failure reasons of existing AI explanation techniques for clinical utility. The paper shows that the heatmap explanation used in the study did not provide additional information to facilitate physicians’ decision-making, and did not indicate when to rely on AI recommendations and when not to. The paper suggests that the existing heatmap explanation was missing critical information to construct a clinically relevant explanation, such as the pathology of the highlighted image features, and the justification of how the highlighted regions lead to the AI decision. The paper calls for future work to develop new XAI techniques that incorporate more clinically relevant information into the form of AI explanation. The consequence of this challenge is that AI may not be able to provide meaningful, transparent, and actionable guidance to clinical users, and may not achieve its desired clinical impact, such as complementary human-AI performance.

Question 3 of 4 for MS Copilot: What lessons can be learned from this by AI makers and regulators? Can these lessons be useful for other sectors?

MS Copilot’s answer: This is a good question. Based on the paper, some possible lessons for AI makers and regulators are:

  • AI evaluation is essential for clinical implementation. The paper describes a phase II clinical evaluation of AI assistance and its explanation on the glioma grading task, which is a simulated clinical task based on brain MRI. The evaluation showed the clinical utility of AI to assist physicians on the task, and also identified the limitations and clinical usage gaps of existing explainable AI techniques for future improvement. This suggests that AI makers and regulators should conduct rigorous and user-centered evaluations of AI systems before deploying them in clinical settings, and also monitor and update them after deployment.

  • AI regulation should be agile and pro-innovation. The paper refers to the UK’s AI Regulation White Paper, which sets out five principles for regulators to interpret and apply within their remit. These principles are: safety, security and robustness; appropriate transparency and explainability; fairness; accountability and governance; and contestability and redress. The UK’s approach to AI regulation aims to be agile and pro-innovation, allowing regulators to respond rapidly to emerging risks, while giving developers room to innovate and grow in the UK. This implies that AI makers and regulators should balance the trade-offs between innovation and regulation, and adopt a flexible and adaptive regulatory framework that can accommodate different AI applications and contexts.

  • AI explanation should be aligned with human reasoning and needs. The paper reports that the AI explanation produced by the SmoothGrad algorithm, a color map overlaid on the MRI to highlight the regions important for the AI prediction, did not significantly improve physicians’ task performance or achieve complementary human-AI team performance. The qualitative results revealed a mixed view of its clinical utility: physicians saw the heatmap as a useful tool for localizing important features and easy-to-miss lesions, but found that it only conveyed the location of important features, not their characteristics, the reasons they mattered, or their clinical context. This indicates that AI makers and regulators should design and evaluate AI explanations that are aligned with human reasoning and needs, and that provide more information to support clinical decision-making (a minimal sketch of how such a heatmap is typically computed follows below).
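For readers curious what a SmoothGrad-style heatmap actually is, the sketch below shows the core idea under stated assumptions: the gradient of the predicted class score with respect to the input is averaged over several noisy copies of the image, and the result is displayed as a saliency map. This is a generic illustration, not the paper’s implementation; the model, sample count, and noise level are assumptions.

```python
# Minimal SmoothGrad sketch (generic illustration; not the paper's implementation).
# Assumes `model` is a trained classifier and `image` a single input tensor
# of shape (1, C, H, W) on the same device as the model.
import torch

def smoothgrad_saliency(model, image, target_class, n_samples=25, noise_std=0.1):
    """Average absolute input gradients over noisy copies of the image."""
    model.eval()
    saliency = torch.zeros_like(image)
    for _ in range(n_samples):
        noisy = (image + noise_std * torch.randn_like(image)).requires_grad_(True)
        score = model(noisy)[0, target_class]        # scalar score for the target grade
        grad = torch.autograd.grad(score, noisy)[0]  # gradient of the score w.r.t. the input
        saliency += grad.abs()
    # Collapse channels and normalize to [0, 1] so it can be overlaid as a color map.
    heatmap = saliency.sum(dim=1, keepdim=True) / n_samples
    heatmap = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
    return heatmap  # overlay on the MRI slice to obtain the heatmap explanation
```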

These lessons may be useful for other sectors that involve AI applications, such as education, finance, or transportation, as they also require AI systems to be evaluated, regulated, and explained in a way that ensures their safety, accountability, and effectiveness. However, different sectors may have different domain-specific requirements and challenges, so AI makers and regulators should also consider the context and stakeholders of each sector when applying these lessons.

Question 4 of 4 for MS Copilot: What are the remedies suggested in the paper and can you think of any other solutions to the issues identified?

MS Copilot’s answer: The paper does not suggest any specific remedies for the challenges of AI evaluation and explainability in neuro-oncology, but it does point out some possible directions for future work, such as:

  • Conducting phase III randomized controlled trials of AI in real-world clinical settings on prospective cases, and assessing the safety and side effects of AI use.

  • Improving the existing XAI methods by incorporating more clinically relevant information into the form of AI explanation, such as combining text description with feature localization.

  • Developing new XAI techniques that can achieve complementary human-AI performance, where the human-AI team outperforms either the AI or the human alone (see the short check sketched after this list).

  • Seeking more clinical input to understand the clinical reasoning and clinical users’ requirements when incorporating AI assistance in their decision-making process.

  • Providing additional training or tutorial sessions to enable clinical users to understand the capabilities of AI, and to adopt additional cognitive strategies when interpreting the AI explanation.
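As a concrete illustration of the complementary-performance criterion mentioned above, the tiny sketch below simply checks whether the human-AI team’s accuracy exceeds both the AI-alone and human-alone accuracy. The numbers in the example are hypothetical and are not results from the paper.

```python
# Hypothetical illustration of the 'complementary performance' criterion;
# the accuracies below are made up and are not results from the paper.
def is_complementary(team_acc, ai_acc, human_acc):
    """True only if the human-AI team beats both the AI alone and the human alone."""
    return team_acc > max(ai_acc, human_acc)

print(is_complementary(team_acc=0.90, ai_acc=0.88, human_acc=0.83))  # True
print(is_complementary(team_acc=0.87, ai_acc=0.88, human_acc=0.83))  # False
```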

Some other possible solutions to the issues identified in the paper are:

  • Developing standardized and validated metrics and benchmarks for evaluating the clinical utility of AI and its explanation in neuro-oncology, and comparing different AI systems and XAI techniques against the same criteria (a small example of such metrics is sketched after this list).

  • Incorporating ethical, legal, and social implications of AI use in clinical settings, and ensuring the fairness, accountability, and transparency of AI systems and their explanations.

  • Designing user-friendly and intuitive interfaces for presenting AI predictions and explanations to clinical users, and allowing them to interact with and provide feedback to the AI system.

  • Exploring different ways of human-AI collaboration, such as allowing users to choose or switch among different modes of AI involvement in decision-making, and testing the effects of different sequences of exposure to AI prediction and explanation.
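To make the first of these suggestions more concrete, the short sketch below computes a few standard classification metrics (accuracy, plus sensitivity and specificity for the high-grade class) that could be reported consistently when comparing AI systems or XAI variants on the same test cases. It is a generic example, not a metric set defined by the paper, and the predictions shown are hypothetical.

```python
# Generic benchmarking sketch (not a metric set defined by the paper).
# Labels: 1 = high-grade glioma (HGG), 0 = lower-grade glioma (LGG).
def grading_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for the high-grade class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical predictions from two AI systems on the same five test cases.
truth = [1, 0, 1, 1, 0]
print(grading_metrics(truth, [1, 0, 1, 0, 0]))  # system A
print(grading_metrics(truth, [1, 1, 1, 1, 0]))  # system B
```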
