
In a recent paper, researchers at the University of Washington argue that a vital criterion for deploying medical AI is transparency: using a range of techniques to explain how a medical AI system arrives at its diagnoses and other outputs.
As debate continues over how generative artificial intelligence will reshape jobs, AI is already changing health care. AI systems are being used for a wide range of applications, including drug development, diagnostic tasks in radiology and drafting clinical notes. A recent survey of 2,206 medical professionals found that most are optimistic about AI’s potential to improve the efficiency and accuracy of health care, and nearly half of respondents had already used AI tools in their work.
But AI still has flaws, hallucinations, privacy problems and other ethical pitfalls, which carry significant risks when the technology is used for sensitive, high-stakes tasks. In a review article published Sept. 9 in Nature Reviews Bioengineering, University of Washington researchers argue that a fundamental standard for deploying medical AI is transparency: using a range of approaches to explain how a medical AI system produces its diagnoses and other outputs.
UW News spoke with three of the paper’s co-authors about the importance of transparency in medical AI: co-lead authors Chanwoo Kim and Soham Gadgil, both UW doctoral students in the Paul G. Allen School of Computer Science & Engineering, and senior author Su-In Lee, a faculty member in the Allen School.
What distinguishes ethical discussions in medical AI from broader AI ethics conversations?
Chanwoo Kim: Bias built into AI systems and the possibility of erroneous outputs are significant challenges anywhere, but especially in medicine, where they can directly affect people’s health and even drive life-altering decisions.
The cornerstone of mitigating these issues is transparency: being open about the data, training and evaluations that went into building a model. Knowing whether an AI model is biased starts with knowing what data it was trained on. The insights that come from such transparency can point to sources of bias and to ways of systematically addressing these risks.
Su-In Lee: A study from our lab is a good example. At the height of the COVID-19 pandemic, there was a flood of AI models that analyzed chest X-rays to predict whether a patient had COVID-19. In our research, we showed that many of these models were inaccurate: they reported near-perfect accuracy of 99% or 100% on some datasets, but that accuracy dropped sharply on datasets from external hospitals. This shows that AI models can struggle to generalize to real-world clinical settings. We applied a technique that revealed the models were relying on shortcuts: the corners of some X-ray images contained text markings, and the models were using those markings, which led to incorrect results. Ideally, we want models to focus on the X-ray images themselves.
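The paper and Lee’s study used their own explanation methods and trained COVID-19 classifiers, which are not reproduced here. As a rough, hypothetical illustration of the kind of check she describes, the sketch below runs a simple occlusion-based attribution on a stand-in PyTorch image classifier: if the largest probability drops sit in the image corners rather than over the lungs, that flags reliance on text markings.

```python
# Hypothetical sketch: occlusion-based attribution to check whether an
# image classifier leans on corner text markings rather than anatomy.
# The model and input are placeholders, not the models from the study.
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights=None, num_classes=2)  # stand-in classifier
model.eval()
xray = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed chest X-ray

def occlusion_map(model, image, patch=32, stride=32):
    """Slide a blank patch over the image and record how much the
    predicted probability for the top class drops at each location."""
    with torch.no_grad():
        probs = F.softmax(model(image), dim=1)
        target = probs.argmax(dim=1)
        base = probs[0, target]
        rows, cols = image.shape[2] // stride, image.shape[3] // stride
        heat = torch.zeros(rows, cols)
        for i in range(rows):
            for j in range(cols):
                occluded = image.clone()
                occluded[:, :, i*stride:i*stride+patch, j*stride:j*stride+patch] = 0.0
                prob = F.softmax(model(occluded), dim=1)[0, target]
                heat[i, j] = (base - prob).item()
    return heat

# Large values concentrated in the corner cells of this grid would be a
# red flag for shortcut learning on hospital text markings.
print(occlusion_map(model, xray))
```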
Your publication references “Explainable AI” as a means to achieve transparency. Can you elaborate on that?
SL: The field of Explainable AI emerged about a decade ago, as researchers worked to interpret the outputs of new, complex “black box” machine learning models.
As an example, consider a bank customer who wants to know whether they qualify for a loan. The bank will analyze extensive data about that person, such as age, occupation and credit score, and feed it into a model that predicts how likely they are to repay the loan. A “black box” model reveals only the outcome. But if the bank’s model offers insight into which factors drove its decision, people can better follow its reasoning. That is the essence of Explainable AI: helping people understand how AI systems reach their outputs.
There are many different methods, which we survey in our review paper. The one illustrated in the bank example is called a “feature attribution” method: it ties the model’s output back to its input features.
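As a rough, hypothetical illustration of feature attribution (not the specific methods surveyed in the review), the sketch below fits a toy loan-repayment model on invented features and attributes one applicant’s prediction to each input by swapping that feature for its dataset average. Principled tools such as Shapley-value-based attribution refine this idea, but the goal is the same: tying the output back to the input features.

```python
# Hypothetical sketch of a per-applicant feature attribution for the loan
# example. Feature names and data are invented for illustration.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.integers(21, 70, size=500),
    "income": rng.normal(60_000, 15_000, size=500),
    "credit_score": rng.integers(300, 850, size=500),
    "years_employed": rng.integers(0, 30, size=500),
})
# Toy label: repayment mostly depends on credit score and income.
y = ((X["credit_score"] > 650) & (X["income"] > 50_000)).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def attribute(model, X, applicant):
    """Crude attribution: replace each feature with its dataset average
    and record how much the predicted repayment probability changes."""
    base = model.predict_proba(applicant)[0, 1]
    contributions = {}
    for col in X.columns:
        perturbed = applicant.copy()
        perturbed[col] = X[col].mean()
        contributions[col] = base - model.predict_proba(perturbed)[0, 1]
    return base, contributions

prob, contribs = attribute(model, X, X.iloc[[0]])
print(f"Predicted repayment probability: {prob:.2f}")
for feature, delta in sorted(contribs.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {feature:>15}: {delta:+.2f}")  # which factors pushed the decision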
How can regulations mitigate some risks associated with medical AI?
CK: In the United States, the FDA regulates medical AI under its Software as a Medical Device (SaMD) framework. Regulators have recently focused on building a framework to enforce transparency. That includes clearly stating what an AI system is intended to do: specifying its use cases, along with its accuracy standards and limitations in real clinical settings, all of which depend on understanding how the model works. And because medical AI is used in clinical settings where conditions change dynamically, its performance can shift over time, so recent regulations also aim to ensure that medical AI models are continually monitored throughout their deployment.
Soham Gadgil: New medical devices and drugs go through stringent testing and clinical trials before FDA approval. AI systems should be held to similarly rigorous testing and standards. Our lab has shown that models that appear accurate in testing don’t always generalize to the real world.
In my view, many of the organizations developing these models have little incentive to prioritize transparency. The current paradigm holds that if a model scores well on certain benchmarks (standardized public test sets that AI organizations use to compare and rank their models), it is considered good enough to use and will likely see wide adoption. But that paradigm is incomplete, because these models can still hallucinate and produce erroneous information. Regulation can push the focus toward transparency alongside model performance.
What role do you envision clinicians playing in the promotion of AI transparency?
CK: Clinicians are essential to achieving transparency in medical AI. When a clinician uses an AI model to help with a diagnosis or treatment, it falls to them to explain the reasoning behind the model’s predictions, because they are ultimately accountable for the patient’s well-being. So clinicians need to be familiar with how AI models work, and even with basic Explainable AI techniques, well enough (if not perfectly) to explain the model’s reasoning to patients.
SG: We collaborate closely with clinicians on most of our lab’s biomedical research projects. Their insights guide what we should try to explain. They tell us whether Explainable AI solutions are accurate, whether they are relevant in health care, and ultimately whether these explanations will help patients and medical professionals.
What would you like the public to understand regarding AI transparency?
SL: We shouldn’t blindly trust what AI does. Chatbots sometimes make up information, and health care AI models can be wrong. Last year, in a separate study, we evaluated five dermatology AI apps that anyone can download from app stores: when people notice something unusual on their skin, they take a picture and the app judges whether it might be melanoma. We found that the results were often inaccurate, much like the COVID-19 AI systems. We used a new Explainable AI approach to show why these apps failed in the specific ways they did, and what lies behind those errors.
SG: The first step toward engaging critically with AI can be simple. If someone uses a generative model to ask for basic medical advice about a minor issue, they can simply ask the model to explain its answer. The explanation may sound plausible, but it shouldn’t be taken at face value: if it cites sources, the user should check that those sources are trustworthy and that the information is accurate. And for anything potentially serious, see a medical professional. You shouldn’t be asking ChatGPT whether you’re having a heart attack.
For additional details, contact Lee at [email protected].