Considerations for Equity and Regulations for AI Tools in Oncology


Testing AI-powered tools globally across diverse patient groups may ensure that they are accurate and consistent for all patients.

Regina Barzilay, PhD, distinguished professor of AI and Health in the Department of Computer Science at Massachusetts Institute of Technology (MIT) and AI Faculty lead at MIT Jameel Clinic

Regina Barzilay, PhD, distinguished professor of AI and Health in the Department of Computer Science at Massachusetts Institute of Technology (MIT) and AI Faculty lead at MIT Jameel Clinic, spoke with CancerNetwork® about considerations for ensuring that AI tools are equitable and effectively regulated to yield efficacy across diverse cancer populations. Additionally, she discussed AI-based programs that aim to expand the use of these technologies to collect data on a wider range of cancer types and geographical regions.

Barzilay first discussed the concept behind the Advanced Analysis for Precision Cancer Therapy (ADAPT) program, highlighting its aim to predict escape mutations and recommend appropriate therapy following their development. She then noted that the lack of NCCN guidelines for AI tools may enable practitioners to use models with lower accuracy and racial bias relative to more effective alternatives. Barzilay described her hope that greater time and experience with AI tools will favor more accurate and effective predictive models.

Furthermore, she outlined the background behind Learning to Cure, a program aimed at developing predictive AI models and retrospectively evaluating them in practices around the world. Highlighting the importance of testing AI models in a variety of geographical regions, ethnicities, and ages, Barzilay explained that the MIRAI model—a deep learning model assessing breast cancer risk based on mammograms—has been tested with 2 million patients, exhibiting consistency across populations.

Additionally, Barzilay touched upon a discrepancy between literature demonstrating the efficacy of emergent AI tools and their exclusion from NCCN guidelines, which may result in the utilization of outdated methodologies for patients seeking optimal cancer care. She further discussed techniques for ensuring the development of equitable and fair tools, highlighting a hesitancy in the field to address the widespread use of inequity-rich tools. Barzilay concluded by emphasizing the design of clinical protocols as a barrier to AI integration while expressing a need to apply the same regulatory scrutiny to AI tools as to other technologies developed in oncology.

CancerNetwork: Can you explain what ADAPT is and how it works?

Barzilay: ADAPT is a program that was started by the Advanced Research Projects Agency for Health (ARPA-H), and the idea was to adapt patient treatment after they have already been placed on therapy. As we know, especially for advanced cancer, the tumor mutates, and the treatment often becomes ineffective [with time]. When it is discovered that it is ineffective, the patient may have few choices.

What ADAPT is aiming to do is take all different types of patient measurements when they are placed on therapy––biopsy results, sequencing, imaging, and monitoring of blood throughout treatment––with the goal of predicting what kind of escape mutation is likely to develop, so one can predict when the patient will stop responding and recommend appropriate therapy for that patient.

Can you explain the lack of guidelines from the NCCN and the lack of coverage from insurance for AI use in oncology practice?

Barzilay: As far as I know, neither in breast nor lung [cancer] are there [NCCN guidelines on the] utilization of any AI tools. Maybe it would have been an okay situation if the current non-AI tools were sufficient to predict the likelihood that a patient develops cancer. What has been extensively documented in the literature already is that models like Tyrer-Cuzick and Gail have very low predictive accuracy and are racially biased.

As a result, we are currently recommending chemotherapy prevention, additional MRI screening, and other things based on inaccurate tools. The hope is that as the community continues to learn more about the power of AI tools, we will be able to stratify patients using state-of-the-art techniques and not techniques that we know are not working well and missing many patients who can benefit from an intervention.

Can you give some background on Learning to Cure?

Barzilay: The idea of Learning to Cure was not only to develop AI tools for screening and detecting the future likelihood of disease, but also to take these tools and retrospectively evaluate and test them in different hospitals around the world. The problem with the [current] screening tools and risk assessment tools is that humans cannot validate them at the time they are using them. The radiologist can tell when a patient has a particular mark, and they can say, “Yeah, it makes sense that this patient will be predicted as having potential cancer; they need a biopsy.” But when you are thinking about predicting future risk, [the cancer] is not there yet. When there is nothing that the human eye can detect, human radiologists cannot validate whether the prediction that the patient is at a high risk is correct or incorrect.

That’s why it’s important to take these tools and test them on many different patients from different hospitals, countries, ethnicities, and ages. That’s exactly what we have done. MIRAI, which is an image-based breast cancer risk assessment tool, has been tested with close to 2 million patients, and we see that the results are consistent, which gives us confidence about using it in the clinic.

How has AI use impacted breast cancer and lung cancer screening?

Barzilay: There are many retrospective and prospective trials already showing that you can be more accurate in cancer detection [with AI]. You can be more accurate in identifying patients who need to be screened. In more accurate screening modalities like MRI, we can determine the optimal screening duration for the patient. Unfortunately, none of these utilizations are part of the guidelines, and we are still using the same methodologies that were developed decades ago and [have been] extensively criticized.

For instance, lung cancer screening may omit many patients who will, in the future, develop lung cancer because current screening guidelines are [overly] focused on smoking. We see increasing lung cancer in patients who never smoked or were exposed to smoke, so we need to have a different way to identify these high-risk patients. Unfortunately, guidelines have not adjusted to this new reality.

What can be done to ensure that AI is equitable and fair in treating different patient populations?

Barzilay: There are lots of techniques to ensure that AI can process a diverse population. First of all, it depends on how the model was trained. Was it trained on diverse populations? Also, [it depends on] the testing. Was it tested on diverse populations? For a majority of tools that are currently used, you can test them in diverse populations and see that they work. But [when] talking about AI inequity, we have broadly documented the issues with inequity-rich tools that are currently in guidelines, and nobody seems to be eager to change this troublesome situation.

Is there anything else you would like to highlight regarding ongoing AI research in clinical practice or Learning to Cure?

Barzilay: I would like to emphasize that [one of the] main barriers to [AI integration] changing outcomes for patients is designing clinical protocols that can utilize these tools effectively. Like everything else, these are not miracle tools. They have a lot of power, but they also have limitations. Ideally, what would happen here is that clinicians would take these tools and develop a pipeline showing that when we bring these tools into patient care, they improve outcomes and reduce costs.

Another question relates to regulations and policies. It's interesting to see that many of the tools that the NCCN guidelines recommend are supposed to be FDA regulated, but they are not, which is an interesting situation. When it comes to AI tools, they have to be regulated. All the tools have to be regulated, but [many] are not, for historical reasons. We need to apply the same standards to all the tools used in patient care to provide patients with the best tools. Guidelines would play an important role in bringing this change.

Reference

MIRAI is redefining the future of breast cancer screening. Jameel Clinic. Accessed March 4, 2025. https://tinyurl.com/mpa3pwnd
