U.S. healthcare organizations start assessing generative AI tools

Tuesday, December 3, 2024

The availability of healthcare AI solutions has been growing rapidly for more than a year. Besides solutions developed with, by, and/or specifically for healthcare institutions, the supply of AI tools from (big) tech companies such as Google, Microsoft, and Amazon is also growing. Those companies have been racing to bring generative AI tools for the healthcare sector to market since 2022. But how, as a healthcare organization, do you decide which AI tool is best suited for your organization? How do you compare different solutions? That is becoming an increasing challenge, and several U.S. healthcare organizations are now starting to address it.

For healthcare organizations, then, evaluating these different AI tools and making the right choice is becoming increasingly difficult. Which tools are best suited for our organization? And which tools are simply not good enough? To answer those questions and help healthcare organizations choose, several U.S. healthcare organizations, including Mass General Brigham and Emory, are teaming up. To that end, they recently launched the Healthcare AI Challenge Collaborative.

Generative AI tools to be tested and assessed

Within this collaboration, physicians from participating healthcare institutions can test the latest AI solutions in simulated clinical environments. Physicians will pit models against each other in head-to-head comparisons and, at the end of the year, publish a ranking of the commercial tools they tested.
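The article does not say how the consortium will turn individual head-to-head judgments into a ranking. One common way to aggregate pairwise preferences into a leaderboard is an Elo-style rating; the sketch below is purely illustrative, and the model names and match log are invented, not the Challenge's actual data or method.

```python
# Illustrative sketch only: aggregating clinicians' pairwise "model A vs.
# model B" judgments into a ranking with Elo-style ratings. The Healthcare
# AI Challenge has not published its scoring method; all names are invented.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift both ratings after one pairwise judgment."""
    e_winner = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - e_winner)
    ratings[winner] += delta
    ratings[loser] -= delta

# Hypothetical match log: (preferred model, other model), one entry per vote.
matches = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]

ratings = {m: 1000.0 for m in ("model_a", "model_b", "model_c")}
for winner, loser in matches:
    update(ratings, winner, loser)

# Sort models by rating, best first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

With this toy log, model_a wins every comparison and ends up on top; in practice a real evaluation would also need to handle ties, per-use-case criteria (accuracy vs. readability), and many more votes per pair.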

“The rate at which AI innovations for healthcare are being launched continues to increase. This unprecedented growth leads clinicians to struggle with determining the effectiveness of these innovations in terms of safely delivering value to healthcare providers and our patients. The Healthcare AI Challenge is a collective response to the complexities involved in advancing the responsible development and application of AI in healthcare. This new approach aims to put clinicians at the helm so they can evaluate the utility of different AI technologies and ultimately determine which solutions best meet and improve patient care,” says Keith Dreyer, DO, PhD, chief data science officer at Mass General Brigham in Boston.

Clinicians will evaluate the models on tasks such as generating draft reports, identifying key findings, and proposing differential diagnoses. The criteria for evaluating the models are subject to change, depending in part on the tool's clinical use case. The accuracy of the AI tool will always weigh heavily, but in some conceivable cases, such as when the tool is used to produce a text report, readability may matter more. “Some of those cases are very subjective. Like, do I feel like the style in which this text is presented is more readable or accessible to patients?” says Richard Bruce, associate professor of radiology and vice chair of informatics at the University of Wisconsin School of Medicine and Public Health.

A ranking of AI tools

Eventually, the partnership will publish a “ranking” of AI tools. That ranking will be used to give technology companies feedback and to help health systems shop for technology: healthcare institutions can consult the list when deciding which AI tool to purchase.

“Healthcare institutions can use the transparent rankings to inform decision-making and establish benchmark standards. Insights and best practices from the consortium can also be adopted by non-participating healthcare systems,” says Dushyant Sahani, professor and chair of the department of radiology at the University of Washington.

Frameworks for evaluation and guidelines

Despite the rapid spread of AI in healthcare, the industry has been slow to agree on how to assess quality. Industry attempts to roll out evaluation frameworks and guidelines have so far not progressed beyond the concept stage. The Healthcare AI Challenge Collaborative should change that.

Without standardized evaluation methods, it is difficult to compare even the most similar tools. “Are there [common] metrics that directly compare them? As far as I know, tools are not currently compared directly to each other, aside from user surveys and anecdotes. There is no easy way to compare apples to apples,” says Bruce.

So far, Emory Healthcare, the radiology departments at the University of Wisconsin School of Medicine and Public Health and the University of Washington School of Medicine, and the American College of Radiology, an industry group, are participating in the collaboration. Mass General Brigham plans to expand the partnership further.