Learn how GE Healthcare used AWS to build a new AI model that interprets MRIs

MT HANNACH



MRI images are naturally complex and data-intensive.

For this reason, developers training large language models (LLMs) for MRI analysis have had to slice the captured images into 2D. But this yields only an approximation of the original image, limiting the model’s ability to analyze complex anatomical structures. That creates challenges in difficult cases involving brain tumors, skeletal disorders or cardiovascular diseases.

But GE Healthcare appears to have overcome this enormous obstacle, introducing the industry’s first research foundation model (FM) for full-body 3D MRI at this year’s AWS re:Invent. For the first time, models can use full 3D images of the entire body.

GE Healthcare’s FM was built on AWS from the ground up (there are very few models designed specifically for medical imaging like MRI) and was trained on more than 173,000 images from over 19,000 studies. The developers say they were able to train the model with five times less compute than was previously required.

GE Healthcare has not yet commercialized the foundation model; it is still in an evolving research phase. One of the first evaluators, Mass General Brigham, is expected to begin experimenting with it soon.

“Our vision is to put these models in the hands of technical teams working in health systems, providing them with powerful tools to develop research and clinical applications faster and more cost-effectively,” Parry Bhatia, chief AI officer at GE HealthCare, told VentureBeat.

Enable real-time analysis of complex 3D MRI data

Although a revolutionary development, generative AI and LLMs are not new territory for the company. The team has been working with cutting-edge technologies for over 10 years, Bhatia said.

One of its flagship products is AIR Recon DL, a deep learning-based reconstruction algorithm that lets radiologists obtain sharp images faster. The algorithm removes noise from raw images and improves the signal-to-noise ratio, reducing scan times by up to 50%. Since 2020, 34 million patients have been scanned with AIR Recon DL.

GE Healthcare began work on its MRI FM in early 2024. Since the model is multimodal, it can support image-to-text search, link images and words, and segment and classify diseases. The goal is to give healthcare professionals more detail in a single analysis than ever before, Bhatia said, allowing for faster, more accurate diagnosis and treatment.

“The model has significant potential to enable real-time analysis of 3D MRI data, which can improve medical procedures such as biopsies, radiotherapy and robotic surgery,” Dan Sheeran, general manager of healthcare and life sciences at AWS, told VentureBeat.

It has already outperformed other publicly available research models in tasks such as classifying prostate cancer and Alzheimer’s disease. It showed up to 30% accuracy in matching MRI scans to text descriptions during image retrieval, which may not sound that impressive, but it is a big improvement over the 3% shown by similar models.

“It’s gotten to a point where it’s showing really solid results,” Bhatia said. “The implications are enormous.”

Do more with (much less) data

The MRI process requires different types of datasets to support various techniques for mapping the human body, Bhatia explained.

What’s called a T1-weighted imaging technique, for example, highlights fatty tissue and decreases the water signal, while T2-weighted imaging enhances water signals. The two methods are complementary and create a complete image of the brain to help clinicians detect abnormalities such as tumors, trauma or cancer.

“MRI images come in all different shapes and sizes, the same way you would have books in different formats and sizes, right?” Bhatia said.

To overcome the challenges presented by diverse datasets, the developers introduced a “resize and adapt” strategy so that the model can process and react to different variations. Additionally, data may be missing in certain areas (an image may be incomplete, for example), so they taught the model to simply ignore these instances.

“Instead of getting stuck, we taught the model to ignore the gaps and focus on what was available,” Bhatia said. “Think of it as solving a puzzle with a few pieces missing.”
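GE Healthcare hasn’t published its training code, but one common way to implement “ignore the gaps” is a masked loss: the error is computed only over voxels that were actually acquired, so missing regions contribute nothing. A minimal sketch over a flattened toy volume (all names and values invented for illustration):

```python
def masked_mse(pred, target, mask):
    """MSE over valid voxels only; gaps (mask == 0) are skipped entirely,
    so the model is never penalized for regions that were never acquired."""
    pairs = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not pairs:          # nothing valid in this sample: zero loss
        return 0.0
    return sum((p - t) ** 2 for p, t in pairs) / len(pairs)

# Flattened toy volume: the last two voxels were never acquired
target = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
pred   = [1.5, 1.5, 1.5, 1.5, 9.9, 9.9]  # wild values in the gap don't matter
mask   = [1, 1, 1, 1, 0, 0]

print(masked_mse(pred, target, mask))  # → 0.25
```

The same idea scales to full 3D tensors in a deep learning framework: multiply the per-voxel loss by the mask and normalize by the mask sum.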

The developers also used teacher-student semi-supervised learning, which is particularly useful when data is limited. With this method, two neural networks are trained on labeled and unlabeled data, with the teacher creating labels that help the student learn and predict future labels.

“We’re now using a lot of these self-supervised technologies, which don’t require huge amounts of data or labels to train large models,” Bhatia said. “This reduces dependencies, allowing more to be learned from these raw images than in the past.”
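The core teacher-student loop is simple even if the real models are not: a teacher trained on the small labeled pool pseudo-labels the unlabeled data, and the student then trains on the union. A toy 1-D sketch (the threshold “models” and all data points are invented stand-ins for the actual neural networks):

```python
# Toy teacher-student loop: the teacher pseudo-labels unlabeled points,
# then the student fits a 1-D decision threshold to labeled + pseudo data.
labeled   = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]   # (feature, label)
unlabeled = [0.15, 0.3, 0.7, 0.85]                      # no labels available

def teacher(x, threshold=0.5):
    """Stand-in for a model already trained on the labeled pool."""
    return 1 if x > threshold else 0

# Step 1: the teacher creates labels for the unlabeled data
pseudo = [(x, teacher(x)) for x in unlabeled]

# Step 2: the student trains on labeled + pseudo-labeled data
# (here: threshold at the midpoint between the two class means)
data = labeled + pseudo
m0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
m1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
student_threshold = (m0 + m1) / 2

def student(x):
    return 1 if x > student_threshold else 0
```

In practice the student is a deep network trained on soft teacher outputs, but the flow of labels from teacher to student is the same.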

This helps ensure that the model works well in hospitals with fewer resources, older machines and different types of data sets, Bhatia explained.

He also highlighted the importance of multimodal models. “In the past, a lot of technology was unimodal,” Bhatia said. “It would focus only on the image or the text. But now models are becoming multimodal; they can go from image to text and from text to image, so you can integrate a lot of things that were done with separate models in the past and really unify the workflow.”

He stressed that researchers only use datasets to which they have rights; GE Healthcare has partners who authorize anonymized data sets and ensure compliance standards and policies.

Use AWS SageMaker to Solve Compute and Data Challenges

There is no doubt that creating such sophisticated models presents many challenges, such as limited computing power for 3D images that are several gigabytes in size.

“This is a huge volume of 3D data,” Bhatia said. “You have to put it into the model’s memory, which is a really complex problem.”

To help overcome this problem, GE Healthcare relied on Amazon SageMaker, which provides high-throughput networking and distributed training capabilities across multiple GPUs, and leveraged Nvidia A100 Tensor Core GPUs for large-scale training.

“Because of the size of the data and the size of the models, they can’t send it into a single GPU,” Bhatia explained. SageMaker allowed them to customize and scale operations across multiple GPUs that could interact with each other.
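The article doesn’t describe GE Healthcare’s exact sharding scheme, but the usual way to fit a multi-gigabyte volume across GPUs is to tile it into fixed-size 3D patches and distribute those patches among workers. A hypothetical sketch of that bookkeeping (shapes and worker count invented):

```python
from itertools import product

def patch_coords(shape, patch):
    """Tile a 3D volume into non-overlapping patch origin coordinates."""
    return list(product(*(range(0, s, p) for s, p in zip(shape, patch))))

def shard(items, n_workers):
    """Round-robin assignment of patches to workers (data parallelism)."""
    return [items[i::n_workers] for i in range(n_workers)]

coords = patch_coords((256, 256, 128), (64, 64, 64))   # 4 * 4 * 2 = 32 patches
shards = shard(coords, n_workers=8)                    # one shard per GPU
```

Each worker then loads only its own patches into GPU memory, and gradients are synchronized across workers, which is the kind of distributed training SageMaker orchestrates.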

Developers also used Amazon FSx with Amazon S3 object storage, which enabled faster reading and writing of datasets.

Bhatia pointed out that another challenge is cost optimization: using Amazon’s Elastic Compute Cloud (EC2), developers were able to move unused or rarely used data to less expensive storage tiers.
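On the storage side, this kind of tiering is commonly expressed on AWS as an S3 lifecycle rule that transitions cold objects to a cheaper storage class. A hypothetical sketch (bucket name, prefix and thresholds are invented, not GE Healthcare’s actual configuration); the rule dict matches the shape boto3’s `put_bucket_lifecycle_configuration` expects:

```python
# Hypothetical S3 lifecycle rule: objects under raw-scans/ untouched for
# 90 days move to the cheaper Glacier Instant Retrieval storage class.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-cold-mri-data",
            "Filter": {"Prefix": "raw-scans/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER_IR"},
            ],
        }
    ]
}

# With AWS credentials in place, this would be applied as:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-imaging-bucket", LifecycleConfiguration=lifecycle)
```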

“Leveraging SageMaker to train these large models – primarily for efficient, distributed training across multiple high-performance GPU clusters – was one of the critical components that really helped us move faster,” Bhatia said.

He emphasized that all components were built with data integrity and compliance in mind, taking into account HIPAA and other regulatory frameworks.

Ultimately, “these technologies can really streamline and help us innovate faster, as well as improve overall operational efficiency by reducing administrative burden and ultimately improve patient care, because you are now providing more personalized care.”

Serve as a basis for other fine-tuned, specialized models

Although the model is specific to the field of MRI for now, the researchers see excellent opportunities to expand to other areas of medicine.

Sheeran pointed out that historically, AI in medical imaging has been limited by the need to develop custom models for specific conditions in specific organs, requiring expert annotation for each image used in training.

But this approach is “inherently limited” due to the different ways in which diseases manifest from one individual to another, and presents problems of generalizability.

“What we really need are thousands of such models and the ability to quickly create new ones as we encounter new information,” he said. High-quality labeled datasets for each model are also essential.

Now, with generative AI, instead of training discrete models for each disease/organ combination, developers can pre-train a single foundation model that serves as the basis for other specialized models fine-tuned downstream.

For example, GE Healthcare’s model could be extended to areas such as radiotherapy, where radiologists spend a lot of time manually marking organs that may be at risk. It could also help reduce analysis time during X-rays and other procedures that currently require patients to remain in a machine for long periods, Bhatia said.

Sheeran marveled that “we are not only expanding access to medical imaging data through cloud-based tools; we’re changing how this data can be used to advance AI in healthcare.”
