How Federated Learning Could Transform Healthcare
It’s an understatement to say that doctors are swamped right now. At the beginning of April, coronavirus patients had filled New York emergency rooms so thoroughly that doctors across specialties, including dermatologists and orthopedists, had to help out.
Short-term, doctors need reliable, proven technology, like N95 masks. Longer-term, though, machine learning algorithms could help doctors treat patients. These algorithms can function as hyper-specialized doctor’s assistants, performing key technical tasks like scanning an MRI for signs of brain cancer, or flagging pathology slides that show breast cancer has metastasized to the lymph nodes.
One day, an algorithm could check CT scans for the lung lesions and abnormalities that indicate coronavirus.
“That’s a model that could be trained,” Mona G. Flores, MD, global head of medical AI at NVIDIA, told Built In.
At least, it could be trained in theory. Training an algorithm fit for a clinical setting requires a large, diverse data set. That’s hard to achieve in practice, especially when it comes to medical imaging. In the United States, HIPAA regulations make it very difficult for hospitals to share patient scans, even anonymized ones; privacy is a top priority at medical institutions.
That’s not to say trained algorithms haven’t made it into clinical settings. A handful have passed muster with the Food and Drug Administration, according to Dr. Spyridon Bakas, a professor at the University of Pennsylvania’s Center for Biomedical Image Computing and Analytics.
In radiology, for instance, algorithms help some doctors track tumor size and progression, along with “things that cannot be seen with the naked eye,” Bakas told Built In — like “where the tumor will recur, and when.”
If algorithms could train on data without puncturing its HIPAA-mandated privacy, though, machine learning could have a much bigger impact on healthcare.
And that’s actually possible, thanks to a new algorithm training technique: federated learning.
What Is Federated Learning?
Federated learning is a way of training machine learning algorithms on private, fragmented data, stored on a variety of servers and devices. Instead of pooling their data, participating institutions all train the same algorithm on their in-house, proprietary data. Then they pool their trained algorithm parameters — not their data — on a central server, which aggregates all their contributions into a new, composite algorithm. This composite gets shipped back to each participating institution for more training, and then shipped back to the central server for more aggregation.
Eventually, all the individual institutions’ algorithms converge on an optimal, trained algorithm, more generally applicable than any one institution’s would have been — and nearly identical to the model that would have arisen from training the algorithm on pooled data.
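To make that loop concrete, here is a minimal sketch in plain Python. It is a toy, not any vendor's implementation: the three "sites" and their data are hypothetical, the model is a single weight fitting y = w * x, and the server aggregates with a simple unweighted average, which is only one of several possible choices.

```python
# Toy federated training: each site fits y = w * x on its own private data.
# Only the trained weight w travels to the server; the data never leaves the site.

def local_train(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(w_global, sites):
    """Ship the current model out, train locally, then average the results."""
    local_weights = [local_train(w_global, data) for data in sites]
    return sum(local_weights) / len(local_weights)  # assumed: plain average

# Hypothetical private data sets; each roughly follows y = 3x.
sites = [
    [(1.0, 3.1), (2.0, 5.9)],
    [(1.5, 4.4), (3.0, 9.2)],
    [(0.5, 1.6), (2.5, 7.4)],
]

w_federated = 0.0
for _ in range(50):  # 50 rounds of ship-out, train, aggregate
    w_federated = federated_round(w_federated, sites)

# For comparison only: the same model trained on pooled data.
pooled = [pair for data in sites for pair in data]
w_pooled = local_train(0.0, pooled, epochs=250)

print(w_federated, w_pooled)
```

In this toy setup the federated weight and the pooled weight land very close to each other, which mirrors the claim above. Real projects train far larger models, and how closely the two match depends on how different the sites' data sets are.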
A Short History of Federated Learning
In December of 2019, at a radiology conference in Chicago, NVIDIA unveiled a new feature for Clara SDK. This software development kit, created expressly for the healthcare field, helps medical institutions make and deploy machine learning models with “a set of tools and libraries and examples,” Flores said.
The new tool was Clara Federated Learning — infrastructure that allowed medical institutions to collaborate on machine learning projects without sharing patient data.
NVIDIA’s not the only tech company embracing federated learning. Another medical AI company, Owkin, has rolled out a software stack for federated learning called Owkin Connect, which integrates with NVIDIA’s Clara. Meanwhile, at least two general-purpose federated learning frameworks have also rolled out recently: Google’s TensorFlow Federated and the open-source PySyft.
The concept of federated learning, though, dates back several years. Like many innovations, it was born at Google.
Federated Learning Started to Improve Search Suggestions
In 2017, Google researchers published a paper on a new technique they hoped could improve search suggestions on Gboard, the digital keyboard on Android phones. It was the first paper on federated learning.
In a blog post, Google AI research scientists Brendan McMahan and Daniel Ramage explained the very first federated learning use case like this:
When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard’s query suggestion model.
In other words, by blending edge computing and machine learning, federated learning offered a way to constantly improve the global query suggestion model without tracking users’ every move in a central database. It also allowed Google to streamline its data collection process, an essential given the Android OS’s more than two billion active users.
Query suggestions are just one of many potential applications, though. Bakas saw another in medical imaging. This should come as no surprise: Bakas was the lead organizer of the BraTS challenge.
The Challenge With Data Science Challenges
Since 2012, the BraTS challenge — an annual data science competition — has asked competitors to train algorithms to spot signs of brain tumors, specifically gliomas, on MRIs. All the competing teams use the same benchmark dataset to train, validate and test their algorithms.
In 2018, that data set consisted of about 2,000 MRIs from roughly 500 patients, pulled from 10 different medical institutions, Bakas said.
Now, this is a tiny fraction of the MRIs in the world relevant to the BraTS contest; about 20,000 people per year get diagnosed with gliomas in the U.S. alone. But obtaining medical images for a competition data set is tricky. For one, it requires the patient’s consent. For another, it requires approval from the contributing hospital’s institutional review board, which involves proving the competition serves the greater good.
The BraTS challenge is just one of many data science challenges that navigate labyrinthine bureaucracy to compile data sets of medical images.
Major companies rely on these data sets too; they’re more robust than what even Google could easily amass on its own. Google’s LYNA, a machine learning algorithm that can pinpoint signs of metastatic breast cancer in the lymph nodes, first made headlines by parsing the images from the 2016 ISBI Camelyon challenge’s data set more than 10 percent more accurately than the contest’s original winner. NVIDIA, meanwhile, sent a team to the 2018 BraTS challenge — and won.
Even challenge-winning algorithms, though — or the algorithms that beat the winning algorithms — aren’t ready for clinical use. Google’s LYNA remains in development. Despite 2018 headlines touting it as “better than humans in detecting advanced breast cancer,” it still needs more testing.
“[A]n accurate algorithm alone is insufficient to improve pathologists’ workflow or improve outcomes for breast cancer patients,” Google researchers Martin Stumpe and Craig Mermel wrote on the Google AI blog.
For one thing, it was trained to read one slide per patient — but in a real, clinical setting, doctors look at multiple slides per patient.
For another, accuracy in a challenge context doesn’t always mean real-world accuracy. Challenge data sets are small, and biased by the fact that every patient consented to share their data. Before clinical use, even a stellar algorithm may need to train on more data.
Like, much more data.
Federated Learning Meets BraTS MRIs
Federated learning, Bakas saw, could allow powerful algorithms access to massive stores of data. But how well did it work? In other words, could federated learning train an algorithm as accurate as one trained on pooled data? In 2018, he and a team of researchers from Intel published a paper on exactly that.
“No one before has attempted to apply federated learning in medicine,” he said.
He and his co-authors trained an off-the-shelf, basic algorithm on BraTS 2018 MRI images using four different techniques. One was traditional machine learning, using pooled data; another was federated learning; the other two techniques were alternate “collaborative learning” techniques that, like federated learning, involved training an algorithm on a fragmented data set.
“We were not married to federated learning,” Bakas said.
It emerged as a clear success story in their research, though — the best technique for melding AI with HIPAA-mandated data privacy. In terms of accuracy, the algorithm trained via federated learning was second only to the algorithm trained on conventional, pooled data. (The difference was subtle too; the federated learning algorithm was 99 percent as accurate as the traditional one.) Federated learning also made all the different institutions’ algorithms converge more neatly on an optimal model than other collaborative learning techniques.
Once Bakas and his co-authors validated the concept of federated learning, a team of NVIDIA researchers elaborated on it further, Bakas explained. Their focus was fusing it with even more ironclad privacy technology. Though federated learning never involves pooling patient data, it does involve pooling algorithms trained on patient data — and hackers could, hypothetically, reconstruct the original data from the trained algorithms.
NVIDIA found a way to prevent this with a blend of encryption and differential privacy. The reinvented model aggregation process involves “transferring only partial weights ... so that people cannot reconstruct the data,” Flores said.
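The idea of sharing only partial, perturbed weights can be sketched roughly as follows. This is an illustration under stated assumptions, not NVIDIA's actual method: the function name `privatize_update`, the share fraction, the clipping bound and the noise scale are all hypothetical. The noise here is not calibrated to any formal differential privacy guarantee, and the encryption step is omitted.

```python
import random

def privatize_update(update, share_fraction=0.5, clip=1.0, noise_scale=0.1, rng=None):
    """Return a partial, noised copy of a site's weight update.

    Hypothetical helper: keeps only the largest-magnitude entries,
    clips each one to [-clip, clip], then adds Gaussian noise.
    """
    rng = rng or random.Random()
    k = max(1, int(len(update) * share_fraction))
    # Indices of the k entries with the largest absolute value.
    top = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    shared = {}
    for i in top:
        clipped = max(-clip, min(clip, update[i]))
        shared[i] = clipped + rng.gauss(0.0, noise_scale)
    return shared  # maps weight index -> noised value; other indices are withheld

# Hypothetical weight update from one hospital's local training run.
update = [0.02, -1.7, 0.4, 0.0, 0.9, -0.05]
shared = privatize_update(update, rng=random.Random(0))
print(sorted(shared))  # only these weight indices are sent to the server
```

The server then aggregates whatever entries each site chose to share. Withholding most of the weights and blurring the rest makes it harder to work backward from a model to the scans it was trained on, at some cost in accuracy.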
It’s worth noting that NVIDIA’s paper, like the one Bakas co-authored, relied on the BraTS 2018 data set. This was largely a matter of practicality, but the link between data science competitions and federated learning could grow more substantive.
In the long term, Bakas sees data science competitions “facilitating algorithmic development”; thanks to common data sets and performance metrics, these contests help identify top-tier machine learning algorithms. The winners can then progress to federated learning projects and train on much bigger data sets.
In other words, federated learning projects won’t replace data science competitions. Instead, they will function as a kind of major league for competition-winning algorithms to play in — and they’ll improve the odds of useful algorithms making it into clinical settings.
“The end goal is really to reach to the clinic,” Bakas said, “to help the radiologist [and] to help the clinician do their work more efficiently.”
What’s Next for Federated Learning?
Short answer: a lot. Federated learning is still a new approach to machine learning — Clara FL, let’s remember, debuted less than six months ago — and researchers continue to work out the kinks.
So far, NVIDIA’s team has learned that clear, shared data protocols play a key role in federated learning projects.
“You have to make sure that the data to each of the sites is labeled in the same fashion,” Flores said, “so that you're comparing apples to apples.”
Open questions remain, though. For instance — when a central server aggregates a group of trained algorithms, how should it do that? It’s not as straightforward as taking a mathematical average, because each institution’s data set is different in terms of size, underlying population demographics and other factors.
“Which ones do you give more weight to than others?” Flores said. “There are many different ways of aggregating the data.... That’s something that we are still researching.”
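One commonly used answer, shown here purely as an illustration, is to weight each site's parameters by the number of training examples it holds. The sketch below compares that with a plain average. The parameter values and site sizes are made up, and nothing here is meant to represent the scheme NVIDIA's researchers settled on.

```python
def weighted_average(site_params, site_sizes):
    """Average parameter vectors, weighting each site by its data set size."""
    total = sum(site_sizes)
    n_params = len(site_params[0])
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical two-parameter models from a small site and a large site.
site_params = [[1.0, 0.0], [3.0, 4.0]]
site_sizes = [100, 300]  # number of training scans at each site

plain = [sum(column) / len(column) for column in zip(*site_params)]
weighted = weighted_average(site_params, site_sizes)

print(plain)     # every site counts equally
print(weighted)  # the larger site pulls the result toward its parameters
```

Size-based weighting favors the institutions with the most data, which may or may not be desirable when sites differ in patient demographics or scanner hardware. That trade-off is part of what remains open.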
Leapfrogging to Other Industries ...
Federated learning has major potential, though, especially in Europe, where privacy regulations have already tightened due to the General Data Protection Regulation. The law, which went into effect back in 2018, is the self-proclaimed “toughest privacy and security law in the world” — so stringent, Bakas noted, that it would prevent hospitals from contributing patient data to the BraTS challenge, even if the individual patients consented.
So far, the U.S. hasn’t cracked down quite as heavily on privacy as the European Union has, but federated learning could still transform industries where privacy is paramount. Already, banks can train machine learning models to recognize signs of fraud, using in-house data; however, if each bank trains its own model, the arrangement benefits big banks, which have more data, and leaves small banks vulnerable.
“While individual banks may like this outcome, it is less than ideal for solving the social issue of money laundering,” writes B Capital venture capitalist Mike Fernandez.
Federated learning could even the playing field, allowing banks of all sizes to contribute to a global fraud detection model trained on more data than any one bank could amass, all while maintaining their clients’ privacy.
Federated learning could apply to other industries too. As browsers like Mozilla Firefox and Google Chrome phase out third-party cookies, “federated learning of cohorts” could become a way of targeting digital ads to groups of like-minded users, while still keeping individual browser histories private. Federated learning could also allow self-driving cars to share the locations of potholes and other road hazards without sharing, say, their exact current location.
... but Not World Domination, Exactly
One thing Bakas doesn’t see federated learning doing, even in the distant future: automating away doctors. Instead, he sees it freeing up doctors to do what they do best, whether that’s connecting with patients or treating novel and complex ailments with innovative treatments. Doctors have already dreamed up creative approaches to the coronavirus, like repurposing massage mattresses designed for pregnant women to boost coronavirus patients’ oxygen levels.
They just don’t really excel at scanning medical imaging and diagnosing common, well-documented ailments, like gliomas or metastatic breast cancer.
“They can identify something that is already flaring up on a scan,” Bakas said, “but there are some ambiguous areas that radiologists are uncertain about.”
Machine learning algorithms, too, often make mistakes about these areas. At first. But over time, they can learn to make fewer, spotting patterns in positive cases invisible to the human eye.
This is why they complement doctors so powerfully — they can see routine medical protocols in a fresh, robotic way. That may sound like an oxymoron, but it’s not necessarily one anymore.