Welcome to Part 2 in our miniseries on Federated Learning!

You can find all the details of our conversation so far in Part 1. In the first installment, we traveled through the continuum of machine data, learned about the different flavors of Federated Learning and pondered on its added value in healthcare. Ultimately, we were left with a question: what is holding back the adoption of Federated Learning applications?

Now it’s time to address the still-untapped potential of Federated Learning.

 

How to build a Federated Learning framework?

The challenges ahead 

At first glance, we might be quick to assume that a cross-silo approach to Federated Learning is the easiest path to take implementation-wise: after all, we are dealing with a limited number of well-known, addressable edge systems, which are more powerful and reliable overall. However, this apparent ‘simplicity’ conceals another wide spectrum of issues to account for, whether from a business, data integration, security or platform perspective. Let’s dive right in.

Business challenges 

In a Federated Learning network, there is a risk that edge nodes may behave ‘selfishly’ in order to compromise between model accuracy and cost [1]. This delicate balance of risk and reward is intimately tied to the governance of the network itself and has many implications for what is commonly known as ‘health justice’. In the context of GenoMed4All, this theme crystallizes into how we define and enforce ‘equity’ among nodes in the network and our capability to properly adjust for discrepancies in overall performance and model accuracy when onboarding a ‘dissonant’ node. Anticipating these ‘dissonances’ in an FL network is key, since participants may not be evenly matched in terms of the resources –human and material alike– they are able to commit to this joint enterprise. On this point, the research community has already dedicated quite a lot of effort to finding out how we can maximize the benefit for each node with limited engagement: the answer seems to lie in the way we estimate both the motivation and contribution of our network nodes.

On the topic of motivation, we may ask ourselves: how do I reward participation for each edge node in a way that ensures the central server can maintain optimal quality? Putting in place incentive mechanisms that work for all participants involved is key, especially in such a highly heterogeneous environment. These incentives or rewards may take multiple forms, like accessing specific central services, benefitting from models without contributing to their training, the opportunity to launch a new training plan… you name it. For the estimation of a node’s contribution, however, a reward can only be fixed if the ‘value’ each node brings to the network can be adequately quantified, and this is not a straightforward exercise: it has to consider both dataset size and quality, and the result then needs to be correlated with the accuracy of the final model and updated with each training iteration [2].
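As a rough illustration, contribution estimation is often approached with marginal (‘leave-one-out’) or Shapley-style valuations. The sketch below scores each node by how much a global accuracy proxy drops when that node is removed; the client names, dataset figures and the saturating accuracy function are all hypothetical stand-ins, not GenoMed4All’s actual mechanism:

```python
# Sketch: leave-one-out contribution scoring for FL clients.
# `evaluate` is a toy stand-in for retraining and evaluating the global
# model; here it simply rewards dataset size weighted by a quality score.

def evaluate(clients):
    """Toy proxy for global model accuracy given a set of clients."""
    if not clients:
        return 0.0
    score = sum(c["samples"] * c["quality"] for c in clients)
    return score / (score + 1000)  # saturating accuracy proxy

def contributions(clients):
    """Marginal (leave-one-out) contribution of each client."""
    full = evaluate(clients)
    return {
        c["name"]: round(full - evaluate([o for o in clients if o is not c]), 4)
        for c in clients
    }

clients = [
    {"name": "hospital_A", "samples": 500, "quality": 0.9},
    {"name": "hospital_B", "samples": 200, "quality": 0.7},
    {"name": "hospital_C", "samples": 50,  "quality": 0.5},
]
scores = contributions(clients)
```

Even this toy version shows why the exercise is delicate: a node’s score depends on who else is in the network, so it must be recomputed as nodes join, leave or retrain.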

Data integration challenges 

When considering data usage, we must be mindful of how to onboard organizations operating across multiple geographic, political and regulatory scenarios –especially those dictating data protection regulations– to this FL network. The first barrier we must be aware of in terms of data integration is the minimum anonymized dataset that needs to be shared for the initial FL model to be correctly tested, developed and bootstrapped. Another significant roadblock is the set of access policies that govern dataset extraction at the edge and define what is and is not allowed in terms of data science operations on metadata and model alike. As of today, there is a marked interest in how to best integrate an authorization policy language for encoding these access policies with the technology required to enforce them.
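To make the idea of an encoded, enforceable access policy concrete, the snippet below sketches a minimal attribute-based check for edge-side operations. The roles, operation names and default-deny behavior are purely illustrative assumptions, not a specific policy language or the one used in GenoMed4All:

```python
# Sketch: a minimal attribute-based access policy for edge-side data
# operations. Roles and operation names are hypothetical.

policies = [
    {"role": "data_scientist", "operation": "read_metadata",   "allow": True},
    {"role": "data_scientist", "operation": "export_raw_data", "allow": False},
    {"role": "admin",          "operation": "export_raw_data", "allow": True},
]

def is_allowed(role, operation):
    """Look up the first matching rule; deny by default."""
    for p in policies:
        if p["role"] == role and p["operation"] == operation:
            return p["allow"]
    return False  # default deny: anything not explicitly granted is refused
```

The default-deny stance matters in practice: in a healthcare FL network, an operation that is not explicitly permitted on edge data should never silently succeed.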

Additionally, there is the ever-present matter of data quality, which the distributed nature of Federated Learning only aggravates [3]. From the qualifying and onboarding phases to integration to monitoring, it permeates the whole FL lifecycle. Well before onboarding new edge nodes –another hospital, for example– to our network, we should have in place a clear, auditable set of qualifying criteria (e.g. incentive model, hosting capabilities, training resources, available datasets…) that potential candidates are expected to meet in order to officially become nodes. This pre-selection step, though critical to the whole network performance, does not usually get the recognition it deserves, due to either monetary or time constraints.

Immediately after, during onboarding per se, data quality must be assessed again. It also comes into play when using and integrating a Common Data Model (CDM), since training algorithms with datasets from heterogeneous sources –like Electronic Health Records (EHRs)– has a negative impact on network maintenance and scalability, which can only be mitigated by enforcing a single CDM for the central and edge nodes [4]. For GenoMed4All, our CDM pick is FHIR (Fast Healthcare Interoperability Resources), a standard that defines how healthcare data may be exchanged between nodes regardless of how it is actually stored in those nodes. Compared to alternative standards, FHIR shows large (and growing) adoption rates among care providers and has sufficient support for genomic data representation, two key and decisive arguments in the context of GenoMed4All. However, the healthcare industry seems to be slowly but surely edging towards more fluid scenarios that favor the co-existence of a wide variety of CDM standards – for instance, the emerging openEHR standard. This trend would be especially relevant in federated ecosystems like GenoMed4All’s, since they intend to amalgamate an ever-growing, wildly heterogeneous landscape of hospitals under a single distributed umbrella.
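As an illustration of what exchanging data through a CDM looks like in practice, the sketch below maps a hypothetical flat EHR record to a minimal FHIR Observation resource. The EHR-side field names are invented for the example; the output structure follows the standard FHIR Observation layout:

```python
# Sketch: mapping a raw EHR row (hypothetical schema) to a minimal FHIR
# Observation resource, so every node exchanges data in the same shape.

def to_fhir_observation(ehr_row):
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": ehr_row["loinc_code"],
                "display": ehr_row["test_name"],
            }]
        },
        "subject": {"reference": f"Patient/{ehr_row['patient_id']}"},
        "valueQuantity": {
            "value": ehr_row["value"],
            "unit": ehr_row["unit"],
        },
    }

row = {"patient_id": "123", "loinc_code": "718-7",
       "test_name": "Hemoglobin", "value": 13.2, "unit": "g/dL"}
obs = to_fhir_observation(row)
```

The point of the exercise: however each hospital stores hemoglobin results internally, the network only ever sees the agreed FHIR shape.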

Monitoring data quality during training is also tricky: every time datasets are added to the network, there is a need to evaluate whether they are really up to standard, mainly to avoid entering a new training loop that ultimately pushes an updated but poorer-quality model back to the central server.
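One simple way to picture such an evaluation is a ‘quality gate’ that a candidate dataset must pass before it can trigger a new training round. The checks and thresholds below are illustrative assumptions, not GenoMed4All’s actual criteria:

```python
# Sketch: a quality gate applied before a new dataset enters a training
# round. Both checks (row count, missing-value ratio) are illustrative.

def quality_gate(dataset, min_rows=100, max_missing=0.1):
    """Return (accepted, reasons) for a candidate dataset."""
    reasons = []
    rows = dataset["rows"]
    if len(rows) < min_rows:
        reasons.append(f"too few rows: {len(rows)} < {min_rows}")
    missing = sum(1 for r in rows for v in r.values() if v is None)
    total = sum(len(r) for r in rows) or 1
    if missing / total > max_missing:
        reasons.append(f"missing-value ratio {missing / total:.2f} exceeds {max_missing}")
    return (not reasons, reasons)

good = {"rows": [{"age": 40, "hb": 13.1}] * 150}
bad = {"rows": [{"age": None, "hb": 13.1}] * 150}
```

Returning the reasons alongside the verdict is deliberate: a rejected dataset should come with an auditable explanation that the contributing node can act on.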

On top of this sizeable pile of issues lies the inescapable fact that we are operating in the healthcare realm, where challenges in data integration are always multi-faceted. New social determinants –linked to decision support, care pathways, medication…– and unconventional sources of information –social media, the Internet of Things– have started to permeate the way we look at and make sense of healthcare processes… and in turn, this heightened understanding has exposed a pressing need to outline and regulate data subject rights. As a result, the concept of ‘digital sovereignty’ has been coined to protect the individual’s right to autonomy in a predominantly digital world, and the EU has embraced this notion as the cornerstone of its strategy to usher in a new era of European digital leadership centered around ensuring citizens retain control over their personal data.

Security challenges 

In a cross-silo FL scenario, one of the most pressing issues currently under the spotlight is linked to data and client system security, or how to prevent information leaks during the multiple update iterations. Even if no data is exchanged between edge nodes and the central server, the model may still contain some patient-sensitive information in its parameters. The central server is normally the party best positioned to exploit this vulnerability, since it centralizes client updates and has more control over the FL process as a whole. Solutions to this problem rely on Secure Multiparty Computation (SMC) to aggregate updates or Differential Privacy (DP) to distort client updates locally. Additionally, we may also need to protect the central server against potential malicious attacks from the edge nodes: those aiming to compromise the convergence of the global model by either disrupting the training process or providing false updates [5]. For GenoMed4All, this is not as relevant an issue, since all partners participating as clients are considered trusted nodes in the network.
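To give a flavor of the DP side of this, the sketch below applies the common clip-then-add-Gaussian-noise recipe to a client update before it leaves the node. The clipping bound and noise multiplier are illustrative values, not a calibrated privacy budget:

```python
# Sketch: differentially-private sanitization of a client update, using
# gradient clipping followed by Gaussian noise (the Gaussian mechanism).
# `clip_norm` and `noise_multiplier` are illustrative, not calibrated.

import math
import random

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or random.Random(0)  # fixed seed only for reproducibility here
    # 1) clip the update to bound any single client's influence
    norm = math.sqrt(sum(w * w for w in update))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    clipped = [w * scale for w in update]
    # 2) add Gaussian noise calibrated to the clipping bound
    sigma = noise_multiplier * clip_norm
    return [w + rng.gauss(0.0, sigma) for w in clipped]

noisy = dp_sanitize([3.0, 4.0])  # input norm 5 is clipped to 1, then noised
```

The clipping step is what makes the noise meaningful: without a hard bound on each client’s contribution, no fixed noise level can mask an outlier update.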

Platform challenges 

The current landscape of Federated Learning platforms –and their features– paints a picture of highly heterogeneous, research-specific and not yet mature alternatives (e.g. Flower, Fed-BioMed, FATE, TensorFlow Federated, PySyft, Paddle) that are emerging as an unequivocal sign of all the excitement and interest surrounding FL. Key platform capabilities like configurability, robustness, scalability, performance, user experience… are non-negotiable for GenoMed4All’s ambition. After all, we are working on a production environment that intends to serve an ever-growing community with an increasing number of algorithms and use cases.

But how do we make this vision a reality? The problem is, modern workspace software environments are sorely missing a model federation dimension. Nowadays, AI platforms are mature enough to handle everything from data exploration, testing, pre-processing, transformation and feature engineering to model validation and deployment… and yet, they still have not figured out how to support a model federation approach. As a result, we are missing out on several fronts: first, on metadata exploration tools for data scientists to build their models and features on; and second, on workspaces with adequate debug, development and testing capabilities to handle models with longer lifecycles and incremental contributions from edge nodes [6]. The inevitable conclusion? Team productivity and efficiency are greatly impacted.

Another contender for top platform challenge in FL is data extraction. Data scientists follow complex workflows for model development, and data extraction plays a major role in the selection, (cohort) transformation and feature extraction steps. These operations must first be formalized by the platform so they can then be automatically reproduced on the edge nodes. For data scientists, a platform that provides easy-to-use tools to step away from manual configuration before jumping to model deployment is certainly a bonus. That is why we are taking care to integrate a flexible ETL (Extract, Transform, Load) tool –containing data cohort definitions linked to the target model– to configure data extraction and transformation steps from the CDM to the algorithms in GenoMed4All’s platform.
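A simplified picture of such a formalized extraction step: a declarative cohort definition that every edge node can reproduce locally. All field names, diagnosis codes and cohort criteria below are hypothetical, not GenoMed4All’s actual CDM schema:

```python
# Sketch: a declarative cohort definition applied identically on every
# edge node, turning CDM records into model-ready feature rows.

cohort = {
    "include": {"diagnosis": "myelodysplastic_syndrome"},
    "min_age": 18,
    "features": ["age", "hb", "blast_pct"],
}

def extract(records, cohort):
    """Select patients matching the cohort and project the feature columns."""
    out = []
    for r in records:
        if r["diagnosis"] != cohort["include"]["diagnosis"]:
            continue
        if r["age"] < cohort["min_age"]:
            continue
        out.append({f: r[f] for f in cohort["features"]})
    return out

records = [
    {"diagnosis": "myelodysplastic_syndrome", "age": 64, "hb": 9.8, "blast_pct": 4},
    {"diagnosis": "myelodysplastic_syndrome", "age": 17, "hb": 11.0, "blast_pct": 2},
    {"diagnosis": "aml", "age": 50, "hb": 8.1, "blast_pct": 30},
]
features = extract(records, cohort)
```

Because the cohort is data, not code, the platform can version it, audit it and ship it to every node, which is exactly the reproducibility property the ETL tool is meant to provide.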

All these challenges are represented in the scorecard below, described in the context of GenoMed4All and ranked in order of priority (i.e. we have marked with 3 stars those that we consider to be core challenges in the project).

 

The GenoMed4All project or why Federated Learning will serve rare disease research 

At GenoMed4All, we are building a Federated Learning platform where clinicians and researchers can work together in the definition, development, testing and validation of AI models to improve the way we currently diagnose and treat hematological diseases in the EU. We envision two complementary operational modes for this platform: a clinical mode, catering to the needs of healthcare professionals and patients in their daily practice; and a research mode, where data scientists can train and benchmark AI models from available data on hematological diseases.

For clinicians, GenoMed4All’s platform will act as a local decision support system to input new prospective and retrospective patient data, extracting insights from an ever-learning model. For researchers, GenoMed4All offers an AI sandbox to benchmark and train new AI models on real-world data and to ensure their clinical usability, a critical point that has so far hampered the real-world integration of AI applications in healthcare.

We believe that a radical shift in how we introduce these kinds of tools to a clinical setting is sorely needed to ensure their accountability, transparency and usefulness among healthcare professionals. Drawing a parallel to how we rely on solid pharmacovigilance processes to monitor adverse reactions and ultimately confirm a certain drug is safe for use, we can certainly envision a similar clinical validation flow for these tools, one that undergoes the same level of scrutiny and meets the required standards for performance excellence in a clinical setting.

 

All in all, we have seen that Federated Learning is indeed an emerging technology that is still finding its footing within the research field. The cross-silo approach we have followed does provide a number of unquestionably attractive capabilities for AI applications in the clinical research space, namely those in the data privacy domain. However, several challenges lurk on the horizon… and must be addressed before this approach can finally become mainstream practice in the healthcare industry, so that Federated Learning can effectively deliver on all the promises we have navigated through in this miniseries.

In this research space, GenoMed4All plays a pioneering role as it explores the large spectrum of issues raised by Federated Learning in healthcare: from platform technology selection and development all the way to defining the full data flow and Common Data Model, security, privacy and an end-to-end operational model. This close collaboration environment, spearheaded by multiple care providers in Europe, leading-edge research institutions and recognized industrial partners (meet our stellar team here!), is our core strength to pave the way forward and deliver on new innovation opportunities.

If you enjoyed this miniseries on Federated Learning, stay tuned for future Knowledge Pills!

 


Missed anything? Check out these references!

 


This knowledge pill was created by Vincent Planat (DEDALUS), Francesco Cremonesi (DATAWIZARD) and Diana López (AUSTRALO) from the GenoMed4All consortium
Photo by Milad Fakurian on Unsplash