Combining Gene Mutation with Transcriptomic Data Improves Outcome Prediction in Myelodysplastic Syndromes

Abstract

Myelodysplastic syndromes (MDS) are myeloid neoplasms characterized by peripheral blood cytopenias and risk of progression to acute myeloid leukemia (AML). Disease management is challenged by heterogeneity in clinical course and survival probability. Recently, the integration of genomic screening into patient assessment (through the Molecular International Prognostic Scoring System, IPSS-M) has resulted in a significant improvement in the prediction of clinical outcomes compared with the conventional prognostic score (Revised IPSS, IPSS-R). Many of the consequences of genetic and cytogenetic alterations affect gene expression by means of transcriptional and epigenetic instability and altered microenvironmental signaling. The aim of this project, conducted by the GenoMed4All and Synthema EU consortia, is to link genomic information with transcriptomic data to potentially improve the prediction of clinical outcomes in MDS patients.


Data-Driven Harmonization of 2022 WHO and ICC Classifications of Myelodysplastic Syndromes/Neoplasms (MDS): A Study By the International Consortium for MDS (icMDS)

Abstract

The inclusion of gene mutations and chromosomal abnormalities in the 2022 WHO and ICC classifications of MDS has enhanced diagnostic precision and is expected to improve the clinical decision-making process. Although these two systems share similarities, clinically relevant discrepancies still exist and may cause inconsistency in their adoption in a clinical setting. In this study on behalf of the International Consortium for MDS (icMDS), we adopted a data-driven approach to provide a harmonization roadmap between the 2022 WHO and ICC classifications for MDS. A modified Delphi Process consensus approach is currently ongoing among icMDS experts to finalize a harmonized MDS classification scheme.


Federated learning for causal inference using deep generative disentangled models

Abstract

In the context of decentralized and privacy-constrained healthcare data settings, we introduce an innovative approach to estimate individual treatment effects (ITE) via federated learning. Emphasizing the critical importance of data privacy in healthcare, especially when drawing on data from various global hospitals, we address challenges arising from data scarcity and specific treatment assignment criteria influenced by the availability of the medication of interest. Our methodology uses federated learning applied to neural network-based generative causal inference models to bridge the gap between decentralized and centralized ITE estimation on a benchmark dataset.
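
As a rough illustration of how models trained at separate hospitals could be combined without sharing patient records, the sketch below implements a sample-size-weighted average of model parameters. The abstract does not state which aggregation rule is used, so federated averaging is assumed here purely for illustration; the function name `federated_average`, the parameter vectors, and the cohort sizes are all hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighting each by its sample count.

    Assumption: a FedAvg-style rule; the source does not name its aggregator.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")
    # Only parameters and counts leave each site; raw records never do.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two hospitals with different cohort sizes.
hospital_params = [[0.0, 1.0], [1.0, 3.0]]
hospital_sizes = [100, 300]
global_params = federated_average(hospital_params, hospital_sizes)
```

In a real setting each parameter vector would be the flattened weights of a locally trained network, and this aggregation step would be repeated over several communication rounds.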


Opportunities and Challenges of Synthetic Data Generation in Oncology

Abstract

Widespread interest in artificial intelligence (AI) in health care has focused mainly on deductive systems that analyze available real-world data to discover patterns not otherwise visible. Generative adversarial networks, a new type of inductive AI, have recently evolved to generate high-fidelity virtual synthetic data (SD) after training on relatively limited real-world information. The AI system is fed a collection of real data, and it learns to generate new augmented data while maintaining the general characteristics of the original data set. The use of SD to enhance clinical research and protect patient privacy has drawn considerable interest in medicine and in the complex field of oncology. This article summarizes the main characteristics of this innovative technology and critically discusses how it can be used to accelerate data access for secondary purposes, providing an overview of the opportunities and challenges of SD generation for clinical cancer research and health care.


Sickle cell disease landscape and challenges in the EU: the ERN-EuroBloodNet perspective

Abstract

Sickle cell disease is a hereditary multiorgan disease that is considered rare in the EU. In 2017, the Rare Diseases Plan was implemented within the EU and 24 European Reference Networks (ERNs) were created, including the ERN on Rare Haematological Diseases (ERN-EuroBloodNet), dedicated to rare haematological diseases. This EU initiative has made it possible to strengthen existing collaborations and create new ones. The project also made it possible to list all the needs of people with rare haematological diseases not yet covered by health-care providers in the EU, to allow optimised care of individuals with rare pathologies, including sickle cell disease. This Viewpoint is the result of joint work within 12 EU member states (ie, Belgium, Cyprus, Denmark, France, Germany, Greece, Ireland, Italy, Portugal, Spain, Sweden, and The Netherlands), all members of the ERN-EuroBloodNet. We describe the role of the ERN-EuroBloodNet in improving the overall approach to and the management of individuals with sickle cell disease in the EU, specifically through the pooling of expertise, knowledge, and best practices; the development of training and education programmes; the strategy for systematic gathering and standardisation of clinical data; and its reuse in clinical research. Epidemiology and research strategies from the ongoing implementation of the Rare Anaemia Disorders European Epidemiological Platform are described.


Synthetic Data Generation by Artificial Intelligence to Accelerate Research and Precision Medicine in Hematology

Abstract

Synthetic data are artificial data generated, without including any real patient information, by an algorithm trained to learn the characteristics of a real source data set, and they have become widely used to accelerate research in life sciences. We aimed to (1) apply generative artificial intelligence to build synthetic data in different hematologic neoplasms; (2) develop a synthetic validation framework to assess data fidelity and privacy preservability; and (3) test the capability of synthetic data to accelerate clinical/translational research in hematology.
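
To make the ideas of fidelity and privacy assessment more concrete, the sketch below computes two simple checks on a toy table: the gap between per-column means of real and synthetic records (a crude fidelity signal) and the distance from each synthetic record to its closest real record (a crude privacy signal, where a distance of zero indicates a copied record). The abstract does not specify which metrics its validation framework uses, so both metrics, the function names, and the toy values are assumptions made for illustration only.

```python
import math
from typing import List, Sequence

Record = Sequence[float]


def column_mean_gaps(real: Sequence[Record], synthetic: Sequence[Record]) -> List[float]:
    """Absolute difference between per-column means (assumed fidelity check)."""
    n_cols = len(real[0])
    gaps = []
    for j in range(n_cols):
        real_mean = sum(r[j] for r in real) / len(real)
        syn_mean = sum(s[j] for s in synthetic) / len(synthetic)
        gaps.append(abs(real_mean - syn_mean))
    return gaps


def distance_to_closest_record(
    real: Sequence[Record], synthetic: Sequence[Record]
) -> List[float]:
    """Euclidean distance from each synthetic record to its nearest real one.

    Assumed privacy check: 0.0 means the synthetic row duplicates a real row.
    """
    return [min(math.dist(s, r) for r in real) for s in synthetic]


# Toy, made-up records with two numeric columns (not real patient data).
real_rows = [[60.0, 9.0], [70.0, 11.0]]
synthetic_rows = [[64.0, 10.0], [70.0, 11.0]]

gaps = column_mean_gaps(real_rows, synthetic_rows)
dcr = distance_to_closest_record(real_rows, synthetic_rows)
n_copies = sum(1 for d in dcr if d == 0.0)
```

In practice the columns would need to be scaled before computing distances, and categorical variables would need a different distance measure.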


The need for multimodal health data modeling: A practical approach for a federated-learning healthcare platform

Abstract

Federated learning initiatives in healthcare are being developed to collaboratively train predictive models without the need to centralize sensitive personal data. GenoMed4All is one such project, with the goal of connecting European clinical and –omics data repositories on rare diseases through a federated learning platform. Currently, the consortium faces the challenge of a lack of well-established international datasets and interoperability standards for federated learning applications on rare diseases. This paper presents our practical approach to selecting and implementing a Common Data Model (CDM) suitable for the federated training of predictive models applied to the medical domain, during the initial design phase of our federated learning platform. We describe our selection process, which consists of identifying the consortium’s needs, reviewing our functional and technical architecture specifications, and extracting a list of business requirements. We review the state of the art and evaluate three widely used approaches (FHIR, OMOP and Phenopackets) based on a checklist of requirements and specifications. We discuss the pros and cons of each approach considering the use cases specific to our consortium as well as the generic issues of implementing a European federated learning healthcare platform. A list of lessons learned from the experience in our consortium is discussed, from the importance of establishing the proper communication channels for all stakeholders to technical aspects related to –omics data. For federated learning projects focused on secondary use of health data for predictive modeling, encompassing multiple data modalities, a phase of data model convergence is sorely needed to gather different data representations developed in the context of medical research, interoperability of clinical care software, imaging, and –omics analysis into a coherent, unified data model.
Our work identifies this need and presents our experience and a list of actionable lessons learned for future work in this direction.


DrOGA: An Artificial Intelligence Solution for Driver-Status Prediction of Genomics Mutations in Precision Cancer Medicine

Abstract

Precision cancer medicine suggests that better cancer treatments would be possible by guiding therapies according to a tumor’s genomic alterations. This hypothesis has boosted exome sequencing studies, the collection of cancer variant databases, and the development of statistical and Machine Learning-driven methods for the analysis of alterations. In order to extract relevant information from huge exome sequencing data sets, accurate methods to distinguish driver mutations from neutral, or passenger, mutations are vital. Nevertheless, traditional variant classification methods often have low precision in favour of higher recall. Here, we propose several traditional Machine Learning and new Deep Learning techniques to finely classify driver somatic non-synonymous mutations based on a 70-feature annotation, derived from medical and statistical tools. We collected and annotated a complete database containing driver and neutral alterations from various public data sources. Our framework, called Driver-Oriented Genomics Analysis (DrOGA), achieves the best performance compared with individual and other ensemble methods on our data. Explainable Artificial Intelligence is used to provide visual and clinical explanation of the results, with a particular focus on the most relevant annotations. This analysis and the proposed tool, along with the collected database and the suggested feature engineering pipeline, can help the study of genomic alterations in human cancers, enabling targeted precision oncology therapies based on personal data from next-generation sequencing.
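
The trade-off between precision and recall mentioned above can be illustrated with a small worked example. The sketch below computes both metrics for a hypothetical classifier that labels too many mutations as drivers; the labels are made up for illustration and do not come from the DrOGA database, and `precision_recall` is a helper written for this example rather than part of the published tool.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels, where 1 = driver mutation."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when there are no positive predictions
    # or no true positives in the data.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels: 2 true drivers and 4 neutral mutations.
true_labels = [1, 1, 0, 0, 0, 0]
# An over-eager classifier flags 4 mutations as drivers.
predicted = [1, 1, 1, 1, 0, 0]

precision, recall = precision_recall(true_labels, predicted)
```

Here every true driver is recovered (recall of 1.0), but only half of the predicted drivers are real (precision of 0.5), which is the pattern the abstract attributes to traditional methods.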


Real-World Validation of Molecular International Prognostic Scoring System for Myelodysplastic Syndromes

Abstract

Myelodysplastic syndromes (MDS) are heterogeneous myeloid neoplasms in which a risk-adapted treatment strategy is needed. Recently, a new clinical-molecular prognostic model, the Molecular International Prognostic Scoring System (IPSS-M), was proposed to improve on the prediction of clinical outcome provided by the currently available tool (Revised International Prognostic Scoring System [IPSS-R]). We aimed to provide an extensive validation of IPSS-M.


Challenges and Opportunities of Precision Medicine in Sickle Cell Disease: Novel European Approach by GenoMed4All Consortium and ERN-EuroBloodNet

Abstract

Sickle cell disease (SCD) research in Europe is limited by relatively small, variable, and widely dispersed patient cohorts. In addition, the lack of harmonization among repositories hampers worldwide collaborations. Artificial Intelligence (AI) can aid in analyzing large quantities of data, but only if the data are collected and organized in a standardized way. To overcome some of these issues, GenoMed4All, a European initiative, aims to provide personalized solutions for the control and prevention of hematological diseases by exploiting the power of AI.