Data-driven, harmonised classification system for myelodysplastic syndromes: a consensus paper from the International Consortium for Myelodysplastic Syndromes
Abstract
The WHO and International Consensus Classification 2022 classifications of myelodysplastic syndromes enhance diagnostic precision and refine decision-making processes in these diseases. However, some discrepancies still exist and potentially cause inconsistency in their adoption in a clinical setting. We adopted a data-driven approach to harmonise these two classification systems. We investigated the importance of genomic features and their effect on the cluster assignment process to define harmonised entity labels. A panel of expert haematologists, haematopathologists, and data scientists who are members of the International Consortium for Myelodysplastic Syndromes was formed and a modified Delphi consensus process was adopted to harmonise morphologically defined categories without a distinct genomic profile. The panel held regular online meetings and participated in a two-round survey using an online voting tool. We identified nine clusters with distinct genomic features. The cluster of highest hierarchical importance was characterised by biallelic TP53 inactivation. Cluster assignment was irrespective of blast count. Individuals with monoallelic TP53 inactivation were assigned to other clusters. Hierarchically, the second most important group included myelodysplastic syndromes with del(5q). Isolated del(5q) and less than 5% of blast cells in the bone marrow were the most relevant label-defining features. The third most important cluster included myelodysplastic syndromes with mutated SF3B1. The absence of isolated del(5q), del(7q)/-7, abn3q26.2, complex karyotype, RUNX1 mutations, or biallelic TP53 inactivation was the basis for a harmonised label of this category. Morphologically defined myelodysplastic syndrome entities showed large genomic heterogeneity that was not efficiently captured by single-lineage versus multilineage dysplasia, marrow blasts, hypocellularity, or fibrosis.
We investigated the biological continuum between myelodysplastic syndromes with more than 10% bone marrow blasts and acute myeloid leukaemia, and found only a partial overlap in genetic features. After the survey, myelodysplastic syndromes with low blasts (ie, less than 5%) and myelodysplastic syndromes with increased blasts (ie, 5% or more) were recognised as disease entities. Our data-driven approach can efficiently harmonise current classifications of myelodysplastic syndromes and provide a reference for patient management in a real-world setting.
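The hierarchy described in this abstract can be read as an ordered set of rules. The following is a minimal sketch of that reading, not the consortium's algorithm: it covers only the three genomically defined groups named in the abstract plus the two blast-based entities, the field names and label strings are hypothetical, and it is not intended for diagnostic use.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical, simplified record of one patient (illustration only)."""
    marrow_blasts_pct: float
    biallelic_tp53: bool = False
    isolated_del5q: bool = False
    sf3b1_mutated: bool = False
    del7q_or_monosomy7: bool = False
    abn3q26_2: bool = False
    complex_karyotype: bool = False
    runx1_mutated: bool = False


def harmonised_label(case: Case) -> str:
    """Apply the rules in the hierarchical order given in the abstract."""
    # 1. Biallelic TP53 inactivation takes precedence, irrespective of blasts.
    if case.biallelic_tp53:
        return "MDS with biallelic TP53 inactivation"
    # 2. Isolated del(5q) with less than 5% bone marrow blasts.
    if case.isolated_del5q and case.marrow_blasts_pct < 5:
        return "MDS with del(5q)"
    # 3. Mutated SF3B1, provided none of the listed exclusions is present.
    #    (Biallelic TP53 was already handled by rule 1.)
    excluded = (
        case.isolated_del5q
        or case.del7q_or_monosomy7
        or case.abn3q26_2
        or case.complex_karyotype
        or case.runx1_mutated
    )
    if case.sf3b1_mutated and not excluded:
        return "MDS with mutated SF3B1"
    # 4. Otherwise fall back to the blast-based entities (threshold: 5%).
    if case.marrow_blasts_pct < 5:
        return "MDS with low blasts"
    return "MDS with increased blasts"


if __name__ == "__main__":
    print(harmonised_label(Case(marrow_blasts_pct=3, sf3b1_mutated=True)))
```

Because the rules are checked in order, a case with biallelic TP53 inactivation is labelled as such even when it also carries del(5q) or an SF3B1 mutation, which mirrors the hierarchical importance stated in the abstract.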
MOSAIC: An Artificial Intelligence–Based Framework for Multimodal Analysis, Classification, and Personalized Prognostic Assessment in Rare Cancers
Abstract
Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)–based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities.
Personalized Timing for Allogeneic Stem-Cell Transplantation in Hematologic Neoplasms: A Target Trial Emulation Approach Using Multistate Modeling and Microsimulation
Abstract
Decisions about the optimal timing of a treatment procedure in patients with hematologic neoplasms are critical, especially for cellular therapies (mainly allogeneic hematopoietic stem-cell transplantation [HSCT]). In the absence of evidence from randomized trials, real-world observational data become valuable for studying the effect of treatment timing. In this study, we develop a framework to estimate the expected outcome after an intervention in a time-to-event scenario, with the aim of optimizing the timing in a personalized manner.
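To make the idea of comparing transplant timings by simulation concrete, here is a toy discrete-time microsimulation. It is a sketch under strong assumptions and not the framework of the study: the three states, the constant monthly death probabilities, the 12-month early post-transplant window, and every numeric value are hypothetical placeholders chosen for illustration only.

```python
import random


def state_at(month, transplant_month, early_window=12):
    """Return the (hypothetical) state a living patient occupies in a month."""
    if month < transplant_month:
        return "pre_hsct"
    if month < transplant_month + early_window:
        return "early_post_hsct"
    return "late_post_hsct"


def simulate_survival(transplant_month, hazards, horizon, rng):
    """Simulate one patient; return months survived, capped at the horizon."""
    for month in range(horizon):
        monthly_death_prob = hazards[state_at(month, transplant_month)]
        if rng.random() < monthly_death_prob:
            return month
    return horizon


def expected_survival(transplant_month, hazards, n=5000, horizon=120, seed=0):
    """Mean simulated survival (months, truncated at horizon) for one timing."""
    rng = random.Random(seed)  # fixed seed so repeated runs are reproducible
    total = sum(
        simulate_survival(transplant_month, hazards, horizon, rng)
        for _ in range(n)
    )
    return total / n


if __name__ == "__main__":
    # Made-up monthly death probabilities, NOT estimates from any data set.
    toy_hazards = {
        "pre_hsct": 0.02,
        "early_post_hsct": 0.03,
        "late_post_hsct": 0.005,
    }
    for timing in (0, 6, 12, 24):
        print(timing, round(expected_survival(timing, toy_hazards), 1))
```

In a personalized setting the transition probabilities would depend on patient covariates and would be estimated from observational data with a multistate model; the simulation step that compares candidate timings would keep roughly the same shape.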
Clinical and Genomic-Based Decision Support System to Define the Optimal Timing of Allogeneic Hematopoietic Stem-Cell Transplantation in Patients With Myelodysplastic Syndromes
Abstract
Allogeneic hematopoietic stem-cell transplantation (HSCT) is the only potentially curative treatment for patients with myelodysplastic syndromes (MDS). Several issues must be considered when evaluating the benefits and risks of HSCT for patients with MDS, with the timing of transplantation being a crucial question. Here, we aimed to develop and validate a decision support system to define the optimal timing of HSCT for patients with MDS on the basis of clinical and genomic information as provided by the Molecular International Prognostic Scoring System (IPSS-M).
SYNDSURV: A simple framework for survival analysis with data distributed across multiple institutions
Abstract
This deliverable provides the first report of the GenoMed4All Ethical Advisory Board (EAB). The report was compiled after the first meeting of the Ethical Advisory Board, held on 19 January 2022. The meeting was presented with the current position regarding the project objectives and progress, data protection, and ethics oversight and assessments for the project.
Fully Automated Detection and Segmentation Pipeline for the Bone Marrow of the Lytic Bone of Multiple Myeloma Patients
Abstract
Monitoring the changes in bone marrow during therapy for multiple myeloma patients is a crucial task. Osteolytic lesions can cause deformation of the bones, affecting the robustness of traditional segmentation tools. A two-model deep learning analysis is explored in this study. A detection model reduces pixel imbalances between the background and the bone marrow pixels, achieving a mAP of 0.878±0.005. A residual U-Net segments the bone marrow, yielding a DSC of 0.856±0.003. The proposed deep learning-based segmentation pipeline allows accurate and fast annotation of the bone marrow in multiple myeloma patients.
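For readers unfamiliar with the segmentation metric, the sketch below computes an overlap score on two binary masks, assuming that DSC here denotes the Dice similarity coefficient (the abstract does not expand the abbreviation). The function name and the flat-list mask representation are illustrative choices, not part of the published pipeline.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two flattened binary masks.

    Both arguments are equal-length sequences of 0/1 (or False/True) values,
    one entry per pixel. Returns 2*|A ∩ B| / (|A| + |B|).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Both masks are empty; treat this as perfect agreement by convention.
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    predicted_mask = [1, 1, 0, 0, 1, 0]
    reference_mask = [1, 0, 0, 0, 1, 1]
    print(dice_coefficient(predicted_mask, reference_mask))
```

A value of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.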
Ensemble of Heterogeneous Machine Learning Models with Multiple Inputs for Multi Omics Analysis
Abstract
Multiple myeloma is a plasma cell neoplasm with genetic complexity that originates in pre-malignant stages due to genomic alterations, leading to malignant plasma cell proliferation. Data completeness significantly affects multi-omics studies, since the more sources are included in the analysis, the more likely it is that key data will be missing. In this study, an ensemble meta-model that uses transfer learning from multiple single-source models was developed to assess the progression of multiple myeloma by leveraging radiocytogenetics. Among the single-source and radiocytogenetic models compared, the proposed meta-model achieved the highest performance, with an AUC of 0.75±0.07 and an SP of 0.84±0.02.
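The point about missing data growing with the number of sources can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that each data source is available for a patient independently of the others; the availability values are invented and do not come from the study.

```python
from math import prod


def complete_case_fraction(availability):
    """Expected fraction of patients with every listed source available.

    `availability` holds one probability per data source. Sources are
    assumed to be missing independently, which is a simplification.
    """
    for p in availability:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each availability must lie between 0 and 1")
    return prod(availability)


if __name__ == "__main__":
    # Hypothetical case: every source is present for 90% of patients.
    for n_sources in range(1, 6):
        fraction = complete_case_fraction([0.9] * n_sources)
        print(n_sources, round(fraction, 3))
```

Under this assumption the share of patients with complete data shrinks multiplicatively with every added source, which is one motivation for models that can still produce a prediction when some inputs are absent.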
Clinical Text Reports to Stratify Patients Affected with Myeloid Neoplasms Using Natural Language Processing
Abstract
The availability of multimodal patient data, such as demographics, clinical, imaging, treatment, quality of life, outcomes and wearables data, as well as genome sequencing, has paved the way for the development of multimodal clinical solutions that introduce personalized or precision medicine. The clinical report is an information layer that contains relevant information about the disease in addition to the patient’s point of view. Natural language processing (NLP) is a branch of artificial intelligence (AI), and its pre-trained language models are the key technology for extracting value from this data layer.
Synthetic Histopathological Images Generation with Artificial Intelligence to Accelerate Research and Improve Clinical Outcomes in Hematology
Abstract
Hematological malignancies are rare and complex diseases and, as a consequence, multimodal data (ranging from clinical and genomic information to images) are required to improve diagnosis, prognosis and personalized treatments. However, collecting all these layers of information is challenging, in particular when collecting cytological and histological images from the bone marrow (BM) that reproduce the morphologic features of the disease. Synthetic data generation by Artificial Intelligence (AI) can circumvent these issues by generating images conditioned on textual inputs (i.e. reports from pathologists), which are widely available and contain much useful clinical information. This technology can enrich data with synthetic images, thus boosting translational research and improving the performance of precision medicine strategies based on multimodal information.
Combining Gene Mutation with Transcriptomic Data Improves Outcome Prediction in Myelodysplastic Syndromes
Abstract
Myelodysplastic syndromes (MDS) are myeloid neoplasms characterized by peripheral blood cytopenias and risk of progression to acute myeloid leukemia (AML). Disease management is challenged by heterogeneity in clinical courses and survival probability. Recently, the integration of genomic screening into patient assessment (by the Molecular International Prognostic Scoring System, IPSS-M) has resulted in a significant improvement in predicting clinical outcomes compared to the conventional prognostic score (Revised IPSS, IPSS-R). Many of the consequences of genetic and cytogenetic alterations will affect gene expression by means of transcriptional and epigenetic instability and altered microenvironmental signaling. The aim of this project, conducted by the GenoMed4All and Synthema EU consortia, is to link genomic information with transcriptomic data to potentially improve the prediction of clinical outcomes in MDS patients.